Encoding error when reading Chinese text in Python

2016-12-06 21:02

Reading a Chinese txt file in Python:

# coding: utf-8

def readFile():
    # Text mode with no explicit encoding: Python decodes with the platform default.
    fp = open('emotion_dict//neg//neg_all_dict.txt', 'r')
    lines = []
    for line in fp:
        lines.append(line)
    fp.close()
    print(lines)

readFile()

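Sometimes, however, this fails with the following error:
UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 10: illegal multibyte sequence

This happens when the file is stored as UTF-8 but open() in text mode, given no encoding argument, decodes it with the platform default, which is GBK on a Chinese-locale Windows system. The "# coding: utf-8" line only declares the encoding of the source file itself; it has no effect on how open() decodes data.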
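
The mismatch can be reproduced without any file. The following is a minimal sketch, assuming the data is UTF-8 and the decoder is Python's gbk codec; the helper name try_decode and the sample text are made up for illustration.

```python
def try_decode(data, encoding):
    """Return (True, text) if data decodes cleanly, else (False, error message)."""
    try:
        return True, data.decode(encoding)
    except UnicodeDecodeError as exc:
        return False, str(exc)


# Sample text (an assumption, not taken from the dictionary file).
# Its UTF-8 bytes are e4 b8 80 e4 ba 8c.
raw = "一二".encode("utf-8")

ok_utf8, text = try_decode(raw, "utf-8")
ok_gbk, message = try_decode(raw, "gbk")

print(ok_utf8)          # True: the bytes round-trip as UTF-8
print(ok_gbk, message)  # False, plus the codec's error message
```

Not every UTF-8 byte sequence is rejected by the gbk codec. Some decode "successfully" into unrelated characters, which is why the error only shows up sometimes.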

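In that case, a small change to the code is enough to read the Chinese text: open the txt file in binary mode ('rb') and decode each line from 'utf-8' explicitly. The code is as follows: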

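# coding: utf-8

def readFile():
    # Binary mode returns raw bytes, so no implicit decoding takes place.
    fp = open('emotion_dict//neg//neg_all_dict.txt', 'rb')
    lines = []
    for line in fp.readlines():
        line = line.strip()          # drop the newline and surrounding whitespace
        line = line.decode('utf-8')  # bytes -> str, using the file's actual encoding
        lines.append(line)
    fp.close()
    print(lines)

readFile()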
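
An alternative to decoding by hand, sketched here for Python 3, is to pass the encoding to open() directly so that text mode decodes as UTF-8 regardless of the platform default. The function name read_lines, the temporary demo file, and the sample words are assumptions for illustration; the real dictionary path is not used.

```python
import os
import tempfile


def read_lines(path, encoding="utf-8"):
    """Read a text file and return its lines without trailing newlines."""
    # The with block closes the file even if decoding raises an error.
    with open(path, "r", encoding=encoding) as fp:
        return [line.rstrip("\n") for line in fp]


# Demo on a throwaway file containing sample words.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.txt")
    with open(demo_path, "w", encoding="utf-8") as out:
        out.write("悲伤\n愤怒\n")
    demo_lines = read_lines(demo_path)

print(demo_lines)
```

In Python 2 the built-in open() has no encoding parameter; io.open() accepts one and behaves the same way.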
