Python 2.7: UnicodeDecodeError: 'gb2312' codec can't decode bytes: illegal multibyte sequence
2017-11-16 00:00
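Python version: 2.7
IDE: PyCharm 2017
Cause of the error: while crawling some old web pages, the text came out garbled when I decoded it as UTF-8, and decoding it as GB2312 in order to re-encode it as UTF-8 raised an exception, so the conversion could not complete. The page's declared encoding really was GB2312. Baffled, I searched Baidu and found that a page declared as GB2312 may still contain a few characters outside the GB2312 character set. GB18030 is a superset of GB2312, so the fix is to decode with GB18030 and then encode to UTF-8. The code is as follows: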
string.decode('GB18030').encode('utf-8')
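The one-liner above can be tried out with a small self-contained sketch. The helper name `to_utf8` and the sample text are hypothetical, chosen only for illustration; the sample uses the character U+9555, which GB18030 covers but GB2312 does not. The code is written to run on Python 2.7 and also on Python 3.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def to_utf8(raw_bytes, errors="strict"):
    """Decode GB-family bytes with GB18030, then re-encode them as UTF-8.

    Hypothetical helper; `errors` is passed through to bytes.decode().
    """
    return raw_bytes.decode("gb18030", errors).encode("utf-8")


# Hypothetical sample standing in for a crawled page: three Chinese
# characters, the middle one (U+9555) lies outside the GB2312 set.
sample_text = u"\u6731\u9555\u57fa"
raw = sample_text.encode("gb18030")

# Decoding with the narrower GB2312 codec fails on the middle character.
try:
    raw.decode("gb2312")
    gb2312_ok = True
except UnicodeDecodeError as exc:
    gb2312_ok = False
    print("gb2312 failed:", exc)

# Decoding with GB18030 succeeds, and the result re-encodes to UTF-8.
utf8_bytes = to_utf8(raw)
print(repr(utf8_bytes))
```

If a page contains bytes that are invalid even in GB18030, passing `errors="replace"` or `errors="ignore"` to the helper avoids the exception, at the cost of losing those characters.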