
UnicodeDecodeError: 'gb2312' codec can't decode byte 0x88 in position 164111: illegal multibyte sequ

2017-10-10 14:35

While using Python, I ran into UnicodeDecodeError: 'gb2312' codec can't decode byte 0x88 in position 164111: illegal multibyte sequence

# Fund scraping
from urllib import request
import chardet

page1_url = "http://fund.eastmoney.com/fund.html"

def getHtml(pageUrl):
    response = request.urlopen(pageUrl)
    raw_html = response.read()
    # Guess the encoding from the raw bytes
    getEncoding = chardet.detect(raw_html)['encoding']
    # Raises UnicodeDecodeError if any byte is invalid in the guessed codec
    src = raw_html.decode(getEncoding)
    print(src)

getHtml(page1_url)


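What to do? Roughly, the error means the page contains bytes that are illegal in the detected encoding, so you need to pass 'ignore' as the error handler when decoding, which silently drops those bytes: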

# Fund scraping
from urllib import request
import chardet

page1_url = "http://fund.eastmoney.com/fund.html"

def getHtml(pageUrl):
    response = request.urlopen(pageUrl)
    raw_html = response.read()
    # Guess the encoding from the raw bytes
    getEncoding = chardet.detect(raw_html)['encoding']
    # 'ignore' drops any bytes that are invalid in the guessed codec
    src = raw_html.decode(getEncoding, 'ignore')
    print(src)

getHtml(page1_url)
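
One caveat with 'ignore' is that the dropped bytes may have been real characters. GB2312 is a subset of GBK, which GB18030 extends in turn, and a lead byte such as 0x88 is invalid in GB2312 but valid in GBK. So one possible alternative, sketched below, is to retry with gb18030 before discarding anything. The helper name decode_html and the sample bytes are hypothetical, made up for illustration; whether the page in question is actually GBK-encoded is an assumption, not something verified here.

```python
from typing import Optional


def decode_html(raw: bytes, detected: Optional[str]) -> str:
    """Decode raw bytes without losing characters where possible.

    Hypothetical helper: try the detected codec first, then gb18030
    (a superset of gb2312 and gbk), and only drop bytes as a last resort.
    """
    candidates = [detected] if detected else []
    candidates.append("gb18030")
    for codec in candidates:
        try:
            return raw.decode(codec)
        except (UnicodeDecodeError, LookupError):
            # Invalid byte for this codec, or unknown codec name: try the next one
            continue
    # Last resort: same behaviour as the 'ignore' fix above
    return raw.decode(candidates[0], errors="ignore")


# Made-up sample: two GB2312 characters followed by a GBK-only byte pair
sample = "基金".encode("gb2312") + b"\x88\xa1"

lossy = sample.decode("gb2312", "ignore")   # drops the bytes gb2312 cannot decode
kept = decode_html(sample, "gb2312")        # falls back to gb18030 instead
print(len(lossy), len(kept))
```

In getHtml this would replace the decode line with src = decode_html(raw_html, getEncoding). If gb18030 also fails, the result is the same as with plain 'ignore'.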