
Removing HTML entities (such as &nbsp;) when scraping web pages with Python

2015-12-08 17:25

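# Python 2 only: HTMLParser and urllib2 were renamed in Python 3.
import HTMLParser
import urllib2

url = 'http://example.com/'  # placeholder; the original does not specify the URL

response = urllib2.urlopen(url)
html = response.read().decode('utf-8')  # assumes the page is UTF-8 encoded
html_parser = HTMLParser.HTMLParser()
# unescape() converts entities such as &nbsp; and &amp; into characters;
# it does not remove tags.
data = html_parser.unescape(html)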
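
The snippet above targets Python 2. On Python 3 the same idea can be sketched with the standard-library `html.unescape` function, since `HTMLParser.unescape` was removed in Python 3.9. The helper names `clean_entities` and `fetch_and_clean`, the default `utf-8` encoding, and the extra step of turning non-breaking spaces into ordinary spaces are assumptions for illustration and are not part of the original post.

```python
import html
from urllib.request import urlopen


def clean_entities(markup):
    """Decode HTML entities, then turn non-breaking spaces into plain spaces."""
    text = html.unescape(markup)       # "&nbsp;" -> "\xa0", "&amp;" -> "&"
    return text.replace("\xa0", " ")   # optional: normalize U+00A0 to a space


def fetch_and_clean(url, encoding="utf-8"):
    """Download a page and decode its entities (the encoding is an assumption)."""
    with urlopen(url) as response:
        return clean_entities(response.read().decode(encoding))


# Offline demonstration on a literal string, so no network access is needed.
print(clean_entities("a&nbsp;b &amp; c &lt;p&gt;"))  # prints: a b & c <p>
```

Note that unescaping only decodes entities. Real tags such as `<p>` stay in the text and would need a separate parsing step if they also have to be removed.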
