Fixing garbled Chinese text from urllib2 in Python

2014-10-17 16:27
Example:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import urllib2


def main():
    url = "http://www.douban.com"
    # Browser User-Agent header
    headers = {'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6'}
    req = urllib2.Request(url=url, headers=headers)
    # read() returns the raw response bytes without decoding them
    data = urllib2.urlopen(req).read()
    print data
    return 0


if __name__ == '__main__':
    main()


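In the printed output, the Chinese characters are garbled. read() returns the page's raw UTF-8 bytes, and they typically show up as gibberish when the local console uses a different encoding. The fix is to decode the bytes and re-encode them in the local encoding: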

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
import urllib2

# Encoding reported by the local system (named to avoid shadowing the built-in type)
local_encoding = sys.getfilesystemencoding()


def main():
    url = "http://www.douban.com"
    # Browser User-Agent header
    headers = {'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6'}
    req = urllib2.Request(url=url, headers=headers)
    data = urllib2.urlopen(req).read()
    # Decode the UTF-8 bytes to unicode, then re-encode for the local system
    print data.decode("UTF-8").encode(local_encoding)
    return 0


if __name__ == '__main__':
    main()
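
The decode-then-encode step can also be tried without any network access. The sketch below is only an illustration under stated assumptions: transcode is a hypothetical helper (not part of urllib2), the sample bytes stand in for what urlopen(req).read() returns, and sys.stdout.encoding is assumed to be a reasonable guess for the console encoding. It also passes "replace" so that characters the target encoding cannot represent become '?' instead of raising UnicodeEncodeError.

```python
# -*- coding: utf-8 -*-
import sys


def transcode(raw, source_encoding="utf-8", target_encoding=None):
    """Decode raw bytes, then re-encode them for the local console.

    Hypothetical helper for illustration; not part of urllib2.
    """
    if target_encoding is None:
        # Assumption: stdout's encoding matches the console. It can be None
        # when output is piped, so fall back to UTF-8 in that case.
        target_encoding = getattr(sys.stdout, "encoding", None) or "utf-8"
    text = raw.decode(source_encoding)
    # "replace" substitutes '?' for characters the target cannot represent.
    return text.encode(target_encoding, "replace")


# Stand-in for the bytes returned by urllib2.urlopen(req).read()
sample = u"\u8c46\u74e3".encode("utf-8")  # the word "豆瓣" as UTF-8 bytes
gbk_bytes = transcode(sample, target_encoding="gbk")
```

Passing target_encoding explicitly, as in the last line, keeps the result predictable; leaving it out makes the helper follow whatever the current console reports.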


For more on Python's encode and decode, see: http://blog.csdn.net/xyw_blog/article/details/40188037

This article is an original work by xyw_Eliot. If you repost it, please credit the source: http://blog.csdn.net/xyw_blog/article/details/40187913
Tags: python, urllib2, garbled text