Python Web Crawling for Beginners: 3 Ways to Fetch a Page

2017-08-31 15:27
Python 2.x

import urllib2
import cookielib

url = "http://www.baidu.com"

# Method 1: pass the URL straight to urlopen
print 'Method 1'
response1 = urllib2.urlopen(url)
print response1.getcode()
print len(response1.read())

# Method 2: build a Request object so a User-Agent header can be attached
print 'Method 2'
request = urllib2.Request(url)
request.add_header("user-agent","Mozilla/5.0")
response2 = urllib2.urlopen(request)
print response2.getcode()
print len(response2.read())

# Method 3: install an opener that stores cookies in a CookieJar
print 'Method 3'
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
urllib2.install_opener(opener)
response3 = urllib2.urlopen(url)
print response3.getcode()
print cj
print response3.read()


Python 3.x

import urllib.request
import http.cookiejar

url = "http://www.baidu.com"

# Method 1: pass the URL straight to urlopen
print('Method 1')
response1 = urllib.request.urlopen(url)

print(response1.getcode())
print(len(response1.read()))

# Method 2: build a Request object so a User-Agent header can be attached
print('Method 2')
request = urllib.request.Request(url)
request.add_header('user-agent', 'Mozilla/5.0')
response2 = urllib.request.urlopen(request)
print(response2.getcode())
print(len(response2.read()))

# Method 3: install an opener that stores cookies in a CookieJar
print('Method 3')
cj = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
urllib.request.install_opener(opener)
response3 = urllib.request.urlopen(url)
print(response3.getcode())
print(cj)
print(response3.read())
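
The Python 3 listing above runs the three methods as one flat script. As a rough sketch only, the same steps could be wrapped in small reusable functions. The names build_request, build_cookie_opener and fetch, and the timeout default, are hypothetical and do not appear in the original post. This version also keeps the cookie-aware opener local instead of installing it globally with install_opener.

```python
import http.cookiejar
import urllib.request


def build_request(url, user_agent="Mozilla/5.0"):
    """Return a Request carrying a custom User-Agent header (no network I/O)."""
    request = urllib.request.Request(url)
    request.add_header("User-Agent", user_agent)
    return request


def build_cookie_opener():
    """Return (opener, cookie_jar); the opener stores received cookies in the jar."""
    cookie_jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cookie_jar)
    )
    return opener, cookie_jar


def fetch(url, opener=None, timeout=10):
    """Fetch url and return (status_code, body_bytes).

    Uses the given opener if one is passed, otherwise plain urlopen.
    The timeout value is an assumption, not something the original sets.
    """
    request = build_request(url)
    open_func = opener.open if opener is not None else urllib.request.urlopen
    with open_func(request, timeout=timeout) as response:
        return response.getcode(), response.read()
```

With these helpers, calling build_cookie_opener() and then fetch(url, opener=opener) should perform the same request as method 3, and the returned jar can be printed afterwards to inspect any cookies the server set.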


Reference: http://www.imooc.com/article/16363