
Scraping links with Python and downloading the files (no authentication required)

2015-10-27 20:43
Python 2.7.10 (default, May 23 2015, 09:40:32) [MSC v.1500 32 bit (Intel)] on win32

Type "copyright", "credits" or "license()" for more information.

>>> import urllib2,urllib

>>> url = 'http://blog.pythonlibrary.org/wp-content/uploads/2012/06/'

>>> cunchu = r'C:\Users\Administrator\Desktop\python-test-xiazai'  # raw string, so the backslashes are never read as escapes

>>> req = urllib2.Request(url)

>>> content = urllib2.urlopen(req).read()

>>> import re

>>> match = re.compile(r'(?<=href=["]).*?\.zip(?=["])')

>>> rawlv2 = re.findall(match,content)

>>> print rawlv2

['form_submission.zip', 'wxDbViewer.zip', 'wxDnD.zip']
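The pattern used above combines a lookbehind and a lookahead so that only the file name is returned, without the surrounding `href="` and closing quote. Here is a minimal sketch of that behaviour on a made-up HTML fragment. The `sample_html` string is hypothetical (the real directory listing is not shown in this post), and the snippet is written to run on Python 3 as well:

```python
import re

# Hypothetical directory-listing fragment, used only for illustration.
sample_html = (
    '<a href="form_submission.zip">form_submission.zip</a>\n'
    '<a href="notes.txt">notes.txt</a>\n'
    '<a href="wxDnD.zip">wxDnD.zip</a>\n'
)

# (?<=href=["])  the match must be preceded by href=" (not included in the result)
# .*?\.zip       as few characters as possible, ending in ".zip"
# (?=["])        the match must be followed by a closing quote (not included)
pattern = re.compile(r'(?<=href=["]).*?\.zip(?=["])')

zip_links = pattern.findall(sample_html)
print(zip_links)  # ['form_submission.zip', 'wxDnD.zip']
```

Because `.` does not match a newline, the lazy `.*?` cannot run from one line into the next. On a page where several links share a single line, the match could span more than one attribute, so an HTML parser would probably be the safer choice there.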

>>> import os

>>> for x in rawlv2:
        cunurl = os.path.join(cunchu, x)
        urllib.urlretrieve(url + x, cunurl)

('C:\\Users\\Administrator\\Desktop\\python-test-xiazai\\form_submission.zip', <httplib.HTTPMessage instance at 0x01E25E18>)

('C:\\Users\\Administrator\\Desktop\\python-test-xiazai\\wxDbViewer.zip', <httplib.HTTPMessage instance at 0x01E573A0>)

('C:\\Users\\Administrator\\Desktop\\python-test-xiazai\\wxDnD.zip', <httplib.HTTPMessage instance at 0x01E25D78>)
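The session above uses Python 2.7, where `urllib2` is still available. For reference, here is a rough sketch of the same steps for Python 3, assuming the page is served as plain HTML without authentication. The function names `find_zip_links`, `plan_downloads`, and `download_all` are made up for this example and are not part of any library:

```python
import os
import re
from urllib.parse import urljoin
from urllib.request import urlopen, urlretrieve

# Same pattern as in the session: file names ending in .zip inside href="..."
ZIP_LINK = re.compile(r'(?<=href=["]).*?\.zip(?=["])')


def find_zip_links(html):
    """Return the .zip file names referenced in the given HTML text."""
    return ZIP_LINK.findall(html)


def plan_downloads(base_url, names, dest_dir):
    """Pair each remote URL with the local path it should be saved to."""
    return [(urljoin(base_url, name), os.path.join(dest_dir, name))
            for name in names]


def download_all(base_url, dest_dir):
    """Fetch the listing at base_url and save every linked .zip into dest_dir."""
    # Assumption: the listing decodes as UTF-8; undecodable bytes are replaced.
    html = urlopen(base_url).read().decode('utf-8', errors='replace')
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for file_url, path in plan_downloads(base_url, find_zip_links(html), dest_dir):
        urlretrieve(file_url, path)
        saved.append(path)
    return saved
```

With the values from the session, the call would be `download_all(url, cunchu)`. Nothing is downloaded until that function is called, so the helpers can be tried on a local string first.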