A Small Crawler: Downloading the Images on a Given Web Page

2014-01-24 22:54

Here is a small program for practicing regular expressions and the urllib library.

Reposted from: http://blog.csdn.net/u011249248/article/category/1476523

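# -*- coding: utf-8 -*-
# Python 3 version: urlopen and urlretrieve now live in urllib.request.

import re
import urllib.request

def getHtml(url):
    # Fetch the source of the given page
    page = urllib.request.urlopen(url)
    html = page.read()
    # read() returns bytes; decode (assuming UTF-8) so the str regex below applies
    return html.decode('utf-8', errors='ignore')

def getImg(html):
    # Regular expression: capture the value of src="..." attributes ending in .jpg
    reg = r'src="(.*?\.jpg)"'
    # Compile the regular expression
    imgre = re.compile(reg)
    # Find all image addresses
    imglist = imgre.findall(html)
    # Loop over the addresses and save the images as 0.jpg, 1.jpg, ...
    for x, i in enumerate(imglist):
        urllib.request.urlretrieve(i, '%s.jpg' % x)

html = getHtml(r'http://www.renren.com/')
getImg(html)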
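
The pattern in getImg can be tried offline on a small HTML string, without touching the network. The sketch below is only an illustration: the sample markup, the base URL http://example.com/, the helper name find_jpg_urls and the stricter pattern STRICT are assumptions, not part of the original program. It shows two things that can go wrong on real pages. First, the "." in the original pattern also matches a quote, so a match may run from one attribute into the next. Second, a relative address such as /static/a.jpg must be made absolute (here with urljoin) before it can be downloaded.

```python
import re
from urllib.parse import urljoin

# Pattern from getImg: non-greedy, but "." may also match a double quote.
ORIGINAL = re.compile(r'src="(.*?\.jpg)"')
# Hypothetical stricter variant: the captured value may not contain a quote.
STRICT = re.compile(r'src="([^"]*?\.jpg)"')

def find_jpg_urls(html, base_url, pattern=STRICT):
    """Return absolute .jpg URLs found in src attributes of html."""
    return [urljoin(base_url, src) for src in pattern.findall(html)]

# Made-up markup: one .png image, one relative .jpg, one absolute .jpg.
SAMPLE = (
    '<img src="logo.png" alt="logo">'
    '<img src="/static/a.jpg">'
    '<img src="http://img.example.com/b.jpg">'
)

if __name__ == "__main__":
    # The first match of ORIGINAL starts at logo.png and spills into the next tag.
    print(ORIGINAL.findall(SAMPLE))
    # STRICT skips logo.png; urljoin makes /static/a.jpg absolute.
    print(find_jpg_urls(SAMPLE, "http://example.com/"))
```

The list returned by find_jpg_urls could then be passed to the download loop in place of imglist. For anything beyond a quick exercise, an HTML parser is usually a more robust choice than a regular expression.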
