Building a simple downloader in about a hundred lines of Python

2013-06-03 17:34

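1. Preface:

With time on my hands and nothing better to do, I picked up a PDF tutorial that was going around online and used it to learn some Python. The method was simple:

1. Read the source code in it carefully.

2. Type out every code example in it by hand. (Note: sometimes I just typed the code while following along with the tutorial.)

When I finished I was itching to build something, which is how the idea of writing a simple downloader came about.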

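2. Downloader features:

1. Download from a URL, for example mirror.bjtu.edu.cn/gnu/wget/wget-1.10.1.tar.gz, where the file name is wget-1.10.1.tar.gz.

2. Get the size of the file before downloading it.

3. Reconnect automatically when the download connection drops.

4. Support resuming an interrupted download.

5. Report the download speed in real time.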
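
The URLs in this article are written without a scheme (no http:// prefix), so the host, path and file name can be obtained with plain string slicing. Below is a minimal sketch of that parsing, assuming a scheme-less URL. The helper name split_url is made up for illustration and does not appear in the downloader code.

```python
def split_url(url):
    """Split a scheme-less url into (host, path, filename).

    Hypothetical helper for illustration; returns None when the url
    contains no '/' and therefore has no path.
    """
    index = url.find('/')
    if index == -1:
        return None
    host = url[:index]                    # everything before the first '/'
    path = url[index:]                    # the first '/' and everything after it
    filename = url[url.rfind('/') + 1:]   # text after the last '/'
    return host, path, filename


print(split_url('mirror.bjtu.edu.cn/gnu/wget/wget-1.10.1.tar.gz'))
```

A URL that does carry a scheme would need the prefix stripped first, or could be handled with urllib.parse.urlsplit from the standard library.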

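3. Implementation approach:

1. Create an HTTP connection object for the URL and send a 'GET' request for the file.

2. When the response comes back, read the file size from the 'Content-Length' field of the HTTP response headers.

3. Use a while loop to keep requesting the download until the whole file has arrived.

4. When sending the 'GET' request, add a 'Range: bytes=%d-%d' field to the HTTP request headers to specify the first and last byte of the part to download.

5. Download speed = bytes downloaded in the interval / length of the interval (normally 3 s, though it can be longer), in bytes per second (B/s).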
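
Steps 4 and 5 can be sketched in isolation as follows. This is only an illustration under two assumptions: byte positions in a Range header are zero-based and inclusive, and the speed is a simple average over one interval. The function names range_header and speed are hypothetical and are not used in the downloader code.

```python
def range_header(begin, filesize):
    """Build the Range header for the rest of a filesize-byte file."""
    # Byte positions are zero-based and inclusive, so the last one is filesize-1.
    return {"Range": "bytes=%d-%d" % (begin, filesize - 1)}


def speed(interval_bytes, interval_seconds):
    """Average speed in bytes per second over one measuring interval."""
    return interval_bytes / interval_seconds


# Resume a 4096-byte file after the first 1024 bytes have been saved.
print(range_header(1024, 4096))
# 6144 bytes received during a 3-second interval.
print('%.2f B/s' % speed(6144, 3.0))
```

A server that honours the Range header replies with status 206 (Partial Content), while one that ignores it sends the whole file with status 200.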

4. Downloader code:

Enough talk, here is the code:

#Filename:single-download.py
#(the egg hurted man):dodng12@163.com

import http.client
from time import time
from os.path import isfile
from os import stat

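def GetUrlFileSize(URL):
    # Returns the file size in bytes, or -1 when it cannot be determined
    conn=None
    filesum=-1
    try:
        # Split the url into host and file path
        url=str(URL)
        index=url.find('/')
        if(index == -1):
            print('url is invalid,cannot parse')
            return -1
        host=url[0:index]
        path=url[index:]
        print('The url is %s,host is %s,path is %s'%(URL,host,path))
        conn=http.client.HTTPConnection(host)
        print('Connection has been created')
        conn.request('GET',path)
        res=conn.getresponse()
        # Read the size of the file to download from the response header
        length=res.getheader('Content-Length')
        if length is None:
            print('Content-Length header is missing')
        else:
            filesum=int(length)
    except http.client.HTTPException as httpx:
        print(httpx)
    except Exception as ex:
        print('Some exception occurred:',ex)
    finally:
        if conn is not None:
            conn.close()
    return filesum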
    
def DownloadURL(URL,BEGIN,OFFSET,FILE,FILE_BEGIN):
    # Returns the number of bytes written to FILE (0 when nothing was downloaded).
    # FILE_BEGIN is kept for compatibility; mode 'ab' always appends at the end of FILE.
    total=0
    conn=None
    try:
        url=str(URL)
        index=url.find('/')
        if(index == -1):
            print('url is invalid,cannot parse')
            return 0
        host=url[0:index]
        path=url[index:]
        print('The url is %s,host is %s,path is %s'%(URL,host,path))
        conn=http.client.HTTPConnection(host)
        print('Connection has been created')
        # Range asks for bytes BEGIN..OFFSET; servers clamp an end past the file
        heads={"Range":"bytes=%d-%d"%(BEGIN,OFFSET)}
        conn.request('GET',path,headers=heads)
        res=conn.getresponse()
        # 206 means the server honoured Range; a plain 200 is only safe from byte 0
        if res.status != 206 and not (res.status == 200 and BEGIN == 0):
            print('unexpected status %d,cannot resume'%res.status)
            return 0
        # Save the downloaded content to the file
        with open(FILE,'ab') as f:
            buffer=bytearray(2048)
            interval_total=0
            begin_time=time()
            while not res.closed:
                num=res.readinto(buffer)
                if(num == 0):
                    print('download over')
                    break
                f.write(buffer[:num])
                total=total + num
                interval_total=interval_total + num
                interval_time=time()-begin_time
                # Report the download speed every 3 seconds
                if interval_time >=3:
                    print('download now speed:%.2f(B/S)'%(interval_total/interval_time))
                    interval_total=0
                    begin_time=time()
    except http.client.HTTPException as httpx:
        print(httpx)
    except Exception as ex:
        print('Some unhandled error occurred:',ex)
    finally:
        if conn is not None:
            conn.close()
    return total

url='mirror.bjtu.edu.cn/gnu/wget/wget-1.10.1.tar.gz'
#url='wt.onlinedown.net/down/MTV2012_395023.rar'
# Take the file name from the url
index=url.rfind('/')
if(index != -1):
    filename=url[(index+1):]
    filesize=GetUrlFileSize(url)
    if isfile(filename):
        download_num=stat(filename).st_size
        print('\n\nbreakpoint:%d continue downloading...\n\n'%(download_num))
    else:
        download_num=0
        print('\n\nfirst downloading...\n\n')
    failures=0
    # Keep requesting the remaining bytes; give up after 5 attempts in a row that fetch nothing
    while download_num < filesize and failures < 5:
        # The end of the range is the last byte position, hence filesize-1
        got=DownloadURL(url,download_num,filesize-1,filename,download_num)
        if got > 0:
            download_num=download_num + got
            failures=0
        else:
            failures=failures + 1
        print('---Execute over download %d sum:%d'%(download_num,filesize))
else:
    print('url parse failed')

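5. Ideas for improvement:

1. Switch to multiple processes or threads and download in parallel.

2. Download from other links (mirrors); the prerequisite is having a digest of the file to download. (I don't know how to do this yet, and would welcome pointers.)

6. Afterword:

1. I hope we can exchange ideas; please point out any shortcomings or omissions so we can all improve.

2. If you have good ideas or suggestions for improvement, please share them.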
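
As a starting point for both ideas, here is a rough sketch of the two building blocks they would probably need: splitting the file into byte ranges that parallel workers could each request, and computing a digest of the finished file to compare against a published one. Everything here is an assumption for illustration. The names split_ranges and file_sha256 are made up, and SHA-256 is just one possible digest, since the article does not say which one a mirror would publish.

```python
import hashlib


def split_ranges(filesize, parts):
    """Split bytes 0..filesize-1 into `parts` contiguous (first, last) pairs."""
    if parts < 1 or filesize < parts:
        raise ValueError('need 1 <= parts <= filesize')
    chunk = filesize // parts
    ranges = []
    for i in range(parts):
        first = i * chunk
        # The last part also takes the remainder of the division.
        last = filesize - 1 if i == parts - 1 else first + chunk - 1
        ranges.append((first, last))
    return ranges


def file_sha256(path, block_size=65536):
    """Hex SHA-256 digest of a file, read in blocks to keep memory use low."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        for block in iter(lambda: f.read(block_size), b''):
            digest.update(block)
    return digest.hexdigest()


# Three workers sharing a 10-byte file.
print(split_ranges(10, 3))
```

Each (first, last) pair maps directly onto a 'Range: bytes=first-last' request header. Once all parts are written at their own offsets, the digest of the assembled file can be compared with the expected value, whichever mirror the parts came from.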
