Python multithreading notes (an experiment in checking whether URLs are online with multiple threads)
2014-10-30 09:28
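This is an experiment in checking whether a set of URLs is online, comparing a multithreaded approach against the plain sequential one. The code and its output follow.
1. Without multithreading. The code (Python 2) is as follows: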
# -*- coding: utf-8 -*-
import urllib2


def online(url=''):
    """Check whether a URL is online."""
    req = urllib2.Request(url)
    try:
        response = urllib2.urlopen(req)
        if response.code == 200:
            # geturl() returns the final URL after any redirects
            print response.geturl(), ' this url is online'
        else:
            print 'not'
    except urllib2.URLError as e:
        # HTTPError subclasses URLError and also exposes `reason`, so HTTP
        # failures such as 404 are reported by this first branch as well.
        if hasattr(e, 'reason'):
            print url, ' We failed to reach a server.'
            print 'Reason: ', e.reason
        elif hasattr(e, 'code'):
            print url, ' The server couldn\'t fulfill the request.'
            print 'Error code: ', e.code


def main():
    url_list = [
        'http://www.baidu.com', 'http://www.hitwh.edu.cn', 'http://www.13.com',
        'http://www.ifeng.com', 'http://www.sina.com',
        'http://www.wewin.com.gr/2', 'http://www.ifeng.com', 'http://www.sina.com',
        'http://www.zeeif.com/int/',
        'http://www.zeeif.com/websc/verification/',
        'http://mjgds.org/classrooms/wp-content/plugins/10421312312/19890907.html',
        'http://login-resolution-center-case-475ec2aec1br.propesage-algerie.com/ID',
        'http://paypel-login-resolution-center.propesage-algerie.com/ID/',
        'http://radiotransilvania.ro/clujarena/rena.php',
        'http://kuleteknik.net/wp-includes/lol3.html',
        'http://kuleteknik.net/wp-includes/lol2.html',
    ]
    for url in url_list:
        online(url)  # check one URL at a time, in list order


if __name__ == '__main__':
    main()
Output:
http://www.baidu.com this url is online
http://www.hitwh.edu.cn this url is online
http://www.13.com We failed to reach a server.
Reason: [Errno 11001] getaddrinfo failed
http://www.ifeng.com this url is online
http://www.sina.com.cn/ this url is online
http://www.wewin.com.gr/2 We failed to reach a server.
Reason: Unauthorized
http://www.ifeng.com this url is online
http://www.sina.com.cn/ this url is online
http://www.zeeif.com/int/ We failed to reach a server.
Reason: Not Found
http://www.zeeif.com/websc/verification/ We failed to reach a server.
Reason: Not Found
http://mjgds.org/classrooms/wp-content/plugins/10421312312/19890907.html We failed to reach a server.
Reason: Internal Server Error
http://login-resolution-center-case-475ec2aec1br.propesage-algerie.com/ID this url is online
http://paypel-login-resolution-center.propesage-algerie.com/ID/ this url is online
http://radiotransilvania.ro/clujarena/rena.php We failed to reach a server.
Reason: Not Found
http://kuleteknik.net/wp-includes/lol3.html this url is online
http://kuleteknik.net/wp-includes/lol2.html this url is online
[Finished in 5.2s]
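Explanation: the run took 5.2 seconds. Each request has to finish or fail before the next one starts, so with more URLs to check, and especially with more offline URLs among them, the run takes longer still.
2. With multithreading. The code (Python 2) is as follows: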
# -*- coding: utf-8 -*-
import threading
import urllib2


def online(url=''):
    """Check whether a URL is online."""
    req = urllib2.Request(url)
    try:
        response = urllib2.urlopen(req)
        if response.code == 200:
            # geturl() returns the final URL after any redirects
            print response.geturl(), ' this url is online'
        else:
            print 'not'
    except urllib2.URLError as e:
        # HTTPError subclasses URLError and also exposes `reason`, so HTTP
        # failures such as 404 are reported by this first branch as well.
        if hasattr(e, 'reason'):
            print url, ' We failed to reach a server.'
            print 'Reason: ', e.reason
        elif hasattr(e, 'code'):
            print url, ' The server couldn\'t fulfill the request.'
            print 'Error code: ', e.code


def main():
    url_list = [
        'http://www.baidu.com', 'http://www.hitwh.edu.cn', 'http://www.13.com',
        'http://www.ifeng.com', 'http://www.sina.com',
        'http://www.wewin.com.gr/2', 'http://www.ifeng.com', 'http://www.sina.com',
        'http://www.zeeif.com/int/',
        'http://www.zeeif.com/websc/verification/',
        'http://mjgds.org/classrooms/wp-content/plugins/10421312312/19890907.html',
        'http://login-resolution-center-case-475ec2aec1br.propesage-algerie.com/ID',
        'http://paypel-login-resolution-center.propesage-algerie.com/ID/',
        'http://radiotransilvania.ro/clujarena/rena.php',
        'http://kuleteknik.net/wp-includes/lol3.html',
        'http://kuleteknik.net/wp-includes/lol2.html',
    ]
    threads = []
    for url in url_list:
        # one thread per URL, so the requests wait on the network concurrently
        t = threading.Thread(target=online, args=(url,))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()  # wait for every check to finish


if __name__ == '__main__':
    main()
Output:
http://www.baidu.com this url is online
http://www.ifeng.com this url is online
http://www.13.com We failed to reach a server.
Reason: [Errno 11001] getaddrinfo failed
http://www.ifeng.com this url is online
http://paypel-login-resolution-center.propesage-algerie.com/ID/ this url is online
http://www.hitwh.edu.cn this url is online
http://login-resolution-center-case-475ec2aec1br.propesage-algerie.com/ID this url is online
http://www.sina.com.cn/ this url is online
http://www.sina.com.cn/ this url is online
http://mjgds.org/classrooms/wp-content/plugins/10421312312/19890907.html We failed to reach a server.
Reason: Internal Server Error
http://www.zeeif.com/websc/verification/ We failed to reach a server.
Reason: Not Found
http://www.zeeif.com/int/ We failed to reach a server.
Reason: Not Found
http://kuleteknik.net/wp-includes/lol2.html this url is online
http://kuleteknik.net/wp-includes/lol3.html this url is online
http://www.wewin.com.gr/2 We failed to reach a server.
Reason: Unauthorized
http://radiotransilvania.ro/clujarena/rena.php We failed to reach a server.
Reason: Not Found
[Finished in 1.7s]
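Explanation: each URL check runs in its own thread, and the whole run took only 1.7s. The results no longer appear in list order, because each thread prints as soon as its own request completes.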
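The 5.2s and 1.7s figures depend on the network at the time of the run. To see the same effect in a repeatable way, here is a minimal sketch in Python 3 (the code above is Python 2). It is not the original experiment: `fake_check`, `run_sequential`, and `run_threaded` are hypothetical names, and `time.sleep` stands in for the time a real request would spend waiting on the network.

```python
import threading
import time


def fake_check(url, results, lock, delay=0.2):
    """Stand-in for a network check: wait, then record a made-up result."""
    time.sleep(delay)  # assumption: simulates waiting on the network
    with lock:
        results[url] = "online"


def run_sequential(urls, delay=0.2):
    """Check the URLs one after another; return (results, elapsed seconds)."""
    results, lock = {}, threading.Lock()
    start = time.perf_counter()
    for url in urls:
        fake_check(url, results, lock, delay)
    return results, time.perf_counter() - start


def run_threaded(urls, delay=0.2):
    """Check the URLs with one thread each; return (results, elapsed seconds)."""
    results, lock = {}, threading.Lock()
    start = time.perf_counter()
    threads = [
        threading.Thread(target=fake_check, args=(url, results, lock, delay))
        for url in urls
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every worker before stopping the clock
    return results, time.perf_counter() - start


if __name__ == "__main__":
    urls = ["http://example.com/%d" % i for i in range(8)]
    _, sequential = run_sequential(urls)
    _, threaded = run_threaded(urls)
    print("sequential: %.2fs, threaded: %.2fs" % (sequential, threaded))
```

The sequential time grows with the number of URLs, while the threaded time stays close to a single delay, because the waits overlap. Threads help here because the work is waiting on I/O rather than computing.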
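Summary:
1. When there are many URLs to check, on the order of millions, the advantage of multithreading becomes very large.
2. This multithreaded code creates one thread per URL. That clearly does not scale to a very large number of URLs, so the checking code needs to be optimized.
3. When the URLs are kept in a database, writing the results back to the database efficiently also matters.
4. I don't think the function above is entirely correct, because of redirects: a URL may no longer exist, yet after a redirect it is still reported as online. This is something to improve later. If you have a better approach, leave me a comment so we can improve together; if I find one, I will post it on the blog.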
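Points 2 and 4 can be sketched together. The following is a minimal Python 3 sketch under stated assumptions, not the author's eventual solution: it caps the number of threads with `concurrent.futures.ThreadPoolExecutor`, and it treats a redirect that ends on a different hostname as a separate outcome instead of "online". The names `classify`, `check_url`, and `check_all`, the hostname comparison, the 5-second timeout, and the default of 8 workers are all illustrative choices.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlsplit


def classify(requested_url, final_url, status):
    """Label one response; flag a redirect that left the requested host."""
    if status != 200:
        return "offline (HTTP %d)" % status
    # Assumption: a different final hostname means the original URL itself
    # was not what answered, so it is reported separately.
    if urlsplit(requested_url).hostname != urlsplit(final_url).hostname:
        return "redirected to %s" % final_url
    return "online"


def check_url(url, timeout=5):
    """Fetch url (following redirects) and classify the outcome."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return classify(url, response.geturl(), response.status)
    except OSError as e:
        # HTTPError, URLError, and socket timeouts are all OSError subclasses.
        code = getattr(e, "code", None)
        if code is not None:  # the server answered with an error status
            return "offline (HTTP %d)" % code
        return "unreachable (%s)" % getattr(e, "reason", e)


def check_all(urls, check=check_url, max_workers=8):
    """Run check over urls with at most max_workers threads alive at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, whatever order they finish in
        return list(zip(urls, pool.map(check, urls)))
```

Calling `check_all(url_list)` returns a list of `(url, verdict)` pairs in the same order as the input. With this heuristic, `http://www.sina.com` would be reported as redirected, since the output above shows it resolving to `http://www.sina.com.cn/`. Whether a cross-host redirect should count as offline is a policy decision the sketch leaves open.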
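Update (2014.10.30)
1. Checking whether a URL is online with pycurl is more efficient.
2. Connected the checker to a database and stored the results there (a small project of my own, already finished).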
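The post does not say which database the project uses or how the results are written. As one possible shape for the storage step, here is a minimal Python 3 sketch that assumes SQLite through the standard `sqlite3` module. The `url_status` table and the `save_results` and `load_results` helpers are hypothetical. The point it illustrates is writing all results in one batch inside a single transaction instead of committing row by row.

```python
import sqlite3


def save_results(conn, results):
    """Store (url, status) pairs in one batch inside a single transaction."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS url_status ("
        "url TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(
            "INSERT OR REPLACE INTO url_status (url, status) VALUES (?, ?)",
            results,
        )


def load_results(conn):
    """Return all stored (url, status) rows, sorted by URL."""
    return conn.execute(
        "SELECT url, status FROM url_status ORDER BY url"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database for the demo
    save_results(conn, [("http://www.baidu.com", "online"),
                        ("http://www.13.com", "unreachable")])
    for row in load_results(conn):
        print(row)
```

A `sqlite3` connection should not be shared across threads by default, so a simple arrangement is to let the worker threads only collect results and have the main thread call `save_results` once at the end.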