Python: Following Links in HTML Using BeautifulSoup

2016-07-22 20:01 (1186 views)

The program uses urllib to read the HTML from the data file below, extracts the href= values from the anchor tags, scans for the tag at a particular position relative to the first name in the list, follows that link, repeats the process a number of times, and reports the last name found.

Find the link at position 18 (the first name is 1). Follow that link. Repeat this process 7 times. The answer is the last name that you retrieve.
Hint: The first character of the name of the last page that you will load is: M
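
The position-and-repeat rule can be tried offline before touching any HTML. The sketch below is only an illustration: the `known_by` dictionary and the names other than Cleo are made up, and `follow_chain` is a hypothetical helper, not part of the assignment. It shows that a 1-based position becomes the list index `position - 1`, and that each hop starts from the name reached by the previous one.

```python
def follow_chain(known_by, start, position, count):
    """Return the name reached after `count` hops from `start`.

    `position` is 1-based, as in the assignment, so it is converted to a
    0-based list index with `position - 1`.
    """
    name = start
    for _ in range(count):
        name = known_by[name][position - 1]
    return name


# Hypothetical miniature data set: each name maps to the names linked
# from its page, in the order the links appear.
known_by = {
    "Cleo": ["Ana", "Ben", "Cy"],
    "Ana": ["Ben", "Cy", "Cleo"],
    "Ben": ["Cy", "Cleo", "Ana"],
    "Cy": ["Cleo", "Ana", "Ben"],
}

# Cleo -> Ben -> Cleo -> Ben
print(follow_chain(known_by, "Cleo", position=2, count=3))  # prints Ben
```

In the real task the list for each name comes from the anchor tags of that name's page, with position 18 and 7 repetitions.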

HTML URL: http://python-data.dr-chuck.net/known_by_Cleo.html

Python source:

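# Python 2 script (uses raw_input, xrange, urllib.urlopen and the print statement)
import urllib
from bs4 import BeautifulSoup

url = raw_input('Enter - ')
count = int(raw_input('Enter count:'))
position = int(raw_input('Enter position:'))

# Repeat `count` times: load the page and move to the link at `position`.
for _ in xrange(count):
    html = urllib.urlopen(url).read()
    soup = BeautifulSoup(html, 'html.parser')
    tags = soup.find_all('a')
    # `position` is 1-based, so subtract 1 to get the list index.
    url = tags[position - 1].get('href', None)

# URL of the last page reached
print url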
Output:
Enter - http://python-data.dr-chuck.net/known_by_Cleo.html
Enter count:7
Enter position:18
http://python-data.dr-chuck.net/known_by_Mirrin.html
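
The script above targets Python 2. For readers on Python 3, here is a rough port of the same loop, offered as a sketch rather than a definitive version. The function names `anchor_hrefs`, `fetch_html` and `follow_links` are my own. The import fallback to the standard library's `html.parser` is an addition so the sketch still runs when `beautifulsoup4` is not installed. As in the original, it assumes every href is an absolute URL.

```python
import urllib.request
from html.parser import HTMLParser

try:
    from bs4 import BeautifulSoup  # third-party package: beautifulsoup4
except ImportError:
    BeautifulSoup = None  # fall back to the standard-library parser below


class _AnchorCollector(HTMLParser):
    """Collects the href value of every <a> tag in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.append(dict(attrs).get("href"))


def anchor_hrefs(html):
    """Return the href of each <a> tag in order (None where href is missing)."""
    if BeautifulSoup is not None:
        soup = BeautifulSoup(html, "html.parser")
        return [a.get("href") for a in soup.find_all("a")]
    collector = _AnchorCollector()
    collector.feed(html)
    return collector.hrefs


def fetch_html(url):
    """Download a page and decode it as text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def follow_links(url, count, position, fetch=fetch_html):
    """Follow the link at 1-based `position`, `count` times; return the last URL.

    `fetch` is injectable so the loop can be exercised without network access.
    """
    for _ in range(count):
        hrefs = anchor_hrefs(fetch(url))
        url = hrefs[position - 1]  # 1-based position -> 0-based index
    return url


# Example call (needs network access and assumes the page is still online):
# print(follow_links("http://python-data.dr-chuck.net/known_by_Cleo.html", 7, 18))
```

Passing a different `fetch` callable, for example a dictionary lookup over saved pages, lets the loop be checked offline.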
                                            