How to Use BeautifulSoup
2016-10-28 20:13
# coding: utf-8
from bs4 import BeautifulSoup
doc = ['<html><head><title>Page title</title></head>',
'<body><p id="firstpara" align="center">This is paragraph <b>one</b>.</p>',
'<p id="secondpara" align="blah">This is paragraph <b>two</b>.</p>',
'</html>']
# Name the parser explicitly; otherwise bs4 picks one and emits a warning
soup = BeautifulSoup(''.join(doc), 'html.parser')
# The parsed tree, roughly as soup.prettify() renders it:
# <html>
# <head>
# <title>
# Page title
# </title>
# </head>
# <body>
# <p id="firstpara" align="center">
# This is paragraph
# <b>
# one
# </b>
# .
# </p>
# <p id="secondpara" align="blah">
# This is paragraph
# <b>
# two
# </b>
# .
# </p>
# </body>
# </html>
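titleTag = soup.html.head.title
print(titleTag)
# <title>Page title</title>
print(titleTag.string)
# Page title
print(len(soup('p')))
# Number of <p> tags (soup('p') is shorthand for soup.find_all('p'))
print(soup.find('p', align="center"))
# The first <p> tag whose align attribute is "center"
print(soup('p', align="center")[0]['id'])
# The id of the first <p> tag whose align attribute is "center"
print(soup.find('p').b.string)  # Text of the <b> tag inside the first <p> tag
print(soup('p')[1].b.string)  # Text of the <b> tag inside the second <p> tag
titleTag['id'] = 'theTitle'  # Modify the tree: set an id attribute on <title>
soup.p.extract()  # Remove the first <p> tag from the tree
print(soup)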
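
The lookups and the extract() call above can also be checked in a small self-contained script. This is only a sketch, assuming Python 3 with bs4 installed and the built-in "html.parser" backend. The regular-expression match on the id attribute is an added illustration rather than part of the original script, and the names html, para_ids, removed and remaining_ids are made up for the example.

```python
import re

from bs4 import BeautifulSoup

# Same sample document as above, joined into one string (with </body> closed).
html = (
    '<html><head><title>Page title</title></head>'
    '<body><p id="firstpara" align="center">This is paragraph <b>one</b>.</p>'
    '<p id="secondpara" align="blah">This is paragraph <b>two</b>.</p>'
    '</body></html>'
)
soup = BeautifulSoup(html, "html.parser")

# An attribute filter may be a compiled regular expression instead of a string:
# collect the id of every <p> whose id ends in "para".
para_ids = [p["id"] for p in soup.find_all("p", id=re.compile(r"para$"))]

# extract() detaches the tag from the tree and returns the detached tag.
removed = soup.p.extract()

# After the extraction only the second paragraph is left in the tree.
remaining_ids = [p["id"] for p in soup.find_all("p")]

print(para_ids)
print(removed["id"])
print(remaining_ids)
```

Because extract() returns the tag it removes, the detached paragraph can still be inspected or inserted somewhere else afterwards.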