
Python Plugin Notes (3): PyQuery, a Handy Web Page Parsing Tool for Python

2010-08-15 07:56

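Ahem, 米趣网, the core Chinese MeeGo site, has a new blog post. PyQuery is a Python tool for parsing web page content. It is a bit like BeautifulSoup, but its features are much closer to jQuery. Anyone who has used BeautifulSoup will remember how it works, and it is nowhere near as simple and clear as jQuery's syntax.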

First, an introduction:

PyQuery lets you run jQuery-style queries on XML documents. Its API is kept as close to jQuery's as possible. PyQuery uses lxml for fast XML and HTML parsing.

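However, PyQuery is not (or at least not yet) a library for producing or interacting with JavaScript code. I (the PyQuery author) simply liked the jQuery API and could not find anything like it in Python, so I told myself to build such a tool in Python, and PyQuery is the result.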

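PyQuery can be used for many purposes. One idea I may try in the future is to use it for templating with pure http templates. I also use it for web scraping and for theming applications with Deliverance.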

Next, a short demo to give you a feel for PyQuery:

>>> from pyquery import PyQuery as pq
>>> from lxml import etree
>>> from urllib.request import urlopen
>>> d = pq("<html></html>")
>>> d = pq(etree.fromstring("<html></html>"))
>>> d = pq(url='http://google.com/')
>>> d = pq(url='http://google.com/', opener=lambda url, **kw: urlopen(url).read())
>>> d = pq(filename=path_to_html_file)

The d above works much like $ in jQuery:

>>> d("#hello")
[<p#hello.hello>]
>>> p = d("#hello")
>>> p.html()
'Hello world !'
>>> p.html("you know <a href='http://python.org/'>Python</a> rocks")
[<p#hello.hello>]
>>> p.html()
'you know <a href="http://python.org/">Python</a> rocks'
>>> p.text()
'you know Python rocks'

You can also use some of the pseudo-classes that jQuery provides but that are not part of the CSS standard, such as :first :last :even :odd :eq :lt :gt :checked :selected :file:

>>> d('p:first')
[<p#hello.hello>]

That is all from me. For everything else, see the PyQuery documentation.
