Installing Scrapy on CentOS: the process and the problems encountered
2015-01-04 11:06
Problems:
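1. CentOS 6.6 ships with Python 2.6.6, and yum does not work with Python 2.7, while Scrapy needs Python 2.7. This causes some trouble, but fortunately Python 2.6.6 and Python 2.7 can coexist on the same machine.
Run vim /usr/bin/scrapy and change the first line (the shebang) to the path of Python 2.7 (mine is /usr/local/bin/python2.7), or use a symbolic link instead.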
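The shebang edit can also be scripted instead of done by hand in vim. The following is a minimal sketch rather than the method used in this post: set_shebang is a hypothetical helper name, and it assumes the target file already starts with a "#!" line.

```shell
# set_shebang SCRIPT INTERPRETER
# Replace the first line of SCRIPT with "#!INTERPRETER".
# Refuses to touch a file whose first line is not already a shebang.
set_shebang() {
    script="$1"
    interp="$2"
    head -n 1 "$script" | grep -q '^#!' || return 1
    tmp="$(mktemp)" || return 1
    # New first line, followed by everything from line 2 onward.
    { printf '#!%s\n' "$interp"; tail -n +2 "$script"; } > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write back into the existing file so it keeps its permissions.
    cat "$tmp" > "$script"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

With the paths quoted above, the call would be: set_shebang /usr/bin/scrapy /usr/local/bin/python2.7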
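2. No package 'libffi' found (hit while fixing pkg_resources.DistributionNotFound: cryptography>=0.2.1)
Google gave a more accurate answer for this one. (/article/11334774.html)
1 After installation, running scrapy fails with
pkg_resources.DistributionNotFound: cryptography>=0.2.1, so I ran
easy_install cryptography, which in turn failed with No package 'libffi' found.
2 I checked with yum install libffi, but yum reported that libffi was already installed.
3 easy_install cryptography compiles the package from source, so it needs the development package libffi-devel.
After installing it with yum install libffi-devel, easy_install cryptography completed without errors.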
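The wording "No package 'libffi' found" is the message pkg-config prints when it cannot find a libffi.pc file, which on CentOS normally comes with libffi-devel rather than with the runtime libffi package. A small sketch for checking this before compiling; have_pc is a hypothetical helper name, and it reports failure when pkg-config itself is not installed.

```shell
# have_pc NAME
# Succeeds only if pkg-config is installed and can find package NAME.
have_pc() {
    command -v pkg-config >/dev/null 2>&1 || return 1
    pkg-config --exists "$1"
}
```

For example, have_pc libffi || yum install libffi-devel would install the development package only when the check fails.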
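Detailed installation procedure:
Installing Scrapy on CentOS (reposted from /article/5507399.html)
Scrapy is an open-source, single-machine crawler for Python built on the Twisted framework. It bundles most of the tools needed for web scraping, covering both the download side and the extraction side of a crawler.
Environment:
CentOS 6.6, Python 2.7.3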
Installation steps:
1. Download and build Python 2.7: http://www.python.org/ftp/python/2.7.3/Python-2.7.3.tgz
[root@zxy-websgs ~]# wget http://www.python.org/ftp/python/2.7.3/Python-2.7.3.tgz -P /opt
[root@zxy-websgs opt]# tar xvf Python-2.7.3.tgz
[root@zxy-websgs Python-2.7.3]# ./configure
[root@zxy-websgs Python-2.7.3]# make && make install
Verify the Python 2.7 installation:
[root@zxy-websgs Python-2.7.3]# python2.7
Python 2.7.3 (default, Feb 28 2013, 03:08:43)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-50)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
2. Install setuptools: http://pypi.python.org/packages/source/s/setuptools/setuptools-0.6c11.tar.gz
[root@zxy-websgs ~]# wget http://pypi.python.org/packages/source/s/setuptools/setuptools-0.6c11.tar.gz -P /opt/
[root@zxy-websgs opt]# tar zxvf setuptools-0.6c11.tar.gz
[root@zxy-websgs setuptools-0.6c11]# python2.7 setup.py install
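The same download, unpack, and setup.py install sequence comes up again for other packages in this post, so it can be wrapped in a function. This is only a sketch under stated assumptions: install_sdist and src_dir_from_url are hypothetical names, the archive is assumed to unpack into a directory named after the file, /opt is the download directory used in this post, and PYTHON defaults to python2.7.

```shell
# src_dir_from_url URL
# Print the directory name a source tarball is assumed to unpack into.
src_dir_from_url() {
    name="${1##*/}"        # keep only the file name
    name="${name%%#*}"     # drop a trailing "#md5=..." fragment, if present
    case "$name" in
        *.tar.gz)  name="${name%.tar.gz}" ;;
        *.tar.bz2) name="${name%.tar.bz2}" ;;
        *.tgz)     name="${name%.tgz}" ;;
    esac
    printf '%s\n' "$name"
}

# install_sdist URL [DEST]
# Download URL into DEST (default /opt), unpack it, and run setup.py install.
install_sdist() {
    url="${1%%#*}"
    dest="${2:-/opt}"
    file="${url##*/}"
    dir="$(src_dir_from_url "$url")"
    wget "$url" -P "$dest" || return 1
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$dest" && tar xf "$file" && cd "$dir" && "${PYTHON:-python2.7}" setup.py install )
}
```

For the step above, the call would be: install_sdist http://pypi.python.org/packages/source/s/setuptools/setuptools-0.6c11.tar.gz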
3. Install Twisted
[root@zxy-websgs setuptools-0.6c11]# easy_install Twisted
......
Installed /usr/local/lib/python2.7/site-packages/Twisted-12.3.0-py2.7-linux-x86_64.egg
......
Installed /usr/local/lib/python2.7/site-packages/zope.interface-4.0.4-py2.7-linux-x86_64.egg
Twisted requires zope.interface. Both can also be downloaded from the addresses below:
zope.interface: http://pypi.python.org/packages/source/z/zope.interface/zope.interface-4.0.1.tar.gz
twisted: http://twistedmatrix.com/Releases/Twisted/12.1/Twisted-12.1.0.tar.bz2
5. Install w3lib
[root@zxy-websgs setuptools-0.6c11]# easy_install -U w3lib
Searching for w3lib
Reading http://pypi.python.org/simple/w3lib/
Reading http://github.com/scrapy/w3lib
Best match: w3lib 1.2
Downloading http://pypi.python.org/packages/source/w/w3lib/w3lib-1.2.tar.gz#md5=f929d5973a9fda59587b09a72f185a9e
Processing w3lib-1.2.tar.gz
Running w3lib-1.2/setup.py -q bdist_egg --dist-dir /tmp/easy_install-wm_1BB/w3lib-1.2/egg-dist-tmp-2DQHY_
zip_safe flag not set; analyzing archive contents...
Adding w3lib 1.2 to easy-install.pth file
Installed /usr/local/lib/python2.7/site-packages/w3lib-1.2-py2.7.egg
Processing dependencies for w3lib
Finished processing dependencies for w3lib
w3lib: http://pypi.python.org/packages/source/w/w3lib/w3lib-1.2.tar.gz
6. Install libxml2, or install lxml with easy_install
[root@zxy-websgs lxml-3.1.0]# easy_install lxml
Verify the lxml installation:
[root@zxy-websgs lxml-3.1.0]# python2.7
Python 2.7.3 (default, Feb 28 2013, 03:08:43)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-50)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import lxml
>>> exit()
Alternatively, install libxml2. The official site recommends version 2.6.28 or later, but I could not find that version there. I first installed version 2.6.9, and running scrapy then failed with the following error:
Traceback (most recent call last):
  File "/usr/local/bin/scrapy", line 5, in <module>
    pkg_resources.run_script('Scrapy==0.14.4', 'scrapy')
  File "build/bdist.linux-x86_64/egg/pkg_resources.py", line 489, in run_script
  File "build/bdist.linux-x86_64/egg/pkg_resources.py", line 1207, in run_script
  File "/usr/local/lib/python2.7/site-packages/Scrapy-0.14.4-py2.7.egg/EGG-INFO/scripts/scrapy", line 4, in <module>
    execute()
  File "/usr/local/lib/python2.7/site-packages/Scrapy-0.14.4-py2.7.egg/scrapy/cmdline.py", line 112, in execute
    cmds = _get_commands_dict(inproject)
  File "/usr/local/lib/python2.7/site-packages/Scrapy-0.14.4-py2.7.egg/scrapy/cmdline.py", line 37, in _get_commands_dict
    cmds = _get_commands_from_module('scrapy.commands', inproject)
  File "/usr/local/lib/python2.7/site-packages/Scrapy-0.14.4-py2.7.egg/scrapy/cmdline.py", line 30, in _get_commands_from_module
    for cmd in _iter_command_classes(module):
  File "/usr/local/lib/python2.7/site-packages/Scrapy-0.14.4-py2.7.egg/scrapy/cmdline.py", line 21, in _iter_command_classes
    for module in walk_modules(module_name):
  File "/usr/local/lib/python2.7/site-packages/Scrapy-0.14.4-py2.7.egg/scrapy/utils/misc.py", line 65, in walk_modules
    submod = __import__(fullpath, {}, {}, [''])
  File "/usr/local/lib/python2.7/site-packages/Scrapy-0.14.4-py2.7.egg/scrapy/commands/shell.py", line 8, in <module>
    from scrapy.shell import Shell
  File "/usr/local/lib/python2.7/site-packages/Scrapy-0.14.4-py2.7.egg/scrapy/shell.py", line 14, in <module>
    from scrapy.selector import XPathSelector, XmlXPathSelector, HtmlXPathSelector
  File "/usr/local/lib/python2.7/site-packages/Scrapy-0.14.4-py2.7.egg/scrapy/selector/__init__.py", line 30, in <module>
    from scrapy.selector.libxml2sel import *
  File "/usr/local/lib/python2.7/site-packages/Scrapy-0.14.4-py2.7.egg/scrapy/selector/libxml2sel.py", line 12, in <module>
    from .factories import xmlDoc_from_html, xmlDoc_from_xml
  File "/usr/local/lib/python2.7/site-packages/Scrapy-0.14.4-py2.7.egg/scrapy/selector/factories.py", line 14, in <module>
    libxml2.HTML_PARSE_NOERROR + \
AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
Upgrading to version 2.6.21 fixed it.
libxml2-python 2.6.21: ftp://xmlsoft.org/libxml2/python/libxml2-python-2.6.21.tar.gz
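The traceback fails because the installed libxml2 Python module has no HTML_PARSE_RECOVER attribute, so that attribute can be probed directly before running scrapy. A minimal sketch, assuming the interpreter is python2.7 (overridable through the PYTHON variable); py_has_attr is a hypothetical helper name.

```shell
# Interpreter to test against; python2.7 is an assumption taken from this post.
PYTHON="${PYTHON:-python2.7}"

# py_has_attr MODULE ATTR
# Succeeds only if MODULE can be imported and exposes ATTR.
py_has_attr() {
    "$PYTHON" -c 'import sys, importlib; sys.exit(0 if hasattr(importlib.import_module(sys.argv[1]), sys.argv[2]) else 1)' "$1" "$2" >/dev/null 2>&1
}
```

For the error above, the check would be: py_has_attr libxml2 HTML_PARSE_RECOVER || echo "libxml2 Python bindings lack HTML_PARSE_RECOVER"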
7. Install pyOpenSSL (optional; its main purpose is to let Scrapy support HTTPS)
easy_install pyOpenSSL picked pyOpenSSL-0.13, which failed to install, so I downloaded version 0.11 manually and installed that instead.
[root@zxy-websgs opt]# wget http://launchpadlibrarian.net/58498441/pyOpenSSL-0.11.tar.gz -P /opt
[root@zxy-websgs opt]# tar zxvf pyOpenSSL-0.11.tar.gz
[root@zxy-websgs pyOpenSSL-0.11]# python2.7 setup.py install
pyOpenSSL: http://launchpadlibrarian.net/58498441/pyOpenSSL-0.11.tar.gz
8. Install Scrapy
[root@zxy-websgs pyOpenSSL-0.11]# easy_install -U Scrapy
Verify the installation:
[root@zxy-websgs pyOpenSSL-0.11]# scrapy
Scrapy 0.16.4 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  fetch         Fetch a URL using the Scrapy downloader
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

  [ more ]      More commands available when run from project directory

Use "scrapy <command> -h" to see more info about a command
scrapy: http://pypi.python.org/packages/source/S/Scrapy/Scrapy-0.14.4.tar.gz
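Besides running scrapy itself, each dependency from the earlier steps can be import-tested in one pass. This is a sketch, not part of the original procedure: missing_modules is a hypothetical helper, PYTHON is assumed to be python2.7, and the module names in the usage line are the usual import names of the packages installed above (pyOpenSSL is imported as OpenSSL).

```shell
# Interpreter to test against; python2.7 is an assumption taken from this post.
PYTHON="${PYTHON:-python2.7}"

# missing_modules MODULE...
# Print every module that fails to import; succeed only if none are missing.
missing_modules() {
    rc=0
    for mod in "$@"; do
        if ! "$PYTHON" -c 'import sys, importlib; importlib.import_module(sys.argv[1])' "$mod" >/dev/null 2>&1; then
            printf '%s\n' "$mod"
            rc=1
        fi
    done
    return "$rc"
}
```

Usage: missing_modules twisted zope.interface w3lib lxml OpenSSL scrapy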
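Summary:
Installing pyOpenSSL on its own with easy_install did not succeed. One option is to download and install pyOpenSSL 0.11 first, then run easy_install -U Scrapy to install everything else in one go.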