Scikit-learn: Getting Started with a Python Machine Learning Toolkit

2014-05-21 19:30
1. Download
https://github.com/scikit-learn/scikit-learn
Official site: http://scikit-learn.org/stable/

2. Installation

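According to the official documentation, scikit-learn requires numpy and scipy. I first tried running the following directly in the source directory: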

sudo python setup.py install
This did not work. Importing sklearn afterwards produced the following error:

>>> import sklearn
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "sklearn/__init__.py", line 37, in <module>
    from . import __check_build
  File "sklearn/__check_build/__init__.py", line 46, in <module>
    raise_build_error(e)
  File "sklearn/__check_build/__init__.py", line 41, in raise_build_error
    %s""" % (e, local_dir, ''.join(dir_content).strip(), msg))
ImportError: No module named _check_build
___________________________________________________________________________
Contents of sklearn/__check_build:
__init__.py               __init__.pyc              _check_build.c
_check_build.pyx          setup.py                  setup.pyc
___________________________________________________________________________
It seems that scikit-learn has not been built correctly.

If you have installed scikit-learn from source, please do not forget
to build the package before using it: run `python setup.py install` or
`make` in the source directory.

If you have used an installer, please check that it is suited for your
Python version, your operating system and your platform.
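
The relative path `sklearn/__init__.py` in the traceback suggests that Python picked up the unbuilt source tree from the current directory rather than an installed copy. That is an inference from the traceback, not something confirmed above. A minimal sketch for checking where an import would resolve, assuming Python 3.4+ (the session in this post used Python 2.7); the helper names `locate_package` and `is_shadowed_by_cwd` are hypothetical:

```python
import importlib.util
import os


def locate_package(name):
    """Return the directory a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    # Built-in and frozen modules have no file location on disk.
    if spec is None or not spec.has_location or spec.origin is None:
        return None
    return os.path.dirname(os.path.realpath(spec.origin))


def is_shadowed_by_cwd(name, cwd=None):
    """True if the package directory for `name` sits directly inside `cwd`."""
    cwd = os.path.realpath(cwd or os.getcwd())
    location = locate_package(name)
    return location is not None and os.path.dirname(location) == cwd


if __name__ == "__main__":
    print("sklearn resolves to:", locate_package("sklearn"))
    print("shadowed by current directory:", is_shadowed_by_cwd("sklearn"))
```

If the second line prints `True` while you are inside the scikit-learn source directory, starting Python from a different directory is a cheap thing to try before reinstalling anything.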


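While trying to reinstall numpy and scipy, I discovered that the Mac system already ships with quite a few libraries of its own, as listed below: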

CoreGraphics/
OpenSSL/
PyObjC/
Twisted-12.2.0-py2.7.egg-info/
altgraph/
altgraph-0.10.1-py2.7.egg-info/
bdist_mpkg/
bdist_mpkg-0.4.4-py2.7.egg-info/
bonjour/
dateutil/
macholib/
macholib-1.5-py2.7.egg-info/
matplotlib/
modulegraph/
modulegraph-0.10.1-py2.7.egg-info/
mpl_toolkits/
numpy/
py2app/
py2app-0.7.1-py2.7.egg-info/
python_dateutil-1.5-py2.7.egg-info/
pytz/
pytz-2012d-py2.7.egg-info/
scipy/
setuptools/
setuptools-0.6c12dev_r88846-py2.7.egg-info/
twisted/
xattr/
xattr-0.6.4-py2.7.egg-info/
zope/
zope.interface-3.8.0-py2.7.egg-info/
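I then tried several other approaches, including pip and easy_install, and each of them reported errors. In the end I deleted the original files under site-packages and reinstalled, which succeeded. (The initial failure may have been because I had not restarted the terminal and re-entered python.)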
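
After reinstalling, it can help to confirm which copies of numpy, scipy, and sklearn the interpreter actually loads, since a system copy and a fresh install may coexist. A minimal sketch, assuming each package exposes a `__version__` attribute; the helper name `report_modules` is hypothetical:

```python
import importlib


def report_modules(names):
    """Map each module name to (version, file path), or None if the import fails."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # A missing package or a broken build both end up here.
            report[name] = None
            continue
        version = getattr(module, "__version__", "unknown")
        path = getattr(module, "__file__", "unknown")
        report[name] = (version, path)
    return report


if __name__ == "__main__":
    for name, info in report_modules(["numpy", "scipy", "sklearn"]).items():
        print(name, "->", info if info else "not importable")
```

Run it from a freshly opened terminal, outside the source directory, so that neither a stale session nor the local source tree affects the result.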

3. Testing and Learning

➜  ~  python
Python 2.7.5 (default, Sep 12 2013, 21:33:34)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sklearn
>>> from sklearn import datasets
>>> iris = datasets.load_iris()
>>> digits = datasets.load_digits()
>>> print(digits.data)
[[  0.   0.   5. ...,   0.   0.   0.]
 [  0.   0.   0. ...,  10.   0.   0.]
 [  0.   0.   0. ...,  16.   9.   0.]
 ...,
 [  0.   0.   1. ...,   6.   0.   0.]
 [  0.   0.   2. ...,  12.   0.   0.]
 [  0.   0.  10. ...,  12.   1.   0.]]
>>>
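
The session above only prints the raw digits array. As a next step, here is a minimal sketch that reports the shapes of both datasets and fits a simple classifier on the digits data. It assumes a recent scikit-learn release on Python 3 (`sklearn.model_selection` did not exist in the 2014 release used here); the function names, the choice of `SVC`, and the `gamma` value are illustrative assumptions, not part of the original walkthrough:

```python
def explore_datasets():
    """Load the bundled datasets and return their (data, target) shapes."""
    # Imported inside the function so a missing install surfaces as a
    # plain ImportError at call time.
    from sklearn import datasets

    iris = datasets.load_iris()
    digits = datasets.load_digits()
    return {
        "iris": (iris.data.shape, iris.target.shape),
        "digits": (digits.data.shape, digits.target.shape),
    }


def digits_holdout_accuracy(test_size=0.25, seed=0):
    """Fit an SVC on part of the digits data and score it on the held-out rest."""
    from sklearn import datasets, svm
    from sklearn.model_selection import train_test_split

    digits = datasets.load_digits()
    X_train, X_test, y_train, y_test = train_test_split(
        digits.data, digits.target, test_size=test_size, random_state=seed
    )
    clf = svm.SVC(gamma=0.001)  # gamma chosen for illustration only
    clf.fit(X_train, y_train)
    return clf.score(X_test, y_test)
```

Calling `explore_datasets()` should report 150 samples with 4 features for iris and 1797 samples with 64 features for digits; `digits_holdout_accuracy()` returns the fraction of held-out digits classified correctly.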


4. Next Steps

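I plan to work through the bundled examples and write a follow-up summary of the commonly used machine learning algorithms. The examples are good learning material: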
http://scikit-learn.org/stable/auto_examples/feature_selection_pipeline.html