
Problems Encountered While Learning Python

2015-11-13 21:49 (645 views)

These notes follow Wes McKinney's "Python for Data Analysis" (O'Reilly), using the Anaconda3 distribution.

1.1 MovieLens data example: print the first five user records. The code is as follows:

import pandas as pd
unames = ['user_id', 'gender', 'age', 'occupation', 'zip']
users = pd.read_table('ch02/movielens/users.dat', sep='::', header=None, names=unames)

Warning message:

D:\Program Files\Anaconda\lib\site-packages\pandas\io\parsers.py:648: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators;you can avoid this warning by specifying engine='python'.  ParserWarning)

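Fix: add engine='python' as the last argument of the read_table call. The separator "::" is longer than one character, so pandas treats it as a regular expression, which the C parser does not support.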

users = pd.read_table('ch02/movielens/users.dat', sep='::', header=None, names=unames, engine='python')
users[:5]  # show the first five users
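
To try the fix without downloading the dataset, here is a minimal sketch that parses a few "::"-separated rows from memory. The sample rows are invented for illustration and are not taken from users.dat. It uses pd.read_csv, which accepts the same sep, header, names and engine arguments as read_table.

```python
import io
import pandas as pd

# Invented sample rows in a "::"-separated layout (illustration only,
# not real MovieLens data).
sample = io.StringIO(
    "1::F::25::3::10001\n"
    "2::M::35::7::20002\n"
    "3::M::18::4::30003\n"
)

unames = ['user_id', 'gender', 'age', 'occupation', 'zip']

# engine='python' is requested explicitly because "::" is a
# multi-character separator, which pandas treats as a regular expression.
users = pd.read_csv(sample, sep='::', header=None, names=unames, engine='python')

print(users.head())
```

With engine='python' passed explicitly, the ParserWarning about falling back from the 'c' engine should no longer appear.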

1.2 MovieLens 1M dataset example: using pivot_table() to compute the mean rating of each movie by gender

mean_ratings = data.pivot_table('rating', rows = 'title', cols = 'gender', aggfunc = 'mean')

Error message:

Traceback (most recent call last):

File "<ipython-input-28-669a36c33797>", line 1, in <module>
mean_ratings = data.pivot_table('rating', rows = 'title', cols = 'gender', aggfunc = 'mean')

TypeError: pivot_table() got an unexpected keyword argument 'rows'

Fix: replace rows with index and cols with columns. Newer pandas versions no longer accept the old keyword names used in the book.

mean_ratings = data.pivot_table('rating', index= 'title', columns= 'gender', aggfunc = 'mean')
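
The same call can be checked on a tiny hand-made table. This is only a sketch: the DataFrame below is invented for illustration and merely stands in for the merged MovieLens data, with the same three column names.

```python
import pandas as pd

# Invented ratings table (illustration only); column names mirror the example above.
data = pd.DataFrame({
    'title':  ['Movie A', 'Movie A', 'Movie B', 'Movie B', 'Movie B'],
    'gender': ['F', 'M', 'F', 'M', 'M'],
    'rating': [4, 2, 5, 3, 1],
})

# One row per title, one column per gender, each cell the mean rating.
mean_ratings = data.pivot_table(values='rating', index='title',
                                columns='gender', aggfunc='mean')

print(mean_ratings)
```

Each title becomes a row label and each gender a column, so mean_ratings.loc['Movie B', 'M'] is the mean of the two male ratings for that movie.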

1.3 numpy.cumsum: computing cumulative sums

# Create a 5x4 array (assumes numpy has been imported as np)
In [1]: arr = np.arange(1, 21).reshape(5, 4); arr
Out[1]:
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16],
       [17, 18, 19, 20]])

# arr.cumsum() with no axis: running total over the flattened array
In [2]: arr.cumsum()
Out[2]:
array([  1,   3,   6,  10,  15,  21,  28,  36,  45,  55,  66,  78,  91, 105, 120, 136, 153, 171, 190, 210])

# arr.cumsum(0): with axis 0, accumulate down the rows, giving a running total within each column
In [3]: arr.cumsum(0)
Out[3]:
array([[ 1,  2,  3,  4],
       [ 6,  8, 10, 12],
       [15, 18, 21, 24],
       [28, 32, 36, 40],
       [45, 50, 55, 60]])

# arr.cumsum(1): with axis 1, accumulate across the columns, giving a running total within each row
In [4]: arr.cumsum(1)
Out[4]:
array([[ 1,  3,  6, 10],
       [ 5, 11, 18, 26],
       [ 9, 19, 30, 42],
       [13, 27, 42, 58],
       [17, 35, 54, 74]])
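
One way to keep the two axes straight is to compare the last entry of each running total with the plain sum over the same axis. The sketch below does that for the same 5x4 array, passing axis by keyword for readability. The variable names flat, down and across are only for illustration.

```python
import numpy as np

arr = np.arange(1, 21).reshape(5, 4)

flat = arr.cumsum()          # no axis: running total over the flattened array
down = arr.cumsum(axis=0)    # running total down each column
across = arr.cumsum(axis=1)  # running total along each row

# The final entry of a running total equals the plain sum over the same axis.
print(flat[-1] == arr.sum())
print(np.array_equal(down[-1], arr.sum(axis=0)))
print(np.array_equal(across[:, -1], arr.sum(axis=1)))
```

The last row of down holds the column totals and the last column of across holds the row totals, which matches the outputs shown above.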
