Commonly used pandas functions
2017-05-17 22:54
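1. When an index contains duplicate values, obj.index.is_unique reports it (it returns False). Selecting a duplicated label, for example obj['a'], returns every entry that carries that label.
2. Common arithmetic and descriptive-statistics methods of DataFrame: see https://chrisalbon.com/python/pandas_dataframe_descriptive_stats.html. For the list of methods, see Python for Data Analysis, p. 139, Table 5-10.
3. import pandas_datareader as web lets you fetch stock data to use as a statistical sample. The supported web sources and how to use them are documented at:
https://pandas-datareader.readthedocs.io/en/latest/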
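As a small illustration of point 1, the sketch below shows how the result type depends on whether the selected label repeats. The variable names are chosen here for illustration, and Index.duplicated() is an extra pandas method not mentioned in the notes above.

```python
import pandas as pd

obj = pd.Series(range(5), index=['a', 'a', 'b', 'b', 'c'])

# A repeated label selects every matching entry, so the result is a Series.
dup = obj['a']
# A label that occurs only once selects a single scalar value.
single = obj['c']

print(obj.index.is_unique)
print(type(dup).__name__, len(dup))
print(single)

# Index.duplicated() flags the second and later occurrences of each label.
dup_mask = obj.index.duplicated()
print(dup_mask.tolist())
```

Because the return type changes with the data, code that indexes by label is easier to reason about if it checks is_unique first.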
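(1) Series with Series
returns.MSFT.corr(returns.IBM)  correlation coefficient
returns.MSFT.cov(returns.IBM)  covariance
(2) DataFrame with itself (pairwise across its columns)
returns.corr()
returns.cov()
(3) DataFrame with a Series
returns.corrwith(returns.IBM)
(4) DataFrame with a DataFrame (columns are matched by name)
returns.corrwith(volume)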
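The four cases above can be tried without downloading anything. The sketch below is a minimal example that assumes synthetic data: the returns and volume frames are filled with random numbers, so the values themselves mean nothing and only the shapes of the results are of interest.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.date_range('2016-01-04', periods=250, freq='B')
tickers = ['AAPL', 'IBM', 'MSFT', 'GOOG']

# Synthetic stand-ins for real market data (assumption: random values only).
returns = pd.DataFrame(rng.normal(0, 0.01, size=(250, 4)),
                       index=dates, columns=tickers)
volume = pd.DataFrame(rng.integers(1_000_000, 5_000_000, size=(250, 4)),
                      index=dates, columns=tickers)

# (1) Series with Series: each call returns a single number.
pair_corr = returns.MSFT.corr(returns.IBM)
pair_cov = returns.MSFT.cov(returns.IBM)

# (2) DataFrame with itself: a square matrix over all column pairs.
corr_matrix = returns.corr()
cov_matrix = returns.cov()

# (3) DataFrame with a Series: one correlation per column.
vs_ibm = returns.corrwith(returns.IBM)

# (4) DataFrame with a DataFrame: correlates columns that share a name.
vs_volume = returns.corrwith(volume)

print(corr_matrix.shape, vs_ibm.shape, vs_volume.shape)
```

The pairwise entry corr_matrix.loc['MSFT', 'IBM'] is the same number as the Series-level call in case (1).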
    import numpy as np
    from pandas import DataFrame, Series

    # Axis indexes with duplicate values
    print("Axis indexes with duplicate values")
    obj = Series(range(5), index=['a', 'a', 'b', 'b', 'c'])
    print("obj is \n", obj)
    print("obj.index.is_unique is ", obj.index.is_unique)
    print("obj['a'] is \n", obj['a'])
    print("obj['b'] is \n", obj['b'])

    df = DataFrame(np.random.randn(4, 3), index=['a', 'a', 'b', 'b'])
    print("df is \n", df)
    # .ix has been removed from pandas; .loc performs the same label-based selection
    print("df.loc['b'] is \n", df.loc['b'])

    # Descriptive statistics on a frame that contains missing values
    df = DataFrame([[1.4, np.nan], [7.1, -4.5], [np.nan, np.nan], [0.75, -1.3]],
                   index=['a', 'b', 'c', 'd'], columns=['one', 'two'])
    print("df is \n", df)
    print("Calling DataFrame's sum method returns a Series containing column sums")
    print("df.sum() is \n", df.sum())
    print("Passing axis=1 sums over the rows instead")
    print("df.sum(axis=1) \n", df.sum(axis=1))
    print("NA values are excluded unless the entire slice is NA. This can be disabled using the skipna option")
    print("df.mean(axis=1, skipna=False) \n", df.mean(axis=1, skipna=False))
    print("df.idxmax() returns indirect statistics like the index value where the maximum values are attained \n", df.idxmax())
    print("df.cumsum() returns the cumulative sum of values \n", df.cumsum())
    print("df.describe() returns multiple summary statistics in one shot \n", df.describe())

    # describe() on non-numeric data gives count, unique, top and freq instead
    obj = Series(['a', 'a', 'b', 'c'] * 4)
    print("obj is \n", obj)
    print("obj.describe() returns alternate summary statistics \n", obj.describe())
    import pandas_datareader as web
    from pandas import DataFrame

    # Data source documentation: https://pandas-datareader.readthedocs.io/en/latest/
    # Note: get_data_google depends on the Google Finance reader, which may no longer
    # be available in newer pandas-datareader releases; use another supported source there.
    all_data = {}
    for ticker in ['AAPL', 'IBM', 'MSFT', 'GOOG']:
        all_data[ticker] = web.get_data_google(ticker, '1/1/2016', '1/1/2017')
    print("all data is \n ", all_data)

    # One column per ticker: closing prices and traded volumes
    price = DataFrame({tic: data['Close'] for tic, data in all_data.items()})
    volume = DataFrame({tic: data['Volume'] for tic, data in all_data.items()})

    # Daily percentage change of the closing price
    returns = price.pct_change()
    print("returns.tail()\n", returns.tail())

    print("returns.MSFT.corr(returns.IBM) \n", returns.MSFT.corr(returns.IBM))
    print("returns.MSFT.cov(returns.IBM) \n", returns.MSFT.cov(returns.IBM))
    print("returns.corr() \n", returns.corr())
    print("returns.cov() \n", returns.cov())
    print("returns.corrwith(returns.IBM) \n", returns.corrwith(returns.IBM))
    print("volume is \n", volume)
    print("returns.corrwith(volume) \n", returns.corrwith(volume))
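To make clear what price.pct_change() computes, here is a minimal sketch that compares it with the same calculation written out by hand. The prices are made-up numbers used only for illustration.

```python
import pandas as pd

# Hypothetical closing prices for three consecutive days.
price = pd.DataFrame({'AAPL': [100.0, 102.0, 99.96],
                      'IBM': [150.0, 147.0, 148.47]})

# Built-in percentage change from one row to the next.
returns = price.pct_change()

# The same quantity by hand: today's price over yesterday's price, minus 1.
manual = price / price.shift(1) - 1

print(returns)
print(manual)
```

The first row is NaN in both results because there is no earlier price to compare with, and corr and cov skip such missing values when they compute their statistics.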