
Pandas Study Notes

2016-11-05 09:21

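I am new to Python, but with some prior programming experience, getting started has been relatively easy. Part of my day-to-day work is data analysis on fairly large datasets. Excel can handle most of that data manually, but the work involves many repetitive steps, so I use VBA to help. Even so, Excel plus VBA still has limitations when the data volume is large. I hope that learning Python will give me a different way of thinking about data.

I am currently taking Introduction to Data Science in Python on Coursera, which mainly covers what the various modules can do.

Since I had not worked with Python in depth before, many features are still unfamiliar to me, so I am organizing my study notes here, starting from the simplest things.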

import pandas

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

pandas data structures:

pd.Series([list value]) # 1D labeled

pd.DataFrame({numpy array or dict of objects}) # 2D labeled

pd.Panel # 3D labeled (deprecated in pandas 0.20 and removed in 0.25)

Creating pandas objects (quoted from the pandas documentation)

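# Note: despite its name, df1 is a Series, not a DataFrame
df1 = pd.Series([1, 2, 3, np.nan, 6, 9])
# Six consecutive days starting 2013-01-01, used as the row index
dates = pd.date_range('20130101', periods=6)
df2 = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
# Scalars such as 1. and 'foo' are broadcast to every row
df3 = pd.DataFrame({'A': 1.,
                    'B': pd.Timestamp('20130102'),
                    'C': pd.Series(1, index=list(range(4)), dtype='float32'),
                    'D': np.array([3] * 4, dtype='int32'),
                    'E': pd.Categorical(["test", "train", "test", "train"]),
                    'F': 'foo'})
# Data type of each column
df2.dtypes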

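df.loc[] # select by label (called on a DataFrame or Series, not on the pd module)

df.iloc[] # select by integer position

df[] # select a column by name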
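The three selection styles above can be sketched as follows. This is a minimal illustration on a made-up DataFrame; the row labels `a`, `b`, `c` and the column names `x`, `y` are assumptions for the example, not data from the course.

```python
import pandas as pd

# Hypothetical example data; labels and column names are made up.
df = pd.DataFrame({'x': [10, 20, 30], 'y': [1.5, 2.5, 3.5]},
                  index=['a', 'b', 'c'])

by_label = df.loc['b', 'x']       # row label 'b', column label 'x'
by_position = df.iloc[1, 0]       # second row, first column (0-based)
column = df['y']                  # a whole column, returned as a Series
label_slice = df.loc['a':'b']     # label slices include the end label
position_slice = df.iloc[0:2]     # position slices exclude the end position
```

Both slices return the same two rows here, which is a common source of confusion: `.loc` slicing is inclusive of the end label, while `.iloc` follows the usual Python half-open convention.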

pd.merge(staff_df, student_df, how='outer', left_index=True, right_index=True)

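how='outer'

# union of keys: every row from both datasets; values missing on one side become NaN

how='inner'

# intersection of keys: only rows whose index appears in both datasets

how='left'

# every row of the left dataset staff_df, plus the matching data from the right dataset student_df

how='right'

# every row of the right dataset student_df, plus the matching data from the left dataset staff_df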
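To see how the `how` argument changes which rows survive a merge, here is a small sketch. The contents of `staff_df` and `student_df` are made up for illustration (the notes do not show the actual data), and `merged_names` is a hypothetical helper, not a pandas function.

```python
import pandas as pd

# Hypothetical data; the names, roles and schools are made up for illustration.
staff_df = pd.DataFrame({'Role': ['Director', 'Grader']},
                        index=pd.Index(['Kelly', 'James'], name='Name'))
student_df = pd.DataFrame({'School': ['Business', 'Law']},
                          index=pd.Index(['James', 'Mike'], name='Name'))

def merged_names(how):
    """Merge on the index and return the sorted row labels of the result."""
    merged = pd.merge(staff_df, student_df, how=how,
                      left_index=True, right_index=True)
    return sorted(merged.index)

outer_names = merged_names('outer')   # everyone in either table
inner_names = merged_names('inner')   # only people in both tables
left_names = merged_names('left')     # everyone in staff_df
right_names = merged_names('right')   # everyone in student_df
```

In this sketch only James appears in both tables, so he is the sole row of the inner merge, while the outer merge keeps Kelly and Mike as well and fills their missing columns with NaN.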

To add a new column, first create it, then assign a value for each row. For example:

df['newColumn'] = None

df['newColumn'] = [1,2,3,4]
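A minimal sketch of the column assignment above, assuming a made-up four-row DataFrame (the `name` and `tooShort` columns are illustrative only). It also shows that the list must contain exactly one value per row.

```python
import pandas as pd

# Hypothetical four-row DataFrame; the column names are made up.
df = pd.DataFrame({'name': ['a', 'b', 'c', 'd']})

df['newColumn'] = None            # placeholder: every row gets None
df['newColumn'] = [1, 2, 3, 4]    # one value per row, in row order

# A list of the wrong length is rejected rather than silently padded.
try:
    df['tooShort'] = [1, 2]
    length_error = False
except ValueError:
    length_error = True
```

Strictly speaking, the placeholder step is optional: assigning the list directly creates the column if it does not exist yet.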

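Resetting the index

df.reset_index() # moves the current index back into a regular column; returns a new DataFrame

df.set_index('Name') # uses the 'Name' column as the index; returns a new DataFrame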
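The two index operations above can be sketched on a small made-up DataFrame. The `Score` column and the sample names are assumptions for illustration; only `Name` comes from the notes.

```python
import pandas as pd

# Hypothetical data; 'Score' and the sample names are made up.
df = pd.DataFrame({'Name': ['Ann', 'Bob'], 'Score': [90, 85]})

indexed = df.set_index('Name')     # 'Name' becomes the index of a new DataFrame
restored = indexed.reset_index()   # the index goes back to being a regular column

# The original is untouched because neither call works in place by default.
original_columns = list(df.columns)
```

Because both methods return a new object, the result has to be assigned to a name (or back to `df`) for the change to stick.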
