[Learning Pandas with Stack Overflow] - Dropping Rows That Contain NaN
2017-08-16 19:19
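I have recently been writing a blog series, Learning Pandas with Stack Overflow.
Series home: http://blog.csdn.net/column/details/16726.html
The topics come from searching Stack Overflow for the pandas tag and sorting the results by votes: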
https://stackoverflow.com/questions/tagged/pandas?sort=votes&pageSize=15
How to drop rows of Pandas DataFrame whose value in certain columns is NaN
Data preparation
We generate a random DataFrame with 10 rows and 3 columns, then set some of its entries to NaN.

    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.randn(10, 3), columns=['col1', 'col2', 'col3'])
    df.iloc[::2, 0] = np.nan  # every 2nd row of col1
    df.iloc[::4, 1] = np.nan  # every 4th row of col2
    df.iloc[::3, 2] = np.nan  # every 3rd row of col3
    print(df)
    #        col1      col2      col3
    # 0       NaN       NaN       NaN
    # 1 -0.498336 -0.960804  0.705309
    # 2       NaN -2.120032  2.123329
    # 3  0.791883 -0.283840       NaN
    # 4       NaN       NaN -1.241788
    # 5 -0.399644 -0.968515 -1.509056
    # 6       NaN  0.897637       NaN
    # 7  1.826128  1.015091 -0.497022
    # 8       NaN       NaN -1.889871
    # 9  0.379287 -1.762229       NaN
pandas.notnull
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.notnull.html
pandas.notnull accepts either a Series or a DataFrame and returns a boolean result of the same shape.
pandas.notnull is intended as a replacement for np.isfinite / numpy.isnan.

    pd.notnull(df['col1'])
    # 0    False
    # 1     True
    # 2    False
    # 3     True
    # 4    False
    # 5     True
    # 6    False
    # 7     True
    # 8    False
    # 9     True
    # Name: col1, dtype: bool

    print(pd.notnull(df))
    #     col1   col2   col3
    # 0  False  False  False
    # 1   True   True   True
    # 2  False   True   True
    # 3   True   True  False
    # 4  False  False   True
    # 5   True   True   True
    # 6  False   True  False
    # 7   True   True   True
    # 8  False  False   True
    # 9   True   True  False
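The boolean mask returned by pandas.notnull can be used directly to keep only the rows whose value in a given column is present, which is what the original question asks for. The following is a minimal sketch; the small frame and its values are made up for illustration so that the result is reproducible. It also shows one difference worth knowing: with default settings, pandas.notnull treats infinity as a valid value, while np.isfinite does not.

```python
import numpy as np
import pandas as pd

# Small fixed frame (hypothetical values) instead of random data.
df = pd.DataFrame({
    "col1": [np.nan, 1.0, np.inf, 4.0],
    "col2": [0.5, np.nan, 2.5, 3.5],
})

# Keep only the rows where col1 is not missing.
kept = df[pd.notnull(df["col1"])]
print(kept.index.tolist())  # [1, 2, 3]

# notnull counts infinity as present; np.isfinite rejects it.
print(pd.notnull(df["col1"]).tolist())   # [False, True, True, True]
print(np.isfinite(df["col1"]).tolist())  # [False, True, False, True]
```

If infinite values should be dropped as well, np.isfinite is the stricter of the two masks.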
np.isfinite / numpy.isnan
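np.isfinite tests each value and returns True where the value is finite, that is, neither NaN nor infinite. By combining the boolean masks of different columns we can select exactly the rows we want. numpy.isnan tests whether a value is NaN.

    np.isfinite(df['col1'])
    # 0    False
    # 1     True
    # 2    False
    # 3     True
    # 4    False
    # 5     True
    # 6    False
    # 7     True
    # 8    False
    # 9     True
    # Name: col1, dtype: bool

    # Keep only the rows where col1 is finite
    df1 = df[np.isfinite(df['col1'])]
    print(df1)
    #        col1      col2      col3
    # 1 -0.498336 -0.960804  0.705309
    # 3  0.791883 -0.283840       NaN
    # 5 -0.399644 -0.968515 -1.509056
    # 7  1.826128  1.015091 -0.497022
    # 9  0.379287 -1.762229       NaN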
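Combining the masks of several columns, as mentioned above, can be sketched with the element-wise operators `&` (and) and `|` (or). The frame below uses made-up values chosen for illustration, so the selected row labels are easy to verify by eye.

```python
import numpy as np
import pandas as pd

# Hypothetical fixed data; NaN placement chosen by hand.
df = pd.DataFrame({
    "col1": [np.nan, 1.0, 2.0, np.nan, 4.0],
    "col2": [0.1, np.nan, 2.2, np.nan, 4.4],
    "col3": [np.nan, np.nan, np.nan, 3.0, np.nan],
})

# Rows where col1 AND col2 are both finite.
both = df[np.isfinite(df["col1"]) & np.isfinite(df["col2"])]
print(both.index.tolist())  # [2, 4]

# Rows where col1 OR col2 is finite.
either = df[np.isfinite(df["col1"]) | np.isfinite(df["col2"])]
print(either.index.tolist())  # [0, 1, 2, 4]

# Rows where col1 is NaN, selected with np.isnan.
missing = df[np.isnan(df["col1"])]
print(missing.index.tolist())  # [0, 3]
```

Each comparison needs its own parentheses or its own function call, because `&` and `|` bind more tightly than comparison operators in Python.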
dropna
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.dropna.html
dropna accepts several parameters:
axis : {0 or ‘index’, 1 or ‘columns’}, or tuple/list thereof
    Pass tuple or list to drop on multiple axes (newer pandas releases no longer accept a tuple or list here)
how : {‘any’, ‘all’}
    any : if any NA values are present, drop that label
    all : if all values are NA, drop that label
thresh : int, default None
    int value : require that many non-NA values
subset : array-like
    Labels along other axis to consider, e.g. if you are dropping rows these would be a list of columns to include
inplace : boolean, default False
    If True, do operation inplace and return None.

    # By default, drop every row that contains at least one NaN
    print(df.dropna())
    #        col1      col2      col3
    # 1  1.944899 -1.792510 -0.612904
    # 5 -0.609380  1.087689 -1.145582
    # 7 -2.045037  1.043837  0.429135

    # Drop only the rows in which every value is NaN
    print(df.dropna(how='all'))
    #        col1      col2      col3
    # 1  1.944899 -1.792510 -0.612904
    # 2       NaN  0.780487 -1.239197
    # 3 -1.050320 -0.121033       NaN
    # 4       NaN       NaN -0.537213
    # 5 -0.609380  1.087689 -1.145582
    # 6       NaN -0.721761       NaN
    # 7 -2.045037  1.043837  0.429135
    # 8       NaN       NaN -0.096989
    # 9  1.514520  0.224193       NaN
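The remaining parameters, subset, thresh and inplace, are not exercised above. The sketch below shows one way they might be used; the frame and its values are hypothetical and fixed by hand so the kept row labels can be checked. subset is the parameter that answers the original question most directly, since it restricts the NaN check to certain columns.

```python
import numpy as np
import pandas as pd

# Hypothetical fixed data for illustration.
df = pd.DataFrame({
    "col1": [np.nan, 1.0, np.nan, 3.0, np.nan],
    "col2": [np.nan, 1.1, 2.2, np.nan, 4.1],
    "col3": [np.nan, 1.2, np.nan, 3.4, 4.2],
})

# Drop a row only when col1 is NaN; other columns are ignored.
by_col1 = df.dropna(subset=["col1"])
print(by_col1.index.tolist())  # [1, 3]

# Drop a row only when col1 and col2 are both NaN.
both_missing = df.dropna(subset=["col1", "col2"], how="all")
print(both_missing.index.tolist())  # [1, 2, 3, 4]

# Keep rows that have at least 2 non-NA values.
at_least_two = df.dropna(thresh=2)
print(at_least_two.index.tolist())  # [1, 3, 4]

# inplace=True modifies the frame itself and returns None.
copy = df.copy()
result = copy.dropna(inplace=True)
print(result, copy.index.tolist())  # None [1]
```

Without inplace=True, dropna returns a new DataFrame and leaves the original untouched, so the result has to be assigned to a name.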
For more details, see the official dropna documentation.