条件过滤
2016-01-13 21:58
232 查看
----------------基于numpy
world_alcolhol是numpy的array类型。输入:matrix; 输出:matrix
---------------基于pandas
pd.isnull() 输入:dataframe;输出:vector
输入:dataframe;输出:vector
输入:dataframe;输出:dataframe
根据列空值过滤行 dropna()
输入:dataframe;输出:dataframe
输入:series;输出:series
world_alcolhol是numpy的array类型。输入:matrix; 输出:matrix
# Boolean vector corresponding to Canada and 1986. canada_1986_boolean = (world_alcohol[:,2] == "Canada") & (world_alcohol[:,0] == "1986") # We can then use canada_1986 to subset a matrix -- it's just a normal boolean vector print(world_alcohol[canada_1986_boolean,:])
---------------基于pandas
pd.isnull() 输入:dataframe;输出:vector
#dataframe is titanic_survival, the age column has null values(NaN) which need to be excluded while calculation # age_null is a boolean vector, and has "True" where age is NaN, and "False" where it isn't age_null = pd.isnull(titanic_survival["age"]) # then use this boolean to filter age column survival_with_valid_age = titanic_survival["age"][age_null==False] # do calculation correct_sum = sum(survival_with_valid_age) correct_mean_age=correct_sum/len(survival_with_valid_age)
输入:dataframe;输出:vector
pclass_survival = titanic_survival["fare"][titanic_survival["pclass"]==2] fare_for_class = pclass_survival.mean()
输入:dataframe;输出:dataframe
selected_table=table[table['Major_category']==major]
根据列空值过滤行 dropna()
输入:dataframe;输出:dataframe
# do calculation drop rows if certain columns have missing values new_titanic_survival = titanic_survival.dropna(subset=["age", "body","home.dest"]).reset_index(drop=True)
输入:series;输出:series
#series series_film=fandango['FILM'] series_rt=fandango['RottenTomatoes'] #use film names as index series_custom = Series(series_rt.values,index= series_film.values) #filter criteria_one = series_custom > 50 criteria_two = series_custom < 75 both_criteria = series_custom[criteria_one & criteria_two]
相关文章推荐
- iOS XML,JOSN数据解析
- 【SAP UI】PBO和PAI
- 列处理——列间运算
- Fast R-CNN
- 自然语言处理(1)——文本分词
- H2O是开源基于大数据的机器学习库包
- 【SAP综合】BDT和XO的应用心得
- activate-power-mode 写代码的时候体验狂拽酷炫的效果 (IDEA版安装过程及问题)
- Git版本控制器的基本使用
- H2O是开源基于大数据的机器学习库包
- MySQL 中 EXISTS 的用法
- xwiki操作手册
- 【MySql】ERROR 1045 (28000): Access denied for user 'ambari'@'localhost' (using password: YES)
- 用R建立岭回归和lasso回归
- UITabelView图片加 label 显示
- Opensource开源精神
- Linux中RPM包管理
- 【MYSQL】修改密码
- Leetcode230: Power of Three
- 【SAP增强】增强