Getting Started with pandas: Merging Data with the merge Function
2017-08-15 19:56
Merging data with the merge function
Creating the dataset

# Import pandas and numpy
import pandas as pd
import numpy as np

# Create two DataFrames whose columns (e, f) and index labels (k3 to k5) partly overlap
df_left = pd.DataFrame(data=np.ones((5, 6)), columns=["a", "b", "c", "d", "e", "f"], index=["k1", "k2", "k3", "k4", "k5"])
df_right = pd.DataFrame(data=np.ones((5, 6)) * 2, columns=["e", "f", "g", "h", "j", "k"], index=["k3", "k4", "k5", "k6", "k7"])

# Add the two key columns used for merging
df_left["key1"] = ["k1", "k0", "k0", "k1", "k1"]
df_left["key2"] = ["k0", "k0", "k1", "k1", "k0"]
df_right["key1"] = ["k1", "k0", "k0", "k0", "k1"]
df_right["key2"] = ["k0", "k1", "k1", "k1", "k0"]

print(df_right)
print(df_left)
      e    f    g    h    j    k  key1  key2
k3  2.0  2.0  2.0  2.0  2.0  2.0    k1    k0
k4  2.0  2.0  2.0  2.0  2.0  2.0    k0    k1
k5  2.0  2.0  2.0  2.0  2.0  2.0    k0    k1
k6  2.0  2.0  2.0  2.0  2.0  2.0    k0    k1
k7  2.0  2.0  2.0  2.0  2.0  2.0    k1    k0

      a    b    c    d    e    f  key1  key2
k1  1.0  1.0  1.0  1.0  1.0  1.0    k1    k0
k2  1.0  1.0  1.0  1.0  1.0  1.0    k0    k0
k3  1.0  1.0  1.0  1.0  1.0  1.0    k0    k1
k4  1.0  1.0  1.0  1.0  1.0  1.0    k1    k1
k5  1.0  1.0  1.0  1.0  1.0  1.0    k1    k0
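Inner join (the default for merge)
merge performs an inner join unless told otherwise, so how="inner" is spelled out below only for clarity. Columns that exist in both frames but are not merge keys (e and f) receive the default suffixes _x and _y.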
print(pd.merge(left=df_left,right=df_right,on=["key1","key2"],how="inner"))
     a    b    c    d  e_x  f_x  key1  key2  e_y  f_y    g    h    j    k
0  1.0  1.0  1.0  1.0  1.0  1.0    k1    k0  2.0  2.0  2.0  2.0  2.0  2.0
1  1.0  1.0  1.0  1.0  1.0  1.0    k1    k0  2.0  2.0  2.0  2.0  2.0  2.0
2  1.0  1.0  1.0  1.0  1.0  1.0    k1    k0  2.0  2.0  2.0  2.0  2.0  2.0
3  1.0  1.0  1.0  1.0  1.0  1.0    k1    k0  2.0  2.0  2.0  2.0  2.0  2.0
4  1.0  1.0  1.0  1.0  1.0  1.0    k0    k1  2.0  2.0  2.0  2.0  2.0  2.0
5  1.0  1.0  1.0  1.0  1.0  1.0    k0    k1  2.0  2.0  2.0  2.0  2.0  2.0
6  1.0  1.0  1.0  1.0  1.0  1.0    k0    k1  2.0  2.0  2.0  2.0  2.0  2.0
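
The inner join returns 7 rows even though each input has only 5, because the key pairs are not unique: every left row is paired with every right row that shares its (key1, key2) value. The sketch below rebuilds the same two frames and checks that count; the names keys, left_counts, right_counts and expected_rows are only illustrative.

```python
import numpy as np
import pandas as pd

keys = ["key1", "key2"]

# Same frames as in the post
df_left = pd.DataFrame(np.ones((5, 6)), columns=["a", "b", "c", "d", "e", "f"],
                       index=["k1", "k2", "k3", "k4", "k5"])
df_right = pd.DataFrame(np.ones((5, 6)) * 2, columns=["e", "f", "g", "h", "j", "k"],
                        index=["k3", "k4", "k5", "k6", "k7"])
df_left["key1"] = ["k1", "k0", "k0", "k1", "k1"]
df_left["key2"] = ["k0", "k0", "k1", "k1", "k0"]
df_right["key1"] = ["k1", "k0", "k0", "k0", "k1"]
df_right["key2"] = ["k0", "k1", "k1", "k1", "k0"]

# Number of rows carrying each (key1, key2) pair on each side
left_counts = df_left.groupby(keys).size()
right_counts = df_right.groupby(keys).size()

# A pair present on both sides contributes left_count * right_count rows;
# a pair missing on one side contributes 0
expected_rows = int(left_counts.mul(right_counts, fill_value=0).sum())

# how is omitted in the first call, so the default ("inner") applies
inner_default = pd.merge(df_left, df_right, on=keys)
inner_explicit = pd.merge(df_left, df_right, on=keys, how="inner")

print(expected_rows, len(inner_default), len(inner_explicit))  # prints: 7 7 7
```

Here (k1, k0) appears twice on each side (2 x 2 = 4 rows) and (k0, k1) appears once on the left and three times on the right (1 x 3 = 3 rows), which accounts for the 7 rows.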
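Outer join, with indicator=True to show where each row came from
With indicator=True, merge adds a _merge column whose value is both, left_only or right_only.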
pd.merge(left=df_left,right=df_right,on=["key1","key2"],how="outer",indicator=True)
     a    b    c    d  e_x  f_x  key1  key2  e_y  f_y    g    h    j    k     _merge
0  1.0  1.0  1.0  1.0  1.0  1.0    k1    k0  2.0  2.0  2.0  2.0  2.0  2.0       both
1  1.0  1.0  1.0  1.0  1.0  1.0    k1    k0  2.0  2.0  2.0  2.0  2.0  2.0       both
2  1.0  1.0  1.0  1.0  1.0  1.0    k1    k0  2.0  2.0  2.0  2.0  2.0  2.0       both
3  1.0  1.0  1.0  1.0  1.0  1.0    k1    k0  2.0  2.0  2.0  2.0  2.0  2.0       both
4  1.0  1.0  1.0  1.0  1.0  1.0    k0    k0  NaN  NaN  NaN  NaN  NaN  NaN  left_only
5  1.0  1.0  1.0  1.0  1.0  1.0    k0    k1  2.0  2.0  2.0  2.0  2.0  2.0       both
6  1.0  1.0  1.0  1.0  1.0  1.0    k0    k1  2.0  2.0  2.0  2.0  2.0  2.0       both
7  1.0  1.0  1.0  1.0  1.0  1.0    k0    k1  2.0  2.0  2.0  2.0  2.0  2.0       both
8  1.0  1.0  1.0  1.0  1.0  1.0    k1    k1  NaN  NaN  NaN  NaN  NaN  NaN  left_only
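
One common use of the _merge column is to pick out the rows that found no partner, which amounts to an anti-join. The following is a minimal sketch on the same data; the variable names outer, n_both and left_only are illustrative and not taken from the post.

```python
import numpy as np
import pandas as pd

keys = ["key1", "key2"]

# Same frames as in the post
df_left = pd.DataFrame(np.ones((5, 6)), columns=["a", "b", "c", "d", "e", "f"],
                       index=["k1", "k2", "k3", "k4", "k5"])
df_right = pd.DataFrame(np.ones((5, 6)) * 2, columns=["e", "f", "g", "h", "j", "k"],
                        index=["k3", "k4", "k5", "k6", "k7"])
df_left["key1"] = ["k1", "k0", "k0", "k1", "k1"]
df_left["key2"] = ["k0", "k0", "k1", "k1", "k0"]
df_right["key1"] = ["k1", "k0", "k0", "k0", "k1"]
df_right["key2"] = ["k0", "k1", "k1", "k1", "k0"]

outer = pd.merge(df_left, df_right, on=keys, how="outer", indicator=True)

# Count the rows that matched on both sides
n_both = int((outer["_merge"] == "both").sum())

# Keep only the rows whose key pair exists in df_left but not in df_right
left_only = outer[outer["_merge"] == "left_only"]

print(n_both)            # 7 matched rows
print(left_only[keys])   # the (k0, k0) and (k1, k1) key pairs
```

Every key pair in df_right also occurs in df_left, so no row is tagged right_only for this data.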
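Left join, merging on the index
Note: the call below passes on together with left_index=True and right_index=True. Recent pandas releases reject that combination with a MergeError, so the call and its output reflect the behavior of the older pandas release in use when this post was written (2017).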
pd.merge(left=df_left,right=df_right,on=["key1","key2"],how="left",left_index=True,right_index=True,indicator=True)
      a    b    c    d  e_x  f_x  key1  key2  e_y  f_y    g    h    j    k     _merge
k1  1.0  1.0  1.0  1.0  1.0  1.0    k1    k0  NaN  NaN  NaN  NaN  NaN  NaN  left_only
k2  1.0  1.0  1.0  1.0  1.0  1.0    k0    k0  NaN  NaN  NaN  NaN  NaN  NaN  left_only
k3  1.0  1.0  1.0  1.0  1.0  1.0    k0    k1  2.0  2.0  2.0  2.0  2.0  2.0       both
k4  1.0  1.0  1.0  1.0  1.0  1.0    k1    k1  2.0  2.0  2.0  2.0  2.0  2.0       both
k5  1.0  1.0  1.0  1.0  1.0  1.0    k1    k0  2.0  2.0  2.0  2.0  2.0  2.0       both
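
For readers on a current pandas release, a left join on the index can be sketched by leaving out the on argument. This is an adaptation, not the call from the post: without on, key1 and key2 are treated as ordinary overlapping columns, so they come back as key1_x, key1_y, key2_x and key2_y instead of a single pair of key columns.

```python
import numpy as np
import pandas as pd

# Same frames as in the post
df_left = pd.DataFrame(np.ones((5, 6)), columns=["a", "b", "c", "d", "e", "f"],
                       index=["k1", "k2", "k3", "k4", "k5"])
df_right = pd.DataFrame(np.ones((5, 6)) * 2, columns=["e", "f", "g", "h", "j", "k"],
                        index=["k3", "k4", "k5", "k6", "k7"])
df_left["key1"] = ["k1", "k0", "k0", "k1", "k1"]
df_left["key2"] = ["k0", "k0", "k1", "k1", "k0"]
df_right["key1"] = ["k1", "k0", "k0", "k0", "k1"]
df_right["key2"] = ["k0", "k1", "k1", "k1", "k0"]

# Left join on the row labels only: no `on`, both index flags set
by_index = pd.merge(df_left, df_right, how="left",
                    left_index=True, right_index=True, indicator=True)

# k1 and k2 exist only in df_left; k3, k4, k5 exist in both frames
print(by_index[["key1_x", "key1_y", "_merge"]])

# DataFrame.join also joins on the index; the suffixes must be given
# explicitly here because the two frames share column names
joined = df_left.join(df_right, how="left", lsuffix="_x", rsuffix="_y")
print(list(joined.columns))
```

The row labels and the _merge values agree with the output shown in the post; only the handling of the key columns differs.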
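Right join on the index, with custom suffixes for overlapping columns
Like the left-join call, this one passes on together with left_index and right_index, a combination that recent pandas releases reject with a MergeError; the output comes from the older release used for the post. The suffixes argument replaces the default _x and _y.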
pd.merge(left=df_left,right=df_right,on=["key1","key2"],how="right",left_index=True,right_index=True,indicator=True,suffixes=("_left","_right"))
      a    b    c    d  e_left  f_left  key1  key2  e_right  f_right    g    h    j    k      _merge
k3  1.0  1.0  1.0  1.0     1.0     1.0    k0    k1      2.0      2.0  2.0  2.0  2.0  2.0        both
k4  1.0  1.0  1.0  1.0     1.0     1.0    k1    k1      2.0      2.0  2.0  2.0  2.0  2.0        both
k5  1.0  1.0  1.0  1.0     1.0     1.0    k1    k0      2.0      2.0  2.0  2.0  2.0  2.0        both
k6  NaN  NaN  NaN  NaN     NaN     NaN   NaN   NaN      2.0      2.0  2.0  2.0  2.0  2.0  right_only
k7  NaN  NaN  NaN  NaN     NaN     NaN   NaN   NaN      2.0      2.0  2.0  2.0  2.0  2.0  right_only
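
To see the effect of suffixes on its own, the sketch below uses a key-based right join, which should run on old and new pandas releases alike. It is a variation on the post's call, not a reproduction of it; the names merged and renamed are illustrative.

```python
import numpy as np
import pandas as pd

keys = ["key1", "key2"]

# Same frames as in the post
df_left = pd.DataFrame(np.ones((5, 6)), columns=["a", "b", "c", "d", "e", "f"],
                       index=["k1", "k2", "k3", "k4", "k5"])
df_right = pd.DataFrame(np.ones((5, 6)) * 2, columns=["e", "f", "g", "h", "j", "k"],
                        index=["k3", "k4", "k5", "k6", "k7"])
df_left["key1"] = ["k1", "k0", "k0", "k1", "k1"]
df_left["key2"] = ["k0", "k0", "k1", "k1", "k0"]
df_right["key1"] = ["k1", "k0", "k0", "k0", "k1"]
df_right["key2"] = ["k0", "k1", "k1", "k1", "k0"]

merged = pd.merge(df_left, df_right, on=keys, how="right",
                  suffixes=("_left", "_right"))

# Only the columns present in both frames (e and f) are renamed;
# the merge keys and the columns unique to one frame keep their names
renamed = [c for c in merged.columns if c.endswith(("_left", "_right"))]
print(renamed)  # ['e_left', 'f_left', 'e_right', 'f_right']
```

Descriptive suffixes such as these make it easier to tell which frame a value came from than the default _x and _y.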