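Using pandas to_dict

2017-11-04 16:07
Summary: the to_dict method in pandas converts data of type DataFrame into a dictionary (or, for one option, a list of dictionaries).

Six conversion types are available, selected with the orient parameter: 'dict', 'list', 'series', 'split', 'records', and 'index'. Each one is described in turn below.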

Help on method to_dict in module pandas.core.frame:

to_dict(orient='dict') method of pandas.core.frame.DataFrame instance
Convert DataFrame to dictionary.

Parameters
----------
orient : str {'dict', 'list', 'series', 'split', 'records', 'index'}
Determines the type of the values of the dictionary.

- dict (default) : dict like {column -> {index -> value}}
- list : dict like {column -> [values]}
- series : dict like {column -> Series(values)}
- split : dict like
{index -> [index], columns -> [columns], data -> [values]}
- records : list like
[{column -> value}, ... , {column -> value}]
- index : dict like {index -> {column -> value}}

.. versionadded:: 0.17.0

Abbreviations are allowed. `s` indicates `series` and `sp`
indicates `split`.

Returns
-------
result : dict like {column -> {index -> value}}
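
As a quick orientation before the individual cases, the sketch below calls to_dict once per orient value and records the Python type of each result. The frame `df` is an assumption made for illustration: a three-row stand-in whose values are copied from rows 833, 562 and 437 of the sample data used later, not the article's full `data`.

```python
import pandas as pd

# Hypothetical three-row stand-in for the article's DataFrame.
df = pd.DataFrame(
    {
        "pclass": ["3rd", "2nd", "2nd"],
        "age": [32.0, 41.0, 48.0],
        "sex": ["male", "male", "female"],
    },
    index=[833, 562, 437],
)

orients = ["dict", "list", "series", "split", "records", "index"]

# Map each orient to the type name of the object that to_dict returns.
result_types = {o: type(df.to_dict(orient=o)).__name__ for o in orients}

for orient, type_name in result_types.items():
    print(f"{orient:8s} -> {type_name}")
```

In this sketch only 'records' produces a list; the other five orient values produce a dict.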


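1. orient='dict'

'dict' is also the default. In what follows, data is a DataFrame, and the result is a dictionary of the form {column -> {index -> value}}, which can be viewed as a two-level nested dictionary:

- The values of each column are paired with their index labels to form an inner dictionary

- The column names then serve as the keys of the outer dictionary, and the inner dictionaries above are its values

Lookup syntax: data_dict[key1][key2]

- data_dict is the name given to the result obtained with orient='dict'

- key1 is the column name (the outer key)

- key2 is the key of the inner dictionary, that is, the row's index label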

In[9]: data
Out[9]:
     pclass        age     embarked                      home.dest     sex
1086    3rd  31.194181      UNKNOWN                        UNKNOWN    male
12      1st  31.194181    Cherbourg                  Paris, France  female
1036    3rd  31.194181      UNKNOWN                        UNKNOWN    male
833     3rd  32.000000  Southampton  Foresvik, Norway Portland, ND    male
1108    3rd  31.194181      UNKNOWN                        UNKNOWN    male
562     2nd  41.000000    Cherbourg                   New York, NY    male
437     2nd  48.000000  Southampton   Somerset / Bernardsville, NJ  female
663     3rd  26.000000  Southampton                        UNKNOWN    male
669     3rd  19.000000  Southampton                        England    male
507     2nd  31.194181  Southampton               Petworth, Sussex    male
In[10]: data_dict=data.to_dict(orient= 'dict')
In[11]: data_dict
Out[11]:
{'age': {12: 31.19418104265403,
437: 48.0,
507: 31.19418104265403,
562: 41.0,
663: 26.0,
669: 19.0,
833: 32.0,
1036: 31.19418104265403,
1086: 31.19418104265403,
1108: 31.19418104265403},
'embarked': {12: 'Cherbourg',
437: 'Southampton',
507: 'Southampton',
562: 'Cherbourg',
663: 'Southampton',
669: 'Southampton',
833: 'Southampton',
1036: 'UNKNOWN',
1086: 'UNKNOWN',
1108: 'UNKNOWN'},
'home.dest': {12: 'Paris, France',
437: 'Somerset / Bernardsville, NJ',
507: 'Petworth, Sussex',
562: 'New York, NY',
663: 'UNKNOWN',
669: 'England',
833: 'Foresvik, Norway Portland, ND',
1036: 'UNKNOWN',
1086: 'UNKNOWN',
1108: 'UNKNOWN'},
'pclass': {12: '1st',
437: '2nd',
507: '2nd',
562: '2nd',
663: '3rd',
669: '3rd',
833: '3rd',
1036: '3rd',
1086: '3rd',
1108: '3rd'},
'sex': {12: 'female',
437: 'female',
507: 'male',
562: 'male',
663: 'male',
669: 'male',
833: 'male',
1036: 'male',
1086: 'male',
1108: 'male'}}
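
The two-level lookup data_dict[key1][key2] can be tried on a smaller frame. The following is a minimal sketch, assuming a three-row stand-in `df` (values copied from rows 833, 562 and 437 above) rather than the article's full `data`.

```python
import pandas as pd

# Hypothetical three-row stand-in for the article's DataFrame.
df = pd.DataFrame(
    {
        "pclass": ["3rd", "2nd", "2nd"],
        "age": [32.0, 41.0, 48.0],
        "sex": ["male", "male", "female"],
    },
    index=[833, 562, 437],
)

data_dict = df.to_dict(orient="dict")

# Outer key: column name. Inner key: index label of the row.
age_of_562 = data_dict["age"][562]
print(age_of_562)          # 41.0

# The outer keys are exactly the column names.
print(sorted(data_dict))   # ['age', 'pclass', 'sex']

# orient='dict' is the default, so the argument can be omitted.
same_as_default = data_dict == df.to_dict()
print(same_as_default)     # True
```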


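2. orient='list'

This is similar to case 1, except that the inner level is a list; the structure is {column -> [values]}

Lookup syntax: data_list[keys][index]

data_list is the name given to the result obtained with orient='list'

keys is the column name, such as 'age' or 'embarked' in this example

index is an integer position, starting at 0 and running to the last row (the DataFrame's own index labels are not kept)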

In[19]: data_list=data.to_dict(orient='list')

In[20]: data_list
Out[20]:
{'age': [31.19418104265403,
31.19418104265403,
31.19418104265403,
32.0,
31.19418104265403,
41.0,
48.0,
26.0,
19.0,
31.19418104265403],
'embarked': ['UNKNOWN',
'Cherbourg',
'UNKNOWN',
'Southampton',
'UNKNOWN',
'Cherbourg',
'Southampton',
'Southampton',
'Southampton',
'Southampton'],
'home.dest': ['UNKNOWN',
'Paris, France',
'UNKNOWN',
'Foresvik, Norway Portland, ND',
'UNKNOWN',
'New York, NY',
'Somerset / Bernardsville, NJ',
'UNKNOWN',
'England',
'Petworth, Sussex'],
'pclass': ['3rd',
'1st',
'3rd',
'3rd',
'3rd',
'2nd',
'2nd',
'3rd',
'3rd',
'2nd'],
'sex': ['male',
'female',
'male',
'male',
'male',
'male',
'female',
'male',
'male',
'male']}
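
Because the inner lists are addressed by position rather than by index label, a label has to be translated into a position first. The sketch below shows one way to do that with Index.get_loc; the three-row frame `df` is a hypothetical stand-in with values copied from rows 833, 562 and 437 of the sample data.

```python
import pandas as pd

# Hypothetical three-row stand-in for the article's DataFrame.
df = pd.DataFrame(
    {
        "pclass": ["3rd", "2nd", "2nd"],
        "age": [32.0, 41.0, 48.0],
        "sex": ["male", "male", "female"],
    },
    index=[833, 562, 437],
)

data_list = df.to_dict(orient="list")

# Each column becomes a plain list, in the row order of the frame.
print(data_list["age"])       # [32.0, 41.0, 48.0]
print(data_list["age"][1])    # 41.0

# The index labels are gone, so look up the position of a label on the
# original frame and use that position in the list.
pos = df.index.get_loc(562)
print(pos)                    # 1
print(data_list["sex"][pos])  # male
```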


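3. orient='series'

Produces the structure {column -> Series(values)}

Lookup syntax: data_series[key1][key2] or data_series[key1]

data_series is the name given to the result

key1 is the column name, such as 'age' or 'embarked' in this example

key2 is a label from the data's original index (optional)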

In[21]: data_series=data.to_dict(orient='series')
In[22]: data_series
Out[22]:
{'age': 1086    31.194181
12      31.194181
1036    31.194181
833     32.000000
1108    31.194181
562     41.000000
437     48.000000
663     26.000000
669     19.000000
507     31.194181
Name: age, dtype: float64, 'embarked': 1086        UNKNOWN
12        Cherbourg
1036        UNKNOWN
833     Southampton
1108        UNKNOWN
562       Cherbourg
437     Southampton
663     Southampton
669     Southampton
507     Southampton
Name: embarked, dtype: object, 'home.dest': 1086                          UNKNOWN
12                      Paris, France
1036                          UNKNOWN
833     Foresvik, Norway Portland, ND
1108                          UNKNOWN
562                      New York, NY
437      Somerset / Bernardsville, NJ
663                           UNKNOWN
669                           England
507                  Petworth, Sussex
Name: home.dest, dtype: object, 'pclass': 1086    3rd
12      1st
1036    3rd
833     3rd
1108    3rd
562     2nd
437     2nd
663     3rd
669     3rd
507     2nd
Name: pclass, dtype: object, 'sex': 1086      male
12      female
1036      male
833       male
1108      male
562       male
437     female
663       male
669       male
507       male
Name: sex, dtype: object}
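
Since every value in this dictionary is a pandas Series, the original index labels and the usual Series methods remain available. A minimal sketch, assuming a three-row stand-in `df` (values copied from rows 833, 562 and 437 above) and using .loc to make the label-based lookup explicit:

```python
import pandas as pd

# Hypothetical three-row stand-in for the article's DataFrame.
df = pd.DataFrame(
    {
        "pclass": ["3rd", "2nd", "2nd"],
        "age": [32.0, 41.0, 48.0],
        "sex": ["male", "male", "female"],
    },
    index=[833, 562, 437],
)

data_series = df.to_dict(orient="series")

# key1 alone returns the whole column as a Series.
ages = data_series["age"]
print(type(ages).__name__)   # Series

# key2 is a label from the original index, not a position.
print(ages.loc[562])         # 41.0

# The Series keeps the original index and supports Series methods.
print(list(ages.index))      # [833, 562, 437]
print(ages.max())            # 48.0
```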


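4. orient='split'

Produces the structure {index -> [index], columns -> [columns], data -> [values]}: the data, the index, and the column names are separated from one another and stored under their own keys

The parts are accessed as data_split['index'], data_split['data'], and data_split['columns']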

data_split=data.to_dict(orient='split')

data_split
Out[38]:
{'columns': ['pclass', 'age', 'embarked', 'home.dest', 'sex'],
'data': [['3rd', 31.19418104265403, 'UNKNOWN', 'UNKNOWN', 'male'],
['1st', 31.19418104265403, 'Cherbourg', 'Paris, France', 'female'],
['3rd', 31.19418104265403, 'UNKNOWN', 'UNKNOWN', 'male'],
['3rd', 32.0, 'Southampton', 'Foresvik, Norway Portland, ND', 'male'],
['3rd', 31.19418104265403, 'UNKNOWN', 'UNKNOWN', 'male'],
['2nd', 41.0, 'Cherbourg', 'New York, NY', 'male'],
['2nd', 48.0, 'Southampton', 'Somerset / Bernardsville, NJ', 'female'],
['3rd', 26.0, 'Southampton', 'UNKNOWN', 'male'],
['3rd', 19.0, 'Southampton', 'England', 'male'],
['2nd', 31.19418104265403, 'Southampton', 'Petworth, Sussex', 'male']],
'index': [1086, 12, 1036, 833, 1108, 562, 437, 663, 669, 507]}
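
The three keys happen to match the data, index and columns keyword arguments of the DataFrame constructor, so a 'split' dictionary can be unpacked to rebuild a frame. The sketch below illustrates this on a hypothetical three-row stand-in `df` (values copied from rows 833, 562 and 437 above); the name `restored` is likewise chosen only for this example.

```python
import pandas as pd

# Hypothetical three-row stand-in for the article's DataFrame.
df = pd.DataFrame(
    {
        "pclass": ["3rd", "2nd", "2nd"],
        "age": [32.0, 41.0, 48.0],
        "sex": ["male", "male", "female"],
    },
    index=[833, 562, 437],
)

data_split = df.to_dict(orient="split")

print(data_split["columns"])  # ['pclass', 'age', 'sex']
print(data_split["index"])    # [833, 562, 437]
print(data_split["data"][1])  # ['2nd', 41.0, 'male']

# Unpack the dictionary into the constructor to rebuild the frame.
restored = pd.DataFrame(**data_split)

# Converting the rebuilt frame again gives back the same dictionary.
round_trip_ok = restored.to_dict(orient="split") == data_split
print(round_trip_ok)          # True
```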


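5. orient='records'

Produces the structure [{column -> value}, ... , {column -> value}]

The result as a whole is a list; each element is a dictionary built from one row of the original data

Lookup syntax: data_records[index][key1], where index is the integer position of the row and key1 is the column name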

data_records=data.to_dict(orient='records')

data_records
Out[41]:
[{'age': 31.19418104265403,
'embarked': 'UNKNOWN',
'home.dest': 'UNKNOWN',
'pclass': '3rd',
'sex': 'male'},
{'age': 31.19418104265403,
'embarked': 'Cherbourg',
'home.dest': 'Paris, France',
'pclass': '1st',
'sex': 'female'},
{'age': 31.19418104265403,
'embarked': 'UNKNOWN',
'home.dest': 'UNKNOWN',
'pclass': '3rd',
'sex': 'male'},
{'age': 32.0,
'embarked': 'Southampton',
'home.dest': 'Foresvik, Norway Portland, ND',
'pclass': '3rd',
'sex': 'male'},
{'age': 31.19418104265403,
'embarked': 'UNKNOWN',
'home.dest': 'UNKNOWN',
'pclass': '3rd',
'sex': 'male'},
{'age': 41.0,
'embarked': 'Cherbourg',
'home.dest': 'New York, NY',
'pclass': '2nd',
'sex': 'male'},
{'age': 48.0,
'embarked': 'Southampton',
'home.dest': 'Somerset / Bernardsville, NJ',
'pclass': '2nd',
'sex': 'female'},
{'age': 26.0,
'embarked': 'Southampton',
'home.dest': 'UNKNOWN',
'pclass': '3rd',
'sex': 'male'},
{'age': 19.0,
'embarked': 'Southampton',
'home.dest': 'England',
'pclass': '3rd',
'sex': 'male'},
{'age': 31.19418104265403,
'embarked': 'Southampton',
'home.dest': 'Petworth, Sussex',
'pclass': '2nd',
'sex': 'male'}]
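
Note that the records above no longer contain the index labels. If they are needed, one option is to turn the index into an ordinary column with reset_index before converting. This is a sketch under assumptions: `df` is a hypothetical three-row stand-in (values copied from rows 833, 562 and 437 above), and `records_with_index` is a name introduced only for this example.

```python
import pandas as pd

# Hypothetical three-row stand-in for the article's DataFrame.
df = pd.DataFrame(
    {
        "pclass": ["3rd", "2nd", "2nd"],
        "age": [32.0, 41.0, 48.0],
        "sex": ["male", "male", "female"],
    },
    index=[833, 562, 437],
)

data_records = df.to_dict(orient="records")

# One dictionary per row, addressed by integer position.
print(len(data_records))         # 3
print(data_records[1]["age"])    # 41.0
print(sorted(data_records[1]))   # ['age', 'pclass', 'sex']

# reset_index moves the unnamed index into a column called "index",
# so each record then carries its original label.
records_with_index = df.reset_index().to_dict(orient="records")
print(records_with_index[1]["index"])  # 562
```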


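6. orient='index'

Produces the structure {index -> {column -> value}}. The lookup order is exactly the reverse of the one for 'dict' (index label first, then column name); working out the details is left to the reader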

data_index=data.to_dict(orient='index')

data_index
Out[43]:
{12: {'age': 31.19418104265403,
'embarked': 'Cherbourg',
'home.dest': 'Paris, France',
'pclass': '1st',
'sex': 'female'},
437: {'age': 48.0,
'embarked': 'Southampton',
'home.dest': 'Somerset / Bernardsville, NJ',
'pclass': '2nd',
'sex': 'female'},
507: {'age': 31.19418104265403,
'embarked': 'Southampton',
'home.dest': 'Petworth, Sussex',
'pclass': '2nd',
'sex': 'male'},
562: {'age': 41.0,
'embarked': 'Cherbourg',
'home.dest': 'New York, NY',
'pclass': '2nd',
'sex': 'male'},
663: {'age': 26.0,
'embarked': 'Southampton',
'home.dest': 'UNKNOWN',
'pclass': '3rd',
'sex': 'male'},
669: {'age': 19.0,
'embarked': 'Southampton',
'home.dest': 'England',
'pclass': '3rd',
'sex': 'male'},
833: {'age': 32.0,
'embarked': 'Southampton',
'home.dest': 'Foresvik, Norway Portland, ND',
'pclass': '3rd',
'sex': 'male'},
1036: {'age': 31.19418104265403,
'embarked': 'UNKNOWN',
'home.dest': 'UNKNOWN',
'pclass': '3rd',
'sex': 'male'},
1086: {'age': 31.19418104265403,
'embarked': 'UNKNOWN',
'home.dest': 'UNKNOWN',
'pclass': '3rd',
'sex': 'male'},
1108: {'age': 31.19418104265403,
'embarked': 'UNKNOWN',
'home.dest': 'UNKNOWN',
'pclass': '3rd',
'sex': 'male'}}
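
For readers who want to check the reversed lookup order, the sketch below compares the 'index' and 'dict' results entry by entry. It assumes a hypothetical three-row stand-in `df` with values copied from rows 833, 562 and 437 above.

```python
import pandas as pd

# Hypothetical three-row stand-in for the article's DataFrame.
df = pd.DataFrame(
    {
        "pclass": ["3rd", "2nd", "2nd"],
        "age": [32.0, 41.0, 48.0],
        "sex": ["male", "male", "female"],
    },
    index=[833, 562, 437],
)

data_index = df.to_dict(orient="index")
data_dict = df.to_dict(orient="dict")

# Outer key: index label. Inner key: column name.
print(data_index[562]["age"])   # 41.0

# Every entry matches the 'dict' result with the two keys swapped.
keys_swapped = all(
    data_index[label][column] == data_dict[column][label]
    for label in df.index
    for column in df.columns
)
print(keys_swapped)             # True
```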