
Reading data from MongoDB in Python, converting it to a DataFrame, and processing it with pandas

2017-08-16 11:29
I have been playing with Python lately. My data lives in MongoDB, and I process it with pandas.

At first I used a rather cumbersome conversion. Here is the code:

Method 1:

import pandas as pd
from pymongo import MongoClient

# Connect to the MongoDB server and select the collection
client = MongoClient('192.168.1.5', 10070)
db = client.dbtest
collection = db.data_table
items = collection.find()

# One list per output column
dateId = []
ai_type = []
ai_name = []
quorum = []
priceUSD = []
ai_disageform = []
country = []
continent = []
company = []
ai_cap_tr = []
n = 0
for i in items:
    n = n + 1
    print("Processing record %s" % n)
    keys = i.keys()
    # For every field, append its value, or '' when the record lacks it
    if 'ai_disageform' in keys:
        ai_disageform.append(i['ai_disageform'])
    else:
        ai_disageform.append('')
    if 'date' in keys:
        # Keep only the YYYY-MM-DD part of the timestamp
        t = str(i['date'])
        dateId.append(t[:10])
    else:
        dateId.append('')
    if 'ai_type' in keys:
        ai_type.append(i['ai_type'])
    else:
        ai_type.append('')
    if 'continent' in keys:
        continent.append(i['continent'])
    else:
        continent.append('')
    if 'quorum' in keys:
        quorum.append(i['quorum'])
    else:
        quorum.append('')
    if 'priceUSD' in keys:
        priceUSD.append(i['priceUSD'])
    else:
        priceUSD.append('')
    if 'country' in keys:
        country.append(i['country'])
    else:
        country.append('')
    if 'ai_name' in keys:
        ai_name.append(i['ai_name'])
    else:
        ai_name.append('')
    if 'company' in keys:
        company.append(i['company'])
    else:
        company.append('')
    if 'ai_cap_tr' in keys:
        ai_cap_tr.append(i['ai_cap_tr'])
    else:
        ai_cap_tr.append('')

# Build the DataFrame from the column lists
df = pd.DataFrame({'dateId': dateId,
                   'ai_type': ai_type,
                   'ai_name': ai_name,
                   'quorum': quorum,
                   'priceUSD': priceUSD,
                   'ai_disageform': ai_disageform,
                   'country': country,
                   'continent': continent,
                   'ai_cap_tr': ai_cap_tr,
                   'company': company})

# Write the result without the row index
df.to_csv('../ncbdata/b.csv', encoding="utf-8", index=False)


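The idea: testing shows that each record is a dict. The value under each key is appended to its own list (with an empty string when the key is missing), and the DataFrame is then built from those lists.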
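The per-key collection in Method 1 can probably be written more compactly with dict.get(), which returns a default when a key is missing. The sketch below is only an illustration under assumptions: records_to_frame, FIELDS, and the sample records are hypothetical names and data, and a plain list of dicts stands in for the cursor returned by collection.find().

```python
import datetime

import pandas as pd

# Column names taken from Method 1; 'date' is handled separately
FIELDS = ['ai_type', 'ai_name', 'quorum', 'priceUSD', 'ai_disageform',
          'country', 'continent', 'company', 'ai_cap_tr']


def records_to_frame(records):
    """Collect each field into its own list, using '' for missing keys."""
    columns = {'dateId': []}
    columns.update({name: [] for name in FIELDS})
    for rec in records:
        # str(datetime)[:10] keeps only the YYYY-MM-DD part
        columns['dateId'].append(str(rec['date'])[:10] if 'date' in rec else '')
        for name in FIELDS:
            columns[name].append(rec.get(name, ''))
    return pd.DataFrame(columns)


# Hypothetical sample records standing in for MongoDB documents
sample = [
    {'date': datetime.datetime(2017, 8, 16, 11, 29), 'ai_type': 'A', 'priceUSD': 1.5},
    {'ai_name': 'x', 'country': 'CN'},
]
df = records_to_frame(sample)
```

Because every list receives exactly one value per record, the columns always have equal length, which is what pd.DataFrame needs when it is given a dict of lists.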

Method 2:

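import json
from io import StringIO

import pandas as pd
from pymongo import MongoClient


# Connect to MongoDB and return a cursor over every document
def connectMongdb():
    client = MongoClient('192.168.1.5', 10070)
    db = client.dbtest
    collection = db.data_table
    items = collection.find()
    return items


# Convert the documents to a DataFrame
def tran_df():
    items = connectMongdb()
    temp = []
    for doc in items:
        # The ObjectId in '_id' is not JSON serializable, so drop it
        del doc['_id']
        # A datetime is not JSON serializable either; store it as YYYY-MM-DD
        if 'date' in doc:
            doc['date'] = doc['date'].strftime("%Y-%m-%d")
        temp.append(doc)
    # Wrap the JSON text in StringIO, since recent pandas versions
    # deprecate passing a literal JSON string to read_json()
    data_employee = pd.read_json(StringIO(json.dumps(temp)))
    # Keep only the columns of interest, in this order
    data_employee_ri = data_employee.reindex(columns=['date', 'ai_type', 'ai_name'])
    data_employee_ri.to_csv('data/a.csv')


def main():
    tran_df()


if __name__ == "__main__":
    main()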


The idea: put every dict into a list, then convert that list to a DataFrame with read_json().
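As a possible variation on Method 2, pandas can build a DataFrame straight from a list of dicts, so the JSON round trip may not be needed. This is a sketch rather than the author's method: docs_to_frame and the sample documents are hypothetical, and a plain list stands in for the MongoDB cursor.

```python
import datetime

import pandas as pd


def docs_to_frame(docs, columns):
    """Build a DataFrame from dict-like documents, keeping only `columns`."""
    rows = []
    for doc in docs:
        # Copy the document without the MongoDB '_id' field
        row = {key: value for key, value in doc.items() if key != '_id'}
        if 'date' in row:
            row['date'] = row['date'].strftime('%Y-%m-%d')
        rows.append(row)
    # Keys missing from a document become NaN in the resulting frame
    return pd.DataFrame(rows).reindex(columns=columns)


# Hypothetical documents standing in for collection.find() results
sample_docs = [
    {'_id': 1, 'date': datetime.datetime(2017, 8, 16), 'ai_type': 'A',
     'ai_name': 'alpha', 'country': 'CN'},
    {'_id': 2, 'date': datetime.datetime(2017, 8, 17), 'ai_type': 'B'},
]
frame = docs_to_frame(sample_docs, ['date', 'ai_type', 'ai_name'])
```

Skipping json.dumps() also avoids serialization errors for values that JSON cannot represent, though the read_json() route in Method 2 works as long as every remaining value is JSON serializable.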
Tags: python mongodb