
Downloading Historical Stock Data with Python

2016-07-05 04:57
Downloading and processing historical stock data: download ---> HDFS ---> Hive ---> Oracle

----import_stock_d.py

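#!/usr/bin/python
# Download the daily history of every stock code listed in stocklist.txt.

import re

import tushare as ts

stocklistpath = '/home/cloudera/data/list/stocklist.txt'
savepath = '/home/cloudera/data/data/'

# Codes look like SZ000001 or SH600000.
pattern = re.compile(r"S[ZH]\d{6}")

# Collect the codes from every line, not only from the last one.
stocklist = []
with open(stocklistpath, 'r') as openstock:
    for line in openstock:
        stocklist.extend(pattern.findall(line))

for i in stocklist:
    stocknum = i[2:8]  # drop the SZ/SH prefix
    df = ts.get_hist_data(stocknum)
    # Skip codes for which no data came back.
    if df is not None:
        df.to_csv(savepath + i + '.txt')
    print(i)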

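Problem: when the stock list is large, the script raises an error once it has been running for more than an hour. The files downloaded up to that point are not affected; the script simply exits. I therefore split the work into two separate .py scripts.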
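One way to live with a download that dies partway through is to make it resumable. The sketch below is only an illustration, not part of the original scripts: `download_all`, its `fetch`, `retries` and `delay` parameters, and the `.part` suffix are hypothetical names. It assumes `fetch` behaves like `ts.get_hist_data`, that is, it takes a 6-digit code and returns an object with a `to_csv` method, or None when nothing is available.

```python
import os
import time


def download_all(codes, fetch, savepath, retries=3, delay=1.0):
    """Save one CSV per code, skipping files that already exist.

    Returns the list of codes that could not be downloaded.
    """
    failed = []
    for code in codes:
        target = os.path.join(savepath, code + '.txt')
        if os.path.exists(target):
            continue  # finished by an earlier run
        for attempt in range(retries):
            try:
                df = fetch(code[2:8])  # drop the SZ/SH prefix
            except Exception:
                df = None  # treat any error as "no data" and retry
            if df is not None:
                # Write to a temporary name first so that an interrupted
                # write is never mistaken for a finished file.
                df.to_csv(target + '.part')
                os.rename(target + '.part', target)
                break
            time.sleep(delay)
        else:
            failed.append(code)  # every attempt came back empty
    return failed
```

With the names from import_stock_d.py, a call would look like `download_all(stocklist, ts.get_hist_data, savepath)`. Rerunning the script after a crash then only fetches the codes that are still missing.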

-----import_stock_two.py-------------

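#!/usr/bin/python
# Prefix every line of each downloaded file with its stock code.

import os

for path, d, filelist in os.walk('/home/cloudera/data/data/'):
    for filename in filelist:
        filepath = os.path.join(path, filename)
        print(filepath)
        # The first 8 characters of the file name are the code, e.g. SZ000001.
        filename1 = filename[0:8] + ','
        print(filename1)
        f = open(filepath, 'r+')
        lines = f.readlines()
        # Rewind before writing. Writing straight after readlines() starts
        # at the end of the file and appends prefixed copies of the lines
        # instead of replacing them.
        f.seek(0, 0)
        f.writelines(filename1 + line for line in lines)
        f.truncate()
        f.close()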
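Since StockRun.sh may be run more than once over the same directory, it can help if the prefixing step is safe to repeat. The following is a minimal sketch under that assumption. `prefix_lines` and `prefix_file` are hypothetical helper names, not part of the original script, and a line counts as already prefixed if it starts with the code followed by a comma.

```python
def prefix_lines(lines, code):
    """Return the lines with `code,` prepended; prefixed lines are kept as-is."""
    prefix = code + ','
    return [line if line.startswith(prefix) else prefix + line
            for line in lines]


def prefix_file(filepath, code):
    """Rewrite the file at `filepath` so that every line starts with `code,`."""
    with open(filepath, 'r') as f:
        lines = f.readlines()
    with open(filepath, 'w') as f:
        f.writelines(prefix_lines(lines, code))
```

Inside the `os.walk` loop this would be called as `prefix_file(filepath, filename[0:8])`. A second run leaves the file unchanged.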

-------------StockRun.sh--------------------------------------

#!/bin/sh

# Download step (disabled here).
#python /home/cloudera/python/import_stock_d.py
python /home/cloudera/python/import_stock_two.py

# Copy the files into HDFS, then load them into Hive.
hadoop fs -put /home/cloudera/data/data /stock
hive -e "LOAD DATA INPATH '/stock/data/*' OVERWRITE INTO TABLE import_stock_d"
# Keep only the rows whose code column is 8 characters long, e.g. SZ000001.
hive -e "insert overwrite table import_stock_d select * from import_stock_d where length(code)=8"

# Export the Hive warehouse files to Oracle.
sqoop export --table import_stock_d \
  --connect jdbc:oracle:thin:@192.168.1.10:1521:orcl \
  --username stock --password stock \
  --export-dir '/user/hive/warehouse/import_stock_d/*' \
  --input-fields-terminated-by ',' --input-lines-terminated-by '\n' \
  --columns 'code,T_DATE,OPEN,HIGH,CLOSE,LOW,VOLUME,PRICE_CHANGE,P_CHANGE,MA5,MA10,MA20,V_MA5,V_MA10,V_MA20,TURNOVER'

---------------------------------------------------------------------------
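The Hive step keeps only the rows whose code column is 8 characters long. A similar check can be run locally before uploading. The sketch below is stricter than the Hive filter and rests on assumptions that the original does not state: rows are comma-separated with one field per name in the `--columns` list, and the second field is a date in YYYY-MM-DD form, which would also reject a CSV header row. `is_exportable` is a hypothetical helper name.

```python
import re

COLUMNS = ('code,T_DATE,OPEN,HIGH,CLOSE,LOW,VOLUME,PRICE_CHANGE,P_CHANGE,'
           'MA5,MA10,MA20,V_MA5,V_MA10,V_MA20,TURNOVER').split(',')
CODE_RE = re.compile(r'^S[ZH]\d{6}$')
DATE_RE = re.compile(r'^\d{4}-\d{2}-\d{2}$')  # assumed date format


def is_exportable(line):
    """True when a line has one field per column, a valid code and a date."""
    fields = line.rstrip('\r\n').split(',')
    return (len(fields) == len(COLUMNS)
            and bool(CODE_RE.match(fields[0]))
            and bool(DATE_RE.match(fields[1])))
```

Counting the lines of a prefixed file for which `is_exportable` returns False gives a quick idea of how many rows the export is likely to reject.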