Python Time Series LSTM Forecasting Tutorial Series (7): Multivariate

2017-09-07 11:25

Multivariate LSTM Forecasting Model (1)

Case study: air pollution forecasting

Data format

No	year	month	day	hour	pm2.5	DEWP	TEMP	PRES	cbwd	Iws	Is	Ir
1	2010	1	1	0	NA	-21	-11	1021	NW	1.79	0	0
2	2010	1	1	1	NA	-21	-12	1020	NW	4.92	0	0
3	2010	1	1	2	NA	-21	-11	1019	NW	6.71	0	0
4	2010	1	1	3	NA	-21	-14	1019	NW	9.84	0	0
5	2010	1	1	4	NA	-20	-12	1018	NW	12.97	0	0
6	2010	1	1	5	NA	-19	-10	1017	NW	16.1	0	0
7	2010	1	1	6	NA	-19	-9	1017	NW	19.23	0	0
8	2010	1	1	7	NA	-19	-9	1017	NW	21.02	0	0
9	2010	1	1	8	NA	-19	-9	1017	NW	24.15	0	0

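No: row number

year

month

day

hour

pm2.5: PM2.5 concentration

DEWP: dew point

TEMP: temperature

PRES: pressure

cbwd: combined wind direction

Iws: cumulated wind speed

Is: cumulated hours of snow

Ir: cumulated hours of rain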
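
To see how pandas interprets these columns, the sketch below loads the first three sample rows shown above from an in-memory string (not the real file) and checks which values are missing and which column is non-numeric. It relies on pandas treating the literal string NA as a missing value by default; the variable names are illustrative only.

```python
from io import StringIO

import pandas as pd

# The first three sample rows from the table above, tab-separated.
sample = (
    "No\tyear\tmonth\tday\thour\tpm2.5\tDEWP\tTEMP\tPRES\tcbwd\tIws\tIs\tIr\n"
    "1\t2010\t1\t1\t0\tNA\t-21\t-11\t1021\tNW\t1.79\t0\t0\n"
    "2\t2010\t1\t1\t1\tNA\t-21\t-12\t1020\tNW\t4.92\t0\t0\n"
    "3\t2010\t1\t1\t2\tNA\t-21\t-11\t1019\tNW\t6.71\t0\t0\n"
)

df = pd.read_csv(StringIO(sample), sep="\t")

# "NA" is one of pandas' default missing-value markers, so pm2.5 is read as NaN.
missing_pm25 = int(df["pm2.5"].isna().sum())

# cbwd holds strings such as "NW", so it is the only non-numeric column.
non_numeric = [c for c in df.columns
               if not pd.api.types.is_numeric_dtype(df[c])]

print(missing_pm25)
print(non_numeric)
```

This is why the preparation steps have to deal with NA in pm2.5, and why the wind direction column needs separate handling when plotting.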

Data preparation

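1. Drop the "No" column: it is only a row counter and carries no information.
2. Drop the first 24 hours of data, because the PM2.5 values there are all NA.
3. Replace every NA that appears in the remaining data with 0.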

Preprocessing

# coding=utf-8

from datetime import datetime

from pandas import read_csv


def parser(x):
    # Parse the combined "year month day hour" string into a datetime
    return datetime.strptime(x, '%Y %m %d %H')


# Note: date_parser and nested parse_dates lists are deprecated in pandas 2.x
dataset = read_csv('data_set/air_pollution.csv', parse_dates=[['year', 'month', 'day', 'hour']], index_col=0, date_parser=parser)
dataset.drop('No', axis=1, inplace=True)  # axis=1 drops a column; inplace=True modifies the DataFrame in place

# Manually set the label of each column
dataset.columns = ['pollution', 'dew', 'temp', 'press', 'wnd_dir', 'wnd_spd', 'snow', 'rain']
dataset.index.name = 'date'
# Replace NA with 0
dataset['pollution'] = dataset['pollution'].fillna(0)
# Drop the first 24 rows
dataset = dataset[24:]
print(dataset.head())

# Save the processed data
dataset.to_csv('data_set/air_pollution_new.csv')
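
On recent pandas releases the `date_parser` argument and nested `parse_dates` lists are deprecated, so the script above may emit warnings or stop working. As a minimal sketch of an alternative, assuming the same column names, the index can be built with `pandas.to_datetime` from the year, month, day and hour columns. The `prepare` helper and the small in-memory frame are hypothetical and use made-up placeholder values, not real measurements.

```python
import pandas as pd


def prepare(raw):
    """Hypothetical helper: apply the preparation steps to a raw DataFrame."""
    df = raw.copy()
    # to_datetime assembles timestamps from columns named year/month/day/hour.
    df.index = pd.to_datetime(df[["year", "month", "day", "hour"]])
    df.index.name = "date"
    df = df.drop(columns=["No", "year", "month", "day", "hour"])
    df.columns = ["pollution", "dew", "temp", "press",
                  "wnd_dir", "wnd_spd", "snow", "rain"]
    # Replace missing pollution readings with 0.
    df["pollution"] = df["pollution"].fillna(0)
    # Skip the first 24 hourly rows, where pm2.5 is always NA.
    return df.iloc[24:]


# Made-up stand-in for the CSV: 26 hourly rows, pm2.5 missing in the first 24.
raw = pd.DataFrame({
    "No": range(1, 27),
    "year": 2010,
    "month": 1,
    "day": [1] * 24 + [2, 2],
    "hour": list(range(24)) + [0, 1],
    "pm2.5": [None] * 24 + [100.0, None],
    "DEWP": -20,
    "TEMP": -10.0,
    "PRES": 1020.0,
    "cbwd": "NW",
    "Iws": 1.79,
    "Is": 0,
    "Ir": 0,
})

clean = prepare(raw)
print(clean)
```

With the real file, `raw` would come from `read_csv('data_set/air_pollution.csv')` without any date-parsing arguments.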

Plotting the data

# coding=utf-8
# Plot each data series
# ------------
from pandas import read_csv
from matplotlib import pyplot

dataset = read_csv('data_set/air_pollution_new.csv', header=0, index_col=0)
values = dataset.values

# Columns to plot
groups = [i for i in range(8)]
groups.remove(4)  # skip column 4 (wnd_dir), because it holds strings

i = 1
# Plot one line chart per column
pyplot.figure()
for group in groups:
    pyplot.subplot(len(groups), 1, i)  # len(groups) rows, 1 column; draw in the i-th subplot
    pyplot.plot(values[:, group])
    pyplot.title(dataset.columns[group], y=0.5, loc='right')
    i += 1
pyplot.show()
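
The hard-coded `groups.remove(4)` above depends on `wnd_dir` being the fifth column. As a sketch of a less brittle variant, the hypothetical helper below (`numeric_positions`, not part of the original script) derives the positions of the plottable columns from their dtypes. The two demo rows are made-up placeholders with the same column layout as air_pollution_new.csv.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def numeric_positions(frame):
    """Hypothetical helper: positions of the columns that hold numbers."""
    return [i for i, col in enumerate(frame.columns)
            if is_numeric_dtype(frame[col])]


# Two made-up rows with the same column layout as air_pollution_new.csv.
demo = pd.DataFrame({
    "pollution": [100.0, 0.0], "dew": [-16, -15], "temp": [-4.0, -4.0],
    "press": [1020.0, 1020.0], "wnd_dir": ["SE", "SE"],
    "wnd_spd": [1.79, 2.68], "snow": [0, 0], "rain": [0, 0],
})

groups = numeric_positions(demo)
print(groups)  # [0, 1, 2, 3, 5, 6, 7]
```

The resulting list can be used in place of `groups` in the plotting loop, so the script keeps working if the column order changes.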

Tags: time series forecasting, Python, RNN, LSTM
