Happy Learning Starts with Translation_Multi-step Time Series Forecasting_9_Multi-Step LSTM Network_Fit LSTM Network
Fit LSTM Network
Next, we need to fit an LSTM network model to the training data.
This first requires that the training dataset be transformed from a 2D array [samples, features] into a 3D array [samples, timesteps, features]. We will fix the number of time steps at 1, so this transformation is straightforward.

Next, we need to design an LSTM network. We will use a simple structure with one hidden layer containing one LSTM unit, followed by an output layer with linear activation and 3 output values. The network will use a mean squared error loss function and the efficient ADAM optimization algorithm.

The LSTM is stateful; this means that we have to manually reset the state of the network at the end of each training epoch. The network will be fit for 1500 epochs.
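As a minimal sketch of the 2D-to-3D reshape (using a small hypothetical training array with 1 lag column and 3 forecast columns, matching this tutorial's configuration):

```python
import numpy as np

# hypothetical supervised training data: 5 samples,
# 1 lag observation followed by 3 forecast values per row
train = np.arange(20.0).reshape(5, 4)

n_lag = 1
X, y = train[:, 0:n_lag], train[:, n_lag:]

# 2D [samples, features] -> 3D [samples, timesteps, features]
X3d = X.reshape(X.shape[0], 1, X.shape[1])

print(X.shape)    # (5, 1)
print(X3d.shape)  # (5, 1, 1)
print(y.shape)    # (5, 3)
```

With a single time step, the reshape just inserts a middle axis of length 1; the values themselves are untouched.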
The same batch size must be used for training and prediction, and we require predictions to be made at each time step of the test dataset. This means that a batch size of 1 must be used. A batch size of 1 is also called online learning, as the network weights will be updated during training after each training pattern (as opposed to mini-batch or batch updates).
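To see why a batch size of 1 amounts to online learning, here is a toy gradient-descent sketch in plain NumPy (a hypothetical one-parameter linear model, not part of the tutorial's code): with batch size 1 the weight changes after every training pattern, while a full batch yields a single averaged update per pass.

```python
import numpy as np

# toy data: y = 2 * x
X = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * X
lr = 0.01

# online learning (batch size 1): update after each pattern
w_online = 0.0
for xi, yi in zip(X, y):
    grad = 2.0 * (w_online * xi - yi) * xi  # d/dw of squared error
    w_online -= lr * grad                   # weight moves 4 times per pass

# full batch: gradients averaged, one update per pass
w_batch = 0.0
grad = np.mean(2.0 * (w_batch * X - y) * X)
w_batch -= lr * grad                        # weight moves once per pass

print(w_online, w_batch)
```

Both weights move toward the true value of 2.0, but the online version has taken four small steps in the time the batch version took one averaged step.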
We can put all of this together in a function called fit_lstm(). The function takes a number of key parameters that can be used to tune the network later, and it returns a fit LSTM model ready for forecasting.
```python
# fit an LSTM network to training data
def fit_lstm(train, n_lag, n_seq, n_batch, nb_epoch, n_neurons):
	# reshape training into [samples, timesteps, features]
	X, y = train[:, 0:n_lag], train[:, n_lag:]
	X = X.reshape(X.shape[0], 1, X.shape[1])
	# design network
	model = Sequential()
	model.add(LSTM(n_neurons, batch_input_shape=(n_batch, X.shape[1], X.shape[2]), stateful=True))
	model.add(Dense(y.shape[1]))
	model.compile(loss='mean_squared_error', optimizer='adam')
	# fit network
	for i in range(nb_epoch):
		model.fit(X, y, epochs=1, batch_size=n_batch, verbose=0, shuffle=False)
		model.reset_states()
	return model
```
The function can be called as follows:
```python
# fit model
model = fit_lstm(train, 1, 3, 1, 1500, 1)
```
The configuration of the network was not tuned; try different parameters if you like.
Report your findings in the comments below. I’d love to see what you can get.
Machine Learning: A Complete Analysis of Keras Stateful LSTMs, with Example Tests
The stateful LSTM in Keras is arguably a nightmare for every learner: confusing mechanics, documentation that explains too little, and a scarcity of Chinese-language material. Note that "state" here refers to the c and h of the original paper's equations, i.e. the memory parameters specific to the LSTM, not the weights w.

In stateless mode, "long short-term memory" does not mean your LSTM will remember the content of previous batches. In the default stateless mode, Keras re-initializes the LSTM's memory state (c and h, not the weights w) at the start of each small sequence (= sample) during training, which is equivalent to calling model.reset_states().

Why does a stateless LSTM reset the memory parameters every time? Because Keras shuffles samples by default during training, any dependency between sequences disappears: there is no temporal relationship from one sample to the next once their order is scrambled, so passing memory parameters between batches and sequences would be meaningless. Keras therefore re-initializes them.

Whether stateful or stateless, the model receives a batch, computes the output for each sequence, averages their gradients, and backpropagates to update all of the parameters.

Source: https://www.toutiao.com/a6532553094650135044/
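The "state" being reset is just the (c, h) pair an LSTM cell carries from one time step to the next. A minimal NumPy sketch of a single LSTM cell (hypothetical random weights, not Keras internals) makes this concrete: a stateful layer keeps passing (h, c) forward between calls, and reset_states() simply zeroes them.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_hidden = 1, 4

# one weight matrix per gate: input i, forget f, candidate g, output o
W = {g: rng.normal(size=(n_hidden, n_in + n_hidden)) for g in "ifgo"}

def lstm_step(x, h, c):
    z = np.concatenate([x, h])
    i = sigmoid(W["i"] @ z)   # input gate
    f = sigmoid(W["f"] @ z)   # forget gate
    g = np.tanh(W["g"] @ z)   # candidate cell values
    o = sigmoid(W["o"] @ z)   # output gate
    c = f * c + i * g         # new cell state
    h = o * np.tanh(c)        # new hidden state
    return h, c

# stateful behavior: (h, c) carry over between time steps until reset
h = np.zeros(n_hidden)
c = np.zeros(n_hidden)
for x in ([0.5], [1.0], [-0.3]):
    h, c = lstm_step(np.array(x), h, c)

# "reset_states()" amounts to zeroing the memory state again
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
```

Note that only (h, c) are reset; the gate weights W, which gradient descent updates, are left untouched, which is exactly the c/h-versus-w distinction made above.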