Notes on 【Recurrent Neural Network Regularization】 (draft)
2017-09-11 21:40
Introduction
Regularization is what makes neural networks broadly applicable. For feed-forward networks, dropout is the most effective regularization method, but it does not work well on RNNs: the recurrence amplifies the noise, which keeps disrupting learning. As a result, RNNs in practice are kept small, while large RNNs tend to overfit.
The paper's main contribution is to apply dropout only to the non-recurrent connections, as shown in the figure below: dropout is applied on the dashed arrows and not on the solid (recurrent) arrows.
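To make this placement concrete, here is a minimal sketch (my own, not the paper's code): a stacked RNN forward pass in which dropout sits only on the vertical, between-layer hops. The plain tanh cell, the layer count, the dropout rate, and all names are assumptions for illustration; the paper uses LSTM cells.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, p=0.5):
    """Inverted dropout: zero each element with probability p, rescale the rest."""
    return x * (rng.random(x.shape) >= p) / (1.0 - p)

def forward(xs, Ws_in, Ws_rec, L, n):
    """Stacked tanh-RNN forward pass with dropout ONLY on the
    vertical (non-recurrent, dashed-arrow) connections."""
    h = [np.zeros(n) for _ in range(L + 1)]  # h[l] is layer l's current state
    outputs = []
    for x in xs:
        h[0] = x
        for l in range(1, L + 1):
            # dashed arrow: drop the input coming up from the layer below
            bottom = dropout(h[l - 1])
            # solid arrow: the recurrent state h[l] passes through untouched
            h[l] = np.tanh(Ws_in[l - 1] @ bottom + Ws_rec[l - 1] @ h[l])
        outputs.append(dropout(h[L]))  # one more dropout before the decoder/softmax
    return outputs

L, n, T = 2, 8, 5
Ws_in = [rng.standard_normal((n, n)) * 0.1 for _ in range(L)]
Ws_rec = [rng.standard_normal((n, n)) * 0.1 for _ in range(L)]
xs = [rng.standard_normal(n) for _ in range(T)]
ys = forward(xs, Ws_in, Ws_rec, L, n)
```

Note that `h[l]` is read before being overwritten inside the layer loop, so the recurrent connection never passes through `dropout`.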
1. LSTM units (Long Short-Term Memory)
First, the RNN dynamics can be written as

$$h_t^l = f\big(T_{n,n}\, h_t^{l-1} + T_{n,n}\, h_{t-1}^l\big),$$

where $h_t^l \in \mathbb{R}^n$ is the hidden state of layer $l$ at timestep $t$, $T_{n,n}$ denotes an affine transform ($Wx + b$ for some $W$, $b$), and $f$ is an element-wise nonlinearity.
For LSTM details, see 【Graves. Generating sequences with recurrent neural networks (2013)】.
A Long Short-Term Memory (LSTM) unit consists of an input gate i, an output gate o, a forget gate f, and a memory cell. The "long term" memory is stored in a vector of memory cells $c_t^l$.
The full equations are given in that paper; this paper uses the compact form

$$\begin{pmatrix} i \\ f \\ o \\ g \end{pmatrix} = \begin{pmatrix} \mathrm{sigm} \\ \mathrm{sigm} \\ \mathrm{sigm} \\ \tanh \end{pmatrix} T_{2n,4n} \begin{pmatrix} h_t^{l-1} \\ h_{t-1}^l \end{pmatrix}$$

$$c_t^l = f \odot c_{t-1}^l + i \odot g, \qquad h_t^l = o \odot \tanh(c_t^l),$$

where $\odot$ (the "circle with a dot", the Hadamard product) denotes element-wise multiplication.

The paper's modification applies the dropout operator $D$ only to the non-recurrent input:

$$\begin{pmatrix} i \\ f \\ o \\ g \end{pmatrix} = \begin{pmatrix} \mathrm{sigm} \\ \mathrm{sigm} \\ \mathrm{sigm} \\ \tanh \end{pmatrix} T_{2n,4n} \begin{pmatrix} D(h_t^{l-1}) \\ h_{t-1}^l \end{pmatrix}$$

where $D$ sets a random subset of its argument to zero; the recurrent state $h_{t-1}^l$ and the cell $c_{t-1}^l$ are never dropped.
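One LSTM step under this scheme can be sketched in NumPy as follows (a minimal sketch under assumed shapes; the names `lstm_step`, `h_below`, and the dropout rate are mine, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dropout(x, p=0.5):
    """Dropout operator D: zero each element with probability p (inverted scaling)."""
    return x * (rng.random(x.shape) >= p) / (1.0 - p)

def lstm_step(h_below, h_prev, c_prev, W, b):
    """One LSTM step; dropout hits only the non-recurrent input h_below
    (h_t^{l-1}); the recurrent state h_prev (h_{t-1}^l) is untouched."""
    n = h_prev.shape[0]
    # T_{2n,4n} applied to (D(h_t^{l-1}); h_{t-1}^l)
    z = np.concatenate([dropout(h_below), h_prev]) @ W + b
    i, f, o = (sigmoid(z[k * n:(k + 1) * n]) for k in range(3))  # input/forget/output gates
    g = np.tanh(z[3 * n:4 * n])                                  # candidate cell update
    c = f * c_prev + i * g          # c_t^l = f ⊙ c_{t-1}^l + i ⊙ g
    h = o * np.tanh(c)              # h_t^l = o ⊙ tanh(c_t^l)
    return h, c

n = 4
W = rng.standard_normal((2 * n, 4 * n)) * 0.1   # the 2n x 4n affine map
b = np.zeros(4 * n)
h, c = lstm_step(rng.standard_normal(n), np.zeros(n), np.zeros(n), W, b)
```

The only change relative to a standard LSTM step is the `dropout(h_below)` call; everything touching `h_prev` and `c_prev` is left as-is.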
The dropout operator corrupts the information carried by the units, forcing their intermediate computations to be more robust. At the same time, we do not want to erase all of the units' information, because the units remember events from many timesteps in the past. The figure below shows how information can flow from an event at timestep t-2 to the prediction at timestep t+2 under this dropout scheme.
We can see that the information is corrupted by the dropout operator exactly L+1 times, and this number is independent of the number of timesteps traversed by the information. Standard dropout, in contrast, perturbs the recurrent connections, which makes it difficult for the LSTM to learn to store information for long periods of time.
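A trivial count makes the L+1 claim concrete (my own illustration, not from the paper): the path crosses dropout once on the vertical hop into each of the L layers and once more before the softmax, while recurrent hops cross none, so the total never depends on how far the information travels in time.

```python
def path_hops(num_layers, steps):
    """Hops along a path that enters layer 1 at some timestep and exits at
    the softmax `steps` timesteps later: one vertical hop into each layer
    (dropout applied), `steps` recurrent hops within layers (no dropout),
    and one final hop into the softmax (dropout applied)."""
    return ["vertical"] * num_layers + ["recurrent"] * steps + ["softmax"]

def dropout_crossings(hops):
    # dropout sits on every non-recurrent hop
    return sum(h != "recurrent" for h in hops)

# from t-2 to t+2 in a 2-layer LSTM: 4 recurrent hops, still L+1 = 3 crossings
assert dropout_crossings(path_hops(num_layers=2, steps=4)) == 3
assert dropout_crossings(path_hops(num_layers=2, steps=400)) == 3
```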