[Deep Learning Paper Notes][Weight Initialization] Data-dependent Initializations of Convolutional Neural Networks
2016-09-21 14:25
Krähenbühl, Philipp, et al. "Data-dependent initializations of convolutional neural networks." arXiv preprint arXiv:1511.06856 (2015). [Citations: 10].
1 Attenuating Activations
[Idea] Initialize the weights W such that the activations have unit variance.
[Algorithm] See Alg. 1
[Analysis] The algorithm also works when nonlinearities (such as ReLU and pooling) follow the layers being normalized.
• All channels undergo the same transformation.
• So the output channels follow the same distribution whenever the input channels do.
• Any change in variance is corrected when the next layer is normalized.
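The idea of this section can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reproduction of Alg. 1: it uses fully connected layers as a stand-in for convolutions, and the names `data_dependent_init`, `preactivation_stats`, and the target mean `beta` are hypothetical. Each layer starts from random Gaussian weights, a batch of data is pushed through, and every output channel is rescaled and shifted so that its pre-activation has unit variance and mean `beta` on that batch.

```python
import numpy as np

def data_dependent_init(layer_sizes, batch, beta=0.0, seed=0):
    """Return (W, b) pairs whose pre-activations have unit variance on `batch`."""
    rng = np.random.default_rng(seed)
    params, z = [], batch
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        W = rng.standard_normal((fan_in, fan_out))   # random starting point
        pre = z @ W                                  # pre-activations with zero bias
        mu = pre.mean(axis=0)                        # per-channel mean on the batch
        sigma = pre.std(axis=0) + 1e-8               # per-channel std (eps avoids /0)
        W = W / sigma                                # unit variance per channel
        b = beta - mu / sigma                        # channel mean becomes beta
        params.append((W, b))
        z = np.maximum(z @ W + b, 0.0)               # ReLU output feeds the next layer
    return params

def preactivation_stats(params, batch):
    """Per-layer (mean, std) of the pre-activations, measured on `batch`."""
    stats, z = [], batch
    for W, b in params:
        pre = z @ W + b
        stats.append((pre.mean(axis=0), pre.std(axis=0)))
        z = np.maximum(pre, 0.0)
    return stats

# Deliberately badly scaled inputs: mean 3, standard deviation 5.
rng = np.random.default_rng(1)
batch = 5.0 * rng.standard_normal((512, 20)) + 3.0
params = data_dependent_init([20, 30, 10], batch)
stats = preactivation_stats(params, batch)
```

Because each layer is normalized on the output of the layers before it, the ReLU in between does not need special handling: whatever it does to the variance is absorbed when the following layer is rescaled.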
2 Global Scaling
[Motivation] Within each layer, the outputs now follow the same distribution. But what about the global scaling of the whole network?
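[Idea] We want the weights in each layer to change at roughly the same rate. That is, the size of a layer's gradient relative to the size of its weights should be roughly the same in every layer.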
[Algorithm] See Alg. 2
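The rescaling trick behind this section can be illustrated on a two-layer ReLU network. The sketch below is not Alg. 2 itself, which is not reproduced in these notes; it only shows the underlying identity. It assumes a rate proxy defined as the squared gradient norm divided by the squared weight norm of a layer, and the helper names (`forward_backward`, `rates`, `rescale`) are hypothetical. Scaling layer 1 by r > 0 and layer 2 by 1/r leaves the network output unchanged, divides the rate of layer 1 by r^4, and multiplies the rate of layer 2 by r^4, so r = (c1/c2)^(1/8) equalizes the two rates.

```python
import numpy as np

def forward_backward(params, x, y):
    """Two-layer ReLU net with loss 0.5 * mean squared error; returns output and weight grads."""
    (W1, b1), (W2, b2) = params
    a = x @ W1 + b1
    h = np.maximum(a, 0.0)
    out = h @ W2 + b2
    delta = (out - y) / len(x)                 # d loss / d out
    gW2 = h.T @ delta                          # gradient w.r.t. W2
    gW1 = x.T @ ((delta @ W2.T) * (a > 0))     # gradient w.r.t. W1 through the ReLU mask
    return out, [gW1, gW2]

def rates(params, grads):
    """Assumed rate proxy per layer: ||grad||^2 / ||W||^2."""
    return [np.sum(g ** 2) / np.sum(W ** 2) for (W, _), g in zip(params, grads)]

def rescale(params, r):
    """Scale layer 1 by r and undo it in layer 2 (function-preserving for r > 0)."""
    (W1, b1), (W2, b2) = params
    return [(r * W1, r * b1), (W2 / r, b2)]

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 8))
y = rng.standard_normal((64, 3))
# Deliberately mismatched scales between the two layers.
params = [(0.1 * rng.standard_normal((8, 16)), np.zeros(16)),
          (2.0 * rng.standard_normal((16, 3)), np.zeros(3))]

out, grads = forward_backward(params, x, y)
c1, c2 = rates(params, grads)

r = (c1 / c2) ** 0.125                         # r^8 = c1 / c2 equalizes the two rates
new_params = rescale(params, r)
new_out, new_grads = forward_backward(new_params, x, y)
n1, n2 = rates(new_params, new_grads)
```

With more than two layers a single closed-form step like this is no longer available, which is one reason an iterative correction is a natural choice there.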
3 References
[1]. J. Donahue. https://cs.stanford.edu/~jhoffman/yahooJapan_Mar2016_talks/3-jeff_talk.pdf.
[2]. T. Koller. http://lmb.informatik.uni-freiburg.de/lectures/seminar_brox/seminar_ss16/Data_dependent_initializations_split.pdf.