
[Deep Learning Paper Notes] [Weight Initialization] Data-dependent Initializations of Convolutional Neural Networks

2016-09-21 14:25
Krähenbühl, Philipp, et al. “Data-dependent initializations of convolutional neural networks.” arXiv preprint arXiv:1511.06856 (2015). [Citations: 10].

1 Attenuating Activations

[Idea] Initialize the weights W such that the activations of each layer have unit variance.

[Algorithm] See Alg. 1



[Analysis] Because the scaling is estimated empirically from a batch of data, this algorithm works even in the presence of nonlinearities (such as ReLU and pooling):

• All channels within a layer undergo the same transformation.

• Hence, if the input channels follow the same distribution, the output channels do as well.

• Any change in variance a layer introduces is corrected when the next layer is initialized.
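The within-layer step above can be sketched in NumPy: draw random weights, push a data batch through, then rescale each output channel to zero mean and unit variance. This is a minimal sketch for a fully connected layer; the shapes, names, and helper function are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(W, b, X):
    """Rescale W and b per output channel so the pre-activations on a
    data batch X have zero mean and unit variance (illustrative sketch)."""
    Z = X @ W.T + b                 # pre-activations, shape (batch, out_channels)
    mu = Z.mean(axis=0)             # per-channel mean
    sigma = Z.std(axis=0) + 1e-8    # per-channel std (epsilon avoids divide-by-zero)
    W_new = W / sigma[:, None]      # unit variance per channel
    b_new = (b - mu) / sigma        # zero mean per channel
    return W_new, b_new

# usage: propagate a batch layer by layer, fixing each layer in turn
X = rng.normal(size=(256, 100))     # data batch
W = rng.normal(size=(50, 100))      # random initial weights
b = np.zeros(50)
W, b = init_layer(W, b, X)
Z = X @ W.T + b                     # per-channel mean ~0, std ~1 on this batch
H = np.maximum(Z, 0)               # ReLU output, fed to the next layer
```

Since the statistics are measured after whatever layers precede it, the rescaling remains valid no matter which nonlinearities sit in between.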

2 Global Scaling

[Motivation] The outputs now follow the same distribution within each layer, but what about the global scaling of the whole net?

[Idea] We want the weights in each layer to change at roughly the same rate during training. I.e., we want the per-layer change rate ||∂L/∂W_i|| / ||W_i|| to be roughly equal across all layers i.


[Algorithm] See Alg. 2
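The change rate that the global-scaling step tries to equalize can be measured directly. Below is a sketch on a tiny two-layer ReLU net with manual backprop; the network, loss, and variable names are illustrative assumptions, not the paper's setup, and the sketch only diagnoses the rates rather than running the full rescaling algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

# tiny two-layer ReLU net, Y = relu(X W1^T) W2^T, mean squared loss
X = rng.normal(size=(64, 20))           # data batch
W1 = rng.normal(size=(30, 20)) * 0.1
W2 = rng.normal(size=(10, 30)) * 0.1

# forward pass
Z1 = X @ W1.T                            # (64, 30)
H = np.maximum(Z1, 0)                    # ReLU
Y = H @ W2.T                             # (64, 10)
loss = 0.5 * (Y ** 2).sum() / len(X)

# manual backprop
dY = Y / len(X)                          # dL/dY
G2 = dY.T @ H                            # dL/dW2, shape (10, 30)
dH = dY @ W2                             # (64, 30)
G1 = (dH * (Z1 > 0)).T @ X               # dL/dW1, shape (30, 20)

# per-layer change rates ||dL/dW_i|| / ||W_i|| (Frobenius norms);
# the global-scaling step wants these to be roughly equal
r1 = np.linalg.norm(G1) / np.linalg.norm(W1)
r2 = np.linalg.norm(G2) / np.linalg.norm(W2)
print(f"rate layer1 = {r1:.4f}, rate layer2 = {r2:.4f}")
```

If the measured rates differ by orders of magnitude, some layers would barely move while others dominate the updates, which is exactly the imbalance the global rescaling corrects.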


