
[Deep Learning Paper Notes] [Weight Initialization] Data-dependent Initializations of Convolutional Neural Networks

2016-09-21 14:25
Krähenbühl, Philipp, et al. “Data-dependent initializations of convolutional neural networks.” arXiv preprint arXiv:1511.06856 (2015). [Citations: 10].

1 Attenuating Activations

[Idea] Initialize the weights W such that the activations of each layer have unit variance.

[Algorithm] See Alg. 1



[Analysis] Because the scaling is estimated empirically from a batch of data, this algorithm works even in the presence of nonlinearities (such as ReLU and pooling):

• All channels within a layer undergo the same transformation.

• Hence, if the input channels follow the same distribution, the output channels do as well.

• Any change in variance a layer introduces is corrected when the next layer is initialized.
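The within-layer step above can be sketched in NumPy: draw random weights, push a data batch through, then rescale each output channel to zero mean and unit variance. This is a minimal sketch for a fully connected layer; the shapes, names, and helper function are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(W, b, X):
    """Rescale W and b per output channel so the pre-activations on a
    data batch X have zero mean and unit variance (illustrative sketch)."""
    Z = X @ W.T + b                 # pre-activations, shape (batch, out_channels)
    mu = Z.mean(axis=0)             # per-channel mean
    sigma = Z.std(axis=0) + 1e-8    # per-channel std (epsilon avoids divide-by-zero)
    W_new = W / sigma[:, None]      # unit variance per channel
    b_new = (b - mu) / sigma        # zero mean per channel
    return W_new, b_new

# usage: propagate a batch layer by layer, fixing each layer in turn
X = rng.normal(size=(256, 100))     # data batch
W = rng.normal(size=(50, 100))      # random initial weights
b = np.zeros(50)
W, b = init_layer(W, b, X)
Z = X @ W.T + b                     # per-channel mean ~0, std ~1 on this batch
H = np.maximum(Z, 0)               # ReLU output, fed to the next layer
```

Since the statistics are measured after whatever layers precede it, the rescaling remains valid no matter which nonlinearities sit in between.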

2 Global Scaling

[Motivation] The outputs now follow the same distribution within each layer, but what about the global scaling of the whole net?

[Idea] We want the weights in each layer to change at roughly the same rate during training. I.e., we want the per-layer change rate ||∂L/∂W_i|| / ||W_i|| to be roughly equal across all layers i.


[Algorithm] See Alg. 2
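The change rate that the global-scaling step tries to equalize can be measured directly. Below is a sketch on a tiny two-layer ReLU net with manual backprop; the network, loss, and variable names are illustrative assumptions, not the paper's setup, and the sketch only diagnoses the rates rather than running the full rescaling algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

# tiny two-layer ReLU net, Y = relu(X W1^T) W2^T, mean squared loss
X = rng.normal(size=(64, 20))           # data batch
W1 = rng.normal(size=(30, 20)) * 0.1
W2 = rng.normal(size=(10, 30)) * 0.1

# forward pass
Z1 = X @ W1.T                            # (64, 30)
H = np.maximum(Z1, 0)                    # ReLU
Y = H @ W2.T                             # (64, 10)
loss = 0.5 * (Y ** 2).sum() / len(X)

# manual backprop
dY = Y / len(X)                          # dL/dY
G2 = dY.T @ H                            # dL/dW2, shape (10, 30)
dH = dY @ W2                             # (64, 30)
G1 = (dH * (Z1 > 0)).T @ X               # dL/dW1, shape (30, 20)

# per-layer change rates ||dL/dW_i|| / ||W_i|| (Frobenius norms);
# the global-scaling step wants these to be roughly equal
r1 = np.linalg.norm(G1) / np.linalg.norm(W1)
r2 = np.linalg.norm(G2) / np.linalg.norm(W2)
print(f"rate layer1 = {r1:.4f}, rate layer2 = {r2:.4f}")
```

If the measured rates differ by orders of magnitude, some layers would barely move while others dominate the updates, which is exactly the imbalance the global rescaling corrects.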


