
A Collection of Learning Resources on Deep Neural Networks

2013-09-04 14:19, 501 views

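Part 1: Summary of a tutorial. Reposted from: http://blog.csdn.net/txdb/article/details/6766373

Two days ago I saw a news item saying that Microsoft had used deep neural networks to substantially improve speech recognition accuracy: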

http://research.microsoft.com/en-us/news/features/speechrecognition-082911.aspx

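That got me interested in deep learning. I haven't read Microsoft's paper, so I don't yet know whether they use a similar technique.

This author has studied deep learning: http://fantasticinblur.iteye.com/blog/1131640

This site looks very good: http://deeplearning.net/. I haven't gone through it carefully yet.

Hinton's homepage: http://www.cs.toronto.edu/~hinton/

One of these has a tutorial that comes with code. Unfortunately the code is in Python; I tried it, but I won't become fluent in Python that quickly.

The tutorial mainly covers the following topics: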

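1. Logistic regression: logistic regression is built on a linear model whose output is squashed into the range (0, 1). The squashing is done by the sigmoid function, i.e. the S-shaped function. The linear part is a linear equation relating input to output: for an n-dimensional input it is a first-degree equation in n variables, so it has n+1 parameters, the extra one being the bias. After the sigmoid, the output is a number in (0, 1). Fit the parameters to the labeled training samples, and the model directly gives the output f for a test sample. For classification a small change is needed: for two classes, for example, decide class 1 if f > 0.5 and class 2 otherwise. The question is how to adjust the parameters so that the result is correct for all training samples (or for as many of them as possible). This is where the concept of a loss function comes in; going further gets complicated. With this, one can more or less build a single-layer neural network with several output units. Two functions appear here, softmax and argmax, which I had not seen described this way before.

softmax is a "soft" maximum: a normalization that puts every output in (0, 1), with the outputs summing to 1. For example, take a=3, b=6, c=9. Then max = 9. softmax exponentiates before normalizing, so the value for c is e^9/(e^3+e^6+e^9) ≈ 0.950 (not the plain ratio 9/(3+6+9) = 0.5). argmax is the argument at which the max is attained, so argmax = c.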
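
To make the a=3, b=6, c=9 example concrete, here is a minimal sketch in plain Python. The helper names `softmax` and `argmax` are my own, not taken from the tutorial's code. Note that softmax exponentiates each score before normalizing, so the largest score ends up with most of the probability mass.

```python
import math

def softmax(scores):
    """Map real-valued scores to probabilities that sum to 1."""
    m = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

scores = [3.0, 6.0, 9.0]               # a, b, c
probs = softmax(scores)
print([round(p, 4) for p in probs])    # [0.0024, 0.0473, 0.9503]
print(argmax(probs))                   # 2, i.e. the position of c
```

In a classifier, softmax turns the per-class scores into probabilities, and argmax picks the predicted class.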

Reference link: logistic regression http://hi.baidu.com/hehehehello/blog/item/0b59cd803bf15ece9023d96e.html
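
As a rough sketch of the logistic regression described above (n weights plus one bias, sigmoid output, threshold at 0.5), the following fits the parameters by batch gradient descent on the cross-entropy loss. The toy data set, the learning rate and the epoch count are made up for illustration; the tutorial's own code uses a different setup.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """f = sigmoid(w . x + b): n weights plus one bias."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train(samples, labels, lr=0.5, epochs=2000):
    """Batch gradient descent on the mean cross-entropy loss."""
    n = len(samples[0])
    m = len(samples)
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in zip(samples, labels):
            err = predict_proba(w, b, x) - y   # gradient of the loss w.r.t. w . x + b
            for i in range(n):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / m for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / m
    return w, b

# Toy, made-up training set: the label is 1 when x0 + x1 is large.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [2.0, 2.0], [0.5, 0.2]]
y = [0, 0, 0, 1, 1, 0]
w, b = train(X, y)
preds = [1 if predict_proba(w, b, x) > 0.5 else 0 for x in X]
print(preds)
```

The loss function is what tells the update which direction to move each parameter; here that shows up as the `err` term.
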
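2. Multilayer perceptron: a single layer is of course not enough, so a multilayer neural network is needed. Personally I find BP (backpropagation) so remarkable that I now think of AI as a matter of neural network architecture plus BP. BP solves the problem of propagating the error feedback backward through the layers.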
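
To show what "propagating the feedback" means, here is a small sketch of backpropagation for a network with one hidden layer, checked against a finite-difference gradient. The network size, the squared-error loss and all parameter values are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, x):
    """One hidden layer of sigmoid units, one sigmoid output."""
    W1, b1, W2, b2 = params
    h = [sigmoid(sum(W1[j][i] * x[i] for i in range(len(x))) + b1[j])
         for j in range(len(W1))]
    out = sigmoid(sum(W2[j] * h[j] for j in range(len(h))) + b2)
    return h, out

def loss(params, x, t):
    return 0.5 * (forward(params, x)[1] - t) ** 2

def backprop(params, x, t):
    """Gradients of the squared error with respect to every parameter."""
    W1, b1, W2, b2 = params
    h, out = forward(params, x)
    delta_out = (out - t) * out * (1.0 - out)      # error signal at the output
    gW2 = [delta_out * hj for hj in h]
    gb2 = delta_out
    # Send the output error back through W2 to each hidden unit.
    delta_h = [delta_out * W2[j] * h[j] * (1.0 - h[j]) for j in range(len(h))]
    gW1 = [[delta_h[j] * xi for xi in x] for j in range(len(h))]
    gb1 = delta_h
    return gW1, gb1, gW2, gb2

params = ([[0.1, -0.2], [0.4, 0.3]], [0.0, 0.1], [0.5, -0.6], 0.2)
x, t = [1.0, 0.5], 1.0
gW1, gb1, gW2, gb2 = backprop(params, x, t)

# Check one hidden-layer weight with a central finite difference.
eps = 1e-6
W1_plus = [[0.1 + eps, -0.2], [0.4, 0.3]]
W1_minus = [[0.1 - eps, -0.2], [0.4, 0.3]]
numeric = (loss((W1_plus,) + params[1:], x, t)
           - loss((W1_minus,) + params[1:], x, t)) / (2 * eps)
print(abs(numeric - gW1[0][0]) < 1e-7)
```

The hidden-layer gradient is computed only from the output error and the weights in between, which is the whole trick.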

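3. Deep convolutional networks: convolutional networks I am familiar with. This part covers deep convolutional networks, and the example given is LeNet-5. I really hadn't realized that it counts as deep. LeNet's shared weights work well; combined with receptive fields, they give the network a degree of feature-extraction ability. I keep wondering whether convolution would be more powerful if the kernels could take arbitrary shapes.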
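
A minimal sketch of weight sharing and receptive fields: one small kernel is slid over the whole image, so every output position reuses the same weights and looks only at its own local patch. The image and the kernel are made up; as in most CNN code, the operation is technically cross-correlation (the kernel is not flipped).

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Receptive field of output (r, c): the kh x kw patch at (r, c).
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out

image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
vertical_edge = [[-1, 1], [-1, 1]]     # responds to a dark-to-bright step
feature_map = conv2d_valid(image, vertical_edge)
print(feature_map)   # [[0.0, 2.0, 0.0], [0.0, 2.0, 0.0], [0.0, 2.0, 0.0]]
```

The feature map is nonzero only where the edge is, which is the "feature extraction" effect mentioned above.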

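I am not very familiar with what follows. I have looked at Hopfield networks before, but never understood them thoroughly, so I am starting over from Hopfield. "The Hopfield Model" is a decent PDF: it explains Hopfield networks clearly from the basics up and also covers some of their well-known applications. I don't know which book the chapter comes from.

sgn function, step function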

The symmetry of the weight matrix and a zero diagonal are thus necessary conditions for the convergence of an asynchronous totally connected network to a stable state. These conditions are also sufficient, as we show later.
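
Here is a small sketch of a Hopfield network that satisfies both conditions (symmetric weights, zero diagonal) and is updated asynchronously with sgn. The Hebbian storage rule, the stored pattern and the convention sgn(0) = +1 are my assumptions for the example, not details taken from the PDF.

```python
def sgn(x):
    """Sign function, using the convention sgn(0) = +1."""
    return 1 if x >= 0 else -1

def train_hebbian(patterns):
    """Hebbian weights: symmetric, with a zero diagonal."""
    n = len(patterns[0])
    W = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += p[i] * p[j] / len(patterns)
    return W

def recall(W, state, max_sweeps=10):
    """Asynchronous updates: one unit at a time, until nothing changes."""
    s = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            new = sgn(sum(W[i][j] * s[j] for j in range(len(s))))
            if new != s[i]:
                s[i], changed = new, True
        if not changed:
            break
    return s

stored = [1, -1, 1, -1, 1, -1, 1, -1]
W = train_hebbian([stored])
noisy = list(stored)
noisy[0], noisy[3] = -noisy[0], -noisy[3]   # flip two of the eight units
print(recall(W, noisy) == stored)           # True
```

The network settles back into the stored pattern, which is the "pattern memory" behavior.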

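It just occurred to me that the connection structure of a Hopfield network differs from that of an MLP. The human cerebral cortex has many layers; a Hopfield network models structure within a layer, while an MLP models structure between layers. Hopfield networks suit pattern memory, whereas an MLP has more of a flavor of abstraction. Next come the BM and the RBM.
Put simply, a BM (Boltzmann machine) is a Hopfield network combined with simulated annealing. A BM also has a visible layer and a hidden layer. Its range of applications, however, seems somewhat different from Hopfield's. An RBM is a BM whose structure has been restricted in a certain way.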
A restricted Boltzmann machine (Smolensky, 1986) consists of a layer of visible units and a layer of hidden units with no visible-visible or hidden-hidden connections.
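
Because there are no connections within a layer, the hidden units are conditionally independent given the visible units (and vice versa), so a whole layer can be sampled in one step. A rough sketch for binary 0/1 units follows; the weights, the biases and the names `hidden_probs` and `visible_probs` are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hidden_probs(v, W, c):
    """p(h_j = 1 | v) for each hidden unit j; W[i][j] links visible i to hidden j."""
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]

def visible_probs(h, W, b):
    """p(v_i = 1 | h) for each visible unit i."""
    return [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]

def sample(probs, rng):
    """Draw independent binary states from per-unit probabilities."""
    return [1 if rng.random() < p else 0 for p in probs]

# Hypothetical parameters: 3 visible units, 2 hidden units.
W = [[2.0, -1.0], [0.0, 1.0], [-2.0, 0.5]]
b = [0.0, 0.0, 0.0]
c = [0.0, -1.0]

rng = random.Random(0)
v0 = [1, 0, 1]
ph = hidden_probs(v0, W, c)
h0 = sample(ph, rng)                        # sample the whole hidden layer at once
v1 = sample(visible_probs(h0, W, b), rng)   # then the whole visible layer
print(ph)
```

Alternating these two steps is block Gibbs sampling, which is what makes RBMs much cheaper to work with than a general BM.
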
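For details see http://www.scholarpedia.org/article/Boltzmann_machine, although Hinton's 1985 paper is more detailed. I am still not clear on how to compute <s_i s_j>_model (p_ij) in the BM learning algorithm; I will look at it again alongside the code when I have time. The article says "the learning signal is very noisy because it is the difference of two sampled expectations". Going by that hint and the surrounding text, it seems that p_ij is measured during a process in which the temperature T is held constant at 1... but I still don't fully understand it. The article also says: "where <.>data is an expected value in the data distribution and <.>model is an expected value when the Boltzmann machine is sampling state vectors from its equilibrium distribution at a temperature of 1."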
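
One way to see what <s_i s_j>_model means: the weight update is proportional to <s_i s_j>_data minus <s_i s_j>_model, and the model term is just the average of s_i s_j over states drawn from the network's equilibrium distribution at T = 1. The sketch below computes it for a tiny, hypothetical 3-unit Boltzmann machine with 0/1 units, once exactly by enumerating all states and once by Gibbs sampling. All parameter values are made up.

```python
import itertools
import math
import random

# Hypothetical 3-unit Boltzmann machine: symmetric weights, zero diagonal.
W = [[0.0, 1.0, -0.5],
     [1.0, 0.0, 0.8],
     [-0.5, 0.8, 0.0]]
b = [0.1, -0.2, 0.0]
T = 1.0
n = len(b)

def energy(s):
    pair = sum(W[i][j] * s[i] * s[j] for i in range(n) for j in range(i + 1, n))
    return -pair - sum(b[i] * s[i] for i in range(n))

def exact_correlation(i, j):
    """<s_i s_j>_model by summing over all 2^n states (feasible only for tiny n)."""
    states = list(itertools.product([0, 1], repeat=n))
    weights = [math.exp(-energy(s) / T) for s in states]
    Z = sum(weights)
    return sum(w * s[i] * s[j] for w, s in zip(weights, states)) / Z

def gibbs_correlation(i, j, sweeps=20000, burn_in=1000, seed=0):
    """Estimate <s_i s_j>_model by Gibbs sampling at temperature T."""
    rng = random.Random(seed)
    s = [rng.randint(0, 1) for _ in range(n)]
    total = 0
    for sweep in range(sweeps + burn_in):
        for k in range(n):
            field = b[k] + sum(W[k][m] * s[m] for m in range(n))
            p_on = 1.0 / (1.0 + math.exp(-field / T))
            s[k] = 1 if rng.random() < p_on else 0
        if sweep >= burn_in:          # discard samples taken before equilibrium
            total += s[i] * s[j]
    return total / sweeps

exact = exact_correlation(0, 1)
estimate = gibbs_correlation(0, 1)
print(round(exact, 3), round(estimate, 3))
```

The sampled estimate only approximates the exact value, and the data term is estimated the same way, so their difference is noisy. That seems to be what the quoted sentence about "two sampled expectations" is getting at.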

Next on the reading list: contrastive divergence, Gibbs sampling, sigmoid belief networks.

4. Autoencoders

5. Stacked denoising autoencoders

6. Restricted Boltzmann machines

7. Deep belief networks

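Part 2: Resources (related papers)

Reposted from: http://fantasticinblur.iteye.com/blog/1131640

My graduation project was on DBNs. I went through a fair amount of material along the way, and this is a summary of it.

These papers can be downloaded by searching Google Scholar.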

Arel, I., Rose, D. C. and Karnowski, T. P. Deep machine learning - a new frontier in artificial intelligence research. Computational Intelligence Magazine, IEEE, vol. 5, pp. 13-18, 2010.

An introductory article on deep learning; good as a starting point.

Bengio, Y. Learning deep architectures for AI. Foundations and Trends in Machine Learning, vol. 2, pp. 1-127, 2009.

A classic deep learning paper that pulls the field together. It can serve as study material for deep learning.

Hinton, G. E. Learning multiple layers of representation. Trends in Cognitive Sciences, vol. 11, pp. 428-434, 2007.

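The key algorithms of DBNs can be grasped here without much mathematics. The paper is plainly written and short, and suits beginners trying to understand DBNs.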

Hinton, G. E. To recognize shapes, first learn to generate images. Technical Report UTML TR 2006-003, University of Toronto, 2006.

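A University of Toronto technical report. Recommended reading.

Hinton, G. E., Osindero, S. and Teh, Y. W. A fast learning algorithm for deep belief nets. Neural Computation, vol. 18, pp. 1527-1554, 2006.

The founding paper on DBNs and a hugely significant one; be sure to read it carefully several times. In it, the authors explain every aspect of DBNs in detail, argue their equivalence to a stack of RBMs, and then derive the learning algorithm for DBNs.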

Hinton, G. E. and Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science, vol. 313, no. 5786, pp. 504–507, 2006.

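A major paper in Science. It can be counted as a milestone: it marks the point at which deep learning finally had an efficient, practical algorithm.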

Hinton, G. E. A practical guide to training restricted boltzmann machines. Technical Report UTML TR 2010-003, University of Toronto, 2010.

A set of best practices for training RBMs.

Erhan, D., Manzagol, P. A., Bengio, Y., Bengio, S. and Vincent, P. The difficulty of training deep architectures and the effect of unsupervised pretraining. In The Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 153–160, 2009.

Erhan, D., Courville, A., Bengio, Y. and Vincent, P. Why Does Unsupervised Pre-training Help Deep Learning? In the 13th International Conference on Artificial Intelligence and Statistics (AISTATS), Chia Laguna Resort, Sardinia, Italy, 2010.

These explain the role of unsupervised pre-training. The two papers are best read together.

The following blog post gives a more comprehensive set of materials. Its author is from Fudan University and now seems to be working at the Yahoo Labs research institute in Beijing.

http://demonstrate.ycool.com/post.3006074.html
