
A Step-by-Step Walkthrough of Neural Network Fundamentals: the Backpropagation Algorithm

The backpropagation algorithm is the foundation of neural networks and essential background knowledge. The points that need careful understanding are called out in the numbered notes below.

The project describes the teaching process of a multi-layer neural network employing the backpropagation algorithm. To illustrate this process, the three-layer neural network with two inputs and one output shown in the picture below is used. Clearly, there are three layers in total, two inputs, and one output.



Each neuron is composed of two units. The first unit adds the products of the weight coefficients and the input signals. The second unit realises a nonlinear function, called the neuron activation function. Signal e is the adder output signal, and y = f(e) is the output signal of the nonlinear element. Signal y is also the output signal of the neuron.

A single neuron forms the weighted sum e of its inputs and passes it through a nonlinear activation function f(e) to produce the output value y. See my other blog post for more on activation functions.
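As a minimal sketch of such a neuron in Python (the sigmoid is my assumption; the article leaves the activation function f unspecified):

```python
import math

def sigmoid(e):
    # One common choice of activation function f(e); the article does
    # not fix a particular f, so sigmoid is an assumption here.
    return 1.0 / (1.0 + math.exp(-e))

def neuron(weights, inputs):
    # First unit: adder, e = sum of weight * input products.
    e = sum(w * x for w, x in zip(weights, inputs))
    # Second unit: nonlinear element, y = f(e).
    return sigmoid(e)

# Example: a neuron with two inputs.
y = neuron([0.5, -0.3], [1.0, 2.0])
```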



To teach the neural network we need a training data set. The training data set consists of input signals (x1 and x2) assigned with the corresponding target (desired output) z. Network training is an iterative process. In each iteration the weight coefficients of the nodes are modified using new data from the training data set. The modification is calculated using the algorithm described below: each teaching step starts with forcing both input signals from the training set onto the network. After this stage we can determine the output signal values for each neuron in each network layer. The pictures below illustrate how the signal propagates through the network. Symbols w(xm)n represent the weights of the connections between network input xm and neuron n in the input layer. Symbols yn represent the output signal of neuron n.

1. We need a training data set that includes the desired outputs: inputs x1, x2 and target z.

2. Training the neural network is an iterative process.

3. In each iteration, fresh data is taken from the training set.

4. w(xm)n denotes a weight, xm an input, and yn the output of neuron n; pay attention to the subscripts.

5. The figures below are easy to follow; a code sketch of the forward pass is also given after this list.
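To make the notation concrete, here is a sketch of one forward sweep through the pictured network (three input-layer neurons fed by x1 and x2, two hidden neurons, one output neuron; the dictionary-based weight layout and the sigmoid activation are my assumptions):

```python
import math

def f(e):
    # Assumed sigmoid activation, as in the neuron sketch above.
    return 1.0 / (1.0 + math.exp(-e))

def forward(x1, x2, w_x, w):
    """w_x[m][n] plays the role of w(xm)n, w[m][n] the role of wmn."""
    y = {}
    # Input layer: neurons 1-3 each see both network inputs.
    for n in (1, 2, 3):
        y[n] = f(w_x[1][n] * x1 + w_x[2][n] * x2)
    # Hidden layer: neurons 4-5 are fed by y1..y3.
    for n in (4, 5):
        y[n] = f(sum(w[m][n] * y[m] for m in (1, 2, 3)))
    # Output layer: neuron 6 is fed by y4 and y5; y[6] is the network output.
    y[6] = f(w[4][6] * y[4] + w[5][6] * y[5])
    return y
```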








Propagation of signals through the hidden layer. Symbols wmn represent the weights of the connections between the output of neuron m and the input of neuron n in the next layer.

Note that yn, the output of neuron n, plays the same role for the next layer as an input xn.





Propagation of signals through the output layer.

The output layer follows in the same way.



In the next algorithm step the output signal of the network, y, is compared with the desired output value (the target) found in the training data set. The difference is called the error signal d of the output layer neuron.

1. z is the target and y is the neuron's output; the difference d = z − y is the error signal.

2. Once the error signal is obtained, a problem stalled neural network research for many years: nobody knew how to use the error signal to train the inner layers, until the idea of backpropagation was proposed.



It is impossible to compute the error signal for internal neurons directly, because the output values of these neurons are unknown. For many years an effective method for training multilayer networks was unknown. Only in the middle of the eighties was the backpropagation algorithm worked out. The idea is to propagate the error signal d (computed in a single teaching step) back to all neurons whose output signals were inputs for the discussed neuron.

1. Computing the error signal directly inside the network is impossible, because there is no known desired output for the internal neurons.

2. For many years the method for training multilayer networks was likewise unknown. Only in the mid-1980s was the backpropagation algorithm worked out.

3. The idea of the algorithm: the error signal d is propagated back to all neurons whose outputs fed the neuron in question.

4. Keeping the weights unchanged, the error is propagated back to each neuron through the same weighted sums.





The weight coefficients wmn used to propagate the errors back are equal to those used when computing the output value. Only the direction of data flow is changed (signals are propagated from outputs to inputs one after the other). This technique is used for all network layers. If the propagated errors come from several neurons, they are added. The illustration is below:

1. The weights stay the same.

2. The error is propagated back to all neurons, and errors arriving from several neurons are added; see the sketch below.
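A sketch of this backward sweep for the same example network (numbering and weight layout as in the forward-pass sketch above; d[n] is the error signal of neuron n):

```python
def backpropagate(z, y, w):
    # Output layer: the error signal is the target minus the output.
    d = {6: z - y[6]}
    # Hidden layer: the same weights w46, w56 carry the error backwards.
    for n in (4, 5):
        d[n] = w[n][6] * d[6]
    # Input layer: errors arriving from both hidden neurons are added.
    for n in (1, 2, 3):
        d[n] = w[n][4] * d[4] + w[n][5] * d[5]
    return d
```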







When the error signal for each neuron has been computed, the weight coefficients of each neuron's input connections may be modified. In the formulas below, df(e)/de represents the derivative of the activation function of the neuron whose weights are modified.

1. The errors of all neurons have been computed.

2. Note that the weights will now be modified.

3. df(e)/de is the derivative of the activation function; multiply it by the error signal, the step size, and the input signal, then add the previous weight to obtain the new weight value (see the sketch after this list).

4. OK, why do we do this? Because adjusting the weights this way reduces the error signal.

5. In fact, the error signal can be expressed as a composite function of the weights we are solving for. We call this composite function of the weights the cost function. In general the cost function is non-convex and has multiple local minima.

6. How do we approach the minimum of the cost function? The usual approach is to take the partial derivative with respect to each weight via the chain rule for composite functions, find the direction of negative gradient, and multiply it by the step size; this picks the move that decreases the cost function fastest. Adding it to the previous weight gives the new weight.

7. Computing each new weight this way, we find that the final error signal shrinks a little.

8. Repeating the propagation process above, the error signal approaches 0. Setting aside the properties of the data, the network is a universal approximator.
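Putting the update rule into code for the same sketch network: each weight moves by eta * d * df(e)/de * (input of that connection). With the sigmoid assumed above, df(e)/de = y * (1 − y). This is a sketch under those assumptions, not the article's exact formulas:

```python
def update_weights(eta, x1, x2, y, d, w_x, w):
    def dfde(n):
        # Derivative of the assumed sigmoid, expressed via its output.
        return y[n] * (1.0 - y[n])
    # Input layer: connection inputs are x1 and x2.
    for n in (1, 2, 3):
        w_x[1][n] += eta * d[n] * dfde(n) * x1
        w_x[2][n] += eta * d[n] * dfde(n) * x2
    # Hidden layer: connection inputs are the first-layer outputs.
    for n in (4, 5):
        for m in (1, 2, 3):
            w[m][n] += eta * d[n] * dfde(n) * y[m]
    # Output layer: inputs are the hidden-layer outputs y4, y5.
    for m in (4, 5):
        w[m][6] += eta * d[6] * dfde(6) * y[m]
```

One full teaching step then chains the three sketches: y = forward(...), d = backpropagate(...), update_weights(...), repeated over the training set until the error is small enough.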













Coefficient h affects the network teaching speed. There are a few techniques to select this parameter. The first method is to start the teaching process with a large value of the parameter. While the weight coefficients are being established, the parameter is gradually decreased. The second, more complicated, method starts teaching with a small parameter value. During the teaching process the parameter is increased as the teaching advances, and then decreased again in the final stage. Starting the teaching process with a low parameter value makes it possible to determine the signs of the weight coefficients.

1. The choice of step size affects training speed.

2. The first method: start with a large value and gradually decrease the step size during training.

3. The second method: start with a small value, increase the step size as training advances, then decrease it again; both schedules are sketched after this list.
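Both schedules are easy to sketch; the linear ramps and the concrete values below are illustrative assumptions, not prescriptions from the text:

```python
def eta_method1(step, total, eta_max=0.5, eta_min=0.01):
    # Method 1: start large, decrease gradually as the weights settle.
    return eta_max - (eta_max - eta_min) * step / total

def eta_method2(step, total, eta_min=0.01, eta_peak=0.5, warmup=0.2):
    # Method 2: start small, increase while teaching advances,
    # then decrease again in the final stage.
    peak_step = max(1, int(total * warmup))
    if step < peak_step:
        return eta_min + (eta_peak - eta_min) * step / peak_step
    return eta_peak - (eta_peak - eta_min) * (step - peak_step) / (total - peak_step)
```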

The end.

These are only my own views; if anything is amiss, feel free to leave a comment.

References

Ryszard Tadeusiewicz, "Sieci neuronowe", Kraków 1992.