
deeplearning -- learning a simple classifier

2015-04-01 09:48

1. Zero-one loss

Our goal is to make the number of misclassifications (the zero-one loss) as small as possible:

$$\ell_{0,1} = \sum_{i=0}^{|\mathcal{D}|} I_{f(x^{(i)}) \neq y^{(i)}}, \qquad f(x) = \operatorname{argmax}_k P(Y = k \mid x, \theta)$$
f(x) returns the most probable class for the input under the current parameters theta. In other words, we predict f(x) from x; if that value equals y the prediction is correct, otherwise it counts as an error.

# zero_one_loss is a Theano variable representing a symbolic
# expression of the zero one loss ; to get the actual value this
# symbolic expression has to be compiled into a Theano function (see
# the Theano tutorial for more details)
zero_one_loss = T.sum(T.neq(T.argmax(p_y_given_x, axis=1), y))
# T.neq(a, b) plays the role of the indicator function I: it returns 1 where the two
# values differ ("neq" = not equal); T.argmax(..., axis=1) picks the most probable
# class for each example (each row of p_y_given_x).
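To make the symbolic expression concrete, here is a small NumPy sketch (the values are made up for illustration, not from the original post) that computes the same zero-one loss on a toy batch of three examples:

import numpy as np

# toy predicted class probabilities: 3 examples (rows) x 3 classes (columns)
p_y_given_x = np.array([[0.7, 0.2, 0.1],
                        [0.1, 0.8, 0.1],
                        [0.3, 0.3, 0.4]])
y = np.array([0, 2, 2])                  # true labels

f_x = np.argmax(p_y_given_x, axis=1)     # predicted labels: [0, 1, 2]
zero_one_loss = np.sum(f_x != y)         # the second example is misclassified
print(zero_one_loss)                     # -> 1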
2. Negative log-likelihood loss
 

Since the zero-one loss is not differentiable, optimizing it directly in a large model (with thousands or millions of parameters) is prohibitively expensive. Instead we maximize the log-likelihood of the correct labels (the likelihood is simply how probable the observed labels are under the model):

$$\mathcal{L}(\theta, \mathcal{D}) = \sum_{i=0}^{|\mathcal{D}|} \log P(Y = y^{(i)} \mid x^{(i)}, \theta)$$

which is equivalent to minimizing the negative log-likelihood loss:

$$\mathrm{NLL}(\theta, \mathcal{D}) = -\sum_{i=0}^{|\mathcal{D}|} \log P(Y = y^{(i)} \mid x^{(i)}, \theta)$$
 

The negative log-likelihood (NLL) in Theano:

# NLL is a symbolic variable ; to get the actual value of NLL, this symbolic
# expression has to be compiled into a Theano function (see the Theano
# tutorial for more details)
NLL = -T.sum(T.log(p_y_given_x)[T.arange(y.shape[0]), y])
# note on syntax: T.arange(y.shape[0]) is a vector of integers [0,1,2,...,len(y)-1].
# Indexing a matrix M by the two vectors [0,1,...,K], [a,b,...,k] returns the
# elements M[0,a], M[1,b], ..., M[K,k] as a vector.  Here, we use this
# syntax to retrieve the log-probability of the correct labels, y.
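The fancy-indexing trick described in the last comment can be checked with plain NumPy (again with made-up values, not from the original post): indexing the matrix of log-probabilities with the pair of vectors [0,1,...,K] and y picks out one entry per row, namely the log-probability the model assigns to each example's correct label:

import numpy as np

log_p = np.log(np.array([[0.7, 0.2, 0.1],
                         [0.1, 0.8, 0.1],
                         [0.3, 0.3, 0.4]]))
y = np.array([0, 2, 2])

correct_log_p = log_p[np.arange(y.shape[0]), y]   # [log 0.7, log 0.1, log 0.4]
NLL = -np.sum(correct_log_p)
print(NLL)                                        # ~3.5756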


3. Stochastic Gradient Descent (SGD)


# GRADIENT DESCENT

while True:
    loss = f(params)
    d_loss_wrt_params = ... # compute gradient
    params -= learning_rate * d_loss_wrt_params
    if <stopping condition is met>:
        return params

The above is ordinary (full-batch) gradient descent; the basic loop is: compute the loss → compute the gradient → update the parameters.
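As a concrete illustration (a toy example, not from the original post), the same loop applied to the one-dimensional loss f(p) = (p - 3)^2, whose gradient is 2(p - 3):

# plain gradient descent on the toy loss f(p) = (p - 3)^2
param = 0.0
learning_rate = 0.1

while True:
    loss = (param - 3.0) ** 2               # compute the loss
    grad = 2.0 * (param - 3.0)              # compute the gradient
    param -= learning_rate * grad           # update the parameter
    if abs(grad) < 1e-6:                    # stop once the gradient is (almost) zero
        break

print(param)                                # converges to ~3.0, the minimizer of the loss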

Stochastic gradient descent works like ordinary gradient descent, but estimates the gradient from only a few examples at a time instead of the full training set. In its simplest form it uses a single example per update:


# STOCHASTIC GRADIENT DESCENT

for (x_i, y_i) in training_set:
                            # imagine an infinite generator
                            # that may repeat examples (if there is only a finite training set)
    loss = f(params, x_i, y_i)
    d_loss_wrt_params = ... # compute gradient
    params -= learning_rate * d_loss_wrt_params
    if <stopping condition is met>:
        return params

4. Minibatch SGD: the same as SGD, except that each update is computed from a minibatch of several examples rather than a single one.

# MINIBATCH STOCHASTIC GRADIENT DESCENT

for (x_batch, y_batch) in train_batches:
                            # imagine an infinite generator
                            # that may repeat examples
    loss = f(params, x_batch, y_batch)
    d_loss_wrt_params = ... # compute gradient using theano
    params -= learning_rate * d_loss_wrt_params
    if <stopping condition is met>:
        return params
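Putting the pieces together, here is a hedged sketch (my own, not from the original post) of what the minibatch loop can look like in actual Theano code. It assumes a hypothetical softmax model with made-up parameter names and sizes (W, b, n_in, n_out); T.grad derives d_loss_wrt_params symbolically, and the parameter update is passed to theano.function as an updates list, so each call performs one minibatch step:

import numpy
import theano
import theano.tensor as T

x = T.matrix('x')                      # a minibatch of inputs, one example per row
y = T.ivector('y')                     # the corresponding integer labels

n_in, n_out = 4, 3                     # toy sizes, for illustration only
W = theano.shared(numpy.zeros((n_in, n_out), dtype=theano.config.floatX), name='W')
b = theano.shared(numpy.zeros(n_out, dtype=theano.config.floatX), name='b')

p_y_given_x = T.nnet.softmax(T.dot(x, W) + b)
loss = -T.mean(T.log(p_y_given_x)[T.arange(y.shape[0]), y])   # mean NLL over the minibatch

learning_rate = 0.1
g_W, g_b = T.grad(loss, [W, b])        # symbolic gradients of the loss w.r.t. the parameters
updates = [(W, W - learning_rate * g_W),
           (b, b - learning_rate * g_b)]

# each call returns the minibatch loss and applies one gradient-descent update
train_step = theano.function([x, y], loss, updates=updates)

rng = numpy.random.RandomState(0)
for i in range(100):                   # stands in for "for (x_batch, y_batch) in train_batches"
    x_batch = rng.randn(20, n_in).astype(theano.config.floatX)
    y_batch = (x_batch[:, 0] > 0).astype('int32')   # toy labels for illustration
    print(train_step(x_batch, y_batch))             # the loss should decrease over time

Expressing the update as an updates list lets Theano modify the shared parameters in place every time the compiled function is called, so the training loop itself stays a plain Python for-loop.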




