您的位置：首页 > 产品设计 > UI/UE

deeplearning论文学习笔记（2）A critical review of recurrent neural networks for sequence learning

2016-10-26 10:25 931 查看

introduce

这些天在学习循环神经网络RNN，找到了这篇综述来学习下

这篇paper截止到现在google scholar被引用量30

下载地址

1.2 Why not use Markov models?

2.2 Neural networks

For multiclass classication with K alternative classes, we apply a softmax nonlinearity in an output layer of K nodes.

For multilabel classication the activation function is simply a

point-wise sigmoid

For regression we typically have linear output.

对于 multiclass classication和 multilabel classication的区别不太了解，在scikit-learn的官网上找了一段解释

The sklearn.multiclass module implements meta-estimators to solve multiclass and multilabel classification problems by decomposing such problems into binary classification problems. Multitarget regression is also supported.

Multiclass classification means a classification task with more than two classes; e.g., classify a set of images of fruits which may be oranges, apples, or pears. Multiclass classification makes the assumption that each sample is assigned to one and only one label: a fruit can be either an apple or a pear but not both at the same time.

Multilabel classification assigns to each sample a set of target labels. This can be thought as predicting properties of a data-point that are not mutually exclusive, such as topics that are relevant for a document. A text might be about any of religion, politics, finance or education at the same time or none of these.

Multioutput regression assigns each sample a set of target values. This can be thought of as predicting several properties for each data-point, such as wind direction and magnitude at a certain location.

Multioutput-multiclass classification and multi-task classification means that a single estimator has to handle several joint classification tasks. This is both a generalization of the multi-label classification task, which only considers binary classification, as well as a generalization of the multi-class classification task. The output format is a 2d numpy array or sparse matrix.

The set of labels can be different for each output variable. For instance, a sample could be assigned “pear” for an output variable that takes possible values in a finite set of species such as “pear”, “apple”; and “blue” or “green” for a second output variable that takes possible values in a finite set of colors such as “green”, “red”,

“blue”, “yellow”… This means that any classifiers handling multi-output multiclass or multi-task classification tasks, support the multi-label classification task as a special case. Multi-task classification is similar to the multi-output classification task with different model formulations. For more information, see the relevant estimator documentation.

2.3 Feedforward networks and backpropagation

在反向计算的优化算法中，在SGD的基础上，有了很多改进

Many variants of SGD are used to accelerate learning. Some popular heuristics, such as AdaGrad [Duchi et al., 2011], AdaDelta [Zeiler, 2012], and RMSprop [Tieleman and Hinton, 2012], tune the learning rate adaptively for each feature.

这篇博客介绍了一些优化的方法：

各种优化方法总结比较（sgd/momentum/Nesterov/adagrad/adadelta）

同时还有一篇英文的介绍各种优化方法：

An overview of gradient descent optimization algorithms

这篇文章介绍的优化算法更全

内容来自用户分享和网络整理，不保证内容的准确性，如有侵权内容，可联系管理员处理

标签：

相关文章推荐

新的分享

章节导航