Paper Reproduction Report: Deep Region and Multi-label Learning for Facial Action Unit Detection
2017-12-20 19:47
This was my first time running Caffe code, purely as practice for myself, so I am jotting down some notes and findings along the way. This reproduction follows http://blog.csdn.net/u011668104/article/details/77412332 (thanks to the original author for sharing). Strictly speaking, I ran Caffe three times.
1. First run: labels in {-1, 1}, i.e. every positive sample is labeled 1 and every negative sample -1. The results:
Figure 1: loss curve with labels {-1, 1}
And the corresponding F1 scores:
Figure 2: F1 scores with labels {-1, 1}
The final loss oscillates around 3.4, and the average F1 score is only about 0.44.
2. Second run: I changed the labels from {-1, 1} to {0, 1}. Since the loss is defined through a sigmoid, whose output lies in (0, 1), I reasoned that {0, 1} labels might be a better fit, and they also allow using Caffe's built-in sigmoid cross-entropy loss directly. The results:
Figure 3: loss curve with labels {0, 1}
Figure 4: F1 scores with labels {0, 1}
The F1 score rose slightly, but as labmates hqp and mcn later pointed out, the {-1, 1} and {0, 1} labelings are in fact equivalent. Here is why.
For labels {0, 1}, the loss in the code is:
for (int i = 0; i < count; ++i) {
  loss -= input_data[i] * (target[i] - (input_data[i] >= 0)) -
      log(1 + exp(input_data[i] - 2 * input_data[i] * (input_data[i] >= 0)));
}
For labels {-1, 1}, the loss in the code is:
for (int i = 0; i < count; ++i) {
  if (target[i] != 0) {
    loss -= input_data[i] * ((target[i] > 0) - (input_data[i] >= 0)) -
        log(1 + exp(input_data[i] - 2 * input_data[i] * (input_data[i] >= 0)));
  }
}
For any input_data[i] the two expressions compute the same value: target[i] in the {0, 1} version and (target[i] > 0) in the {-1, 1} version agree for corresponding labels (1 maps to 1, and -1 maps to 0), so the choice of label convention does not change the loss. All that matters is that the loss reaches 0 exactly when the sigmoid output matches the label.
So if {-1, 1} and {0, 1} are essentially the same, why do the authors use {-1, 1}? The point is to free up 0 as an extra "ignore" label for balancing the positive/negative sample ratio:
Figure 5: positive/negative ratio for each AU
3. Third run: I balanced the labels from the first run so that 1 and -1 occur equally often, by randomly changing surplus 1 or -1 labels to 0. The results:
Figure 6: loss curve with labels {-1, 0, 1}
Figure 7: F1 scores with labels {-1, 0, 1}
The loss oscillates around 2.5 and the F1 score improved as well, so balancing the labels really does help.
One thing worth mentioning: on the third run the loss initially diverged, and it turned out the loss layer in train.prototxt was wrong. Even though I had moved Caffe's built-in cross-entropy layer out of the folder, I had assumed the type field maps to the hpp file's virtual inline const char* type() const { return "SigmoidCrossEntropyLoss"; }. In fact it does not; it maps to the name registered in the cpp file by REGISTER_LAYER_CLASS(MultiSigmoidCrossEntropyLoss);.
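In other words, the type string in train.prototxt must match the name passed to REGISTER_LAYER_CLASS. Assuming the custom layer is registered as MultiSigmoidCrossEntropyLoss, the loss layer definition would look roughly like the fragment below (the name and the bottom/top blob names are my own placeholders):

```protobuf
layer {
  name: "au_loss"
  type: "MultiSigmoidCrossEntropyLoss"  # must match the REGISTER_LAYER_CLASS name
  bottom: "fc_out"                      # network predictions, one score per AU
  bottom: "label"                       # multi-label targets in {-1, 0, 1}
  top: "loss"
}
```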