[Deep Learning] Review: Rich feature hierarchies for accurate object detection and semantic segmentation
2016-01-25 09:14
Rich feature hierarchies for accurate object detection and semantic segmentation
Source: http://arxiv.org/abs/1311.2524
GitHub: https://github.com/rbgirshick/rcnn
1. Summary of the Paper
Quoted from the paper:
“Object detection system overview. Our system (1) takes an input image, (2) extracts around 2000 bottom-up region proposals, (3) computes features for each proposal using a large convolutional neural network (CNN), and then (4) classifies each region using class-specific linear SVMs. R-CNN achieves a mean average precision (mAP) of 53.7% on PASCAL VOC 2010. For comparison, [39] reports 35.1% mAP using the same region proposals, but with a spatial pyramid and bag-of-visual-words approach. The popular deformable part models perform at 33.4%. On the 200-class ILSVRC2013 detection dataset, R-CNN’s mAP is 31.4%, a large improvement over OverFeat [34], which had the previous best result at 24.3%.”
To sum up, the framework proposed in the paper is simple but effective:
- Replace sliding-window sampling with selective search, extracting about 2,000 region proposals per image.
- Train K linear SVM classifiers (one per class) to score each region, using the features extracted by AlexNet.
- Obtain the final detections by discarding redundant overlapping regions via non-maximum suppression (NMS).
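The last step above, NMS, can be sketched in a few lines. This is a minimal greedy version for a single class, not the authors' implementation: the function names, the (x1, y1, x2, y2) box format, and the default threshold of 0.3 are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.3):
    """Greedy NMS for one class: return indices of kept boxes, best first.

    iou_threshold is an assumed, illustrative value.
    """
    # Visit boxes from the highest score to the lowest.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # prints [0, 2]
```

In the example, the second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the third box does not overlap anything and is kept.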
2. Main Contributions
1) One can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects.
2) When labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost. Combining region proposals with CNNs delivers outstanding performance.
3. Positive and negative points
Positive Points:
(i) Replace traditional sliding-window methods with a region proposal method.
(ii) Use AlexNet to extract features. Each region is warped to 227×227, together with some surrounding context, so that background information is available as a prior.
(iii) Use bounding-box regression to further improve localization accuracy.
(iv) Replace softmax with SVMs: in softmax the background class is shared across all categories, whereas per-class SVMs provide more independent information.
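As an illustration of point (iii), the sketch below shows the box parameterization used for bounding-box regression in the paper: offsets of the center are scaled by the proposal's size, and width and height are adjusted in log space. Boxes are assumed to be (center x, center y, width, height); the function names are hypothetical, and the learned regressor that predicts the deltas from CNN features is omitted.

```python
import math


def regression_targets(proposal, ground_truth):
    """Targets (tx, ty, tw, th) that map a proposal onto its ground-truth box.

    Both boxes are (center_x, center_y, width, height).
    """
    px, py, pw, ph = proposal
    gx, gy, gw, gh = ground_truth
    return (
        (gx - px) / pw,       # scale-invariant shift of the center in x
        (gy - py) / ph,       # scale-invariant shift of the center in y
        math.log(gw / pw),    # log-space change of width
        math.log(gh / ph),    # log-space change of height
    )


def apply_deltas(proposal, deltas):
    """Apply predicted deltas (dx, dy, dw, dh) to a proposal box."""
    px, py, pw, ph = proposal
    dx, dy, dw, dh = deltas
    return (
        pw * dx + px,
        ph * dy + py,
        pw * math.exp(dw),
        ph * math.exp(dh),
    )


proposal = (50.0, 50.0, 20.0, 40.0)
ground_truth = (55.0, 60.0, 30.0, 20.0)
targets = regression_targets(proposal, ground_truth)
refined = apply_deltas(proposal, targets)  # recovers ground_truth up to rounding
```

Applying the exact targets to the proposal recovers the ground-truth box, which is a quick sanity check that the two functions are inverses of each other.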
Negative Points:
(i) .
4. How strong is the evaluation
- Performance layer by layer without fine-tuning. That is, training the SVMs on features from pool5, fc6, or fc7 gives very similar results. The conclusion is that much of the CNN's representational power comes from its convolutional layers.
- Comparison to recent feature learning methods indicates that the CNN features generally outperform them.
- Compared to the Dean et al. paper from Google (CVPR best paper): 16% mAP in 5 minutes, versus 48% here in about 1 minute!
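The mAP figures quoted throughout this review are means of per-class average precision (AP). As a rough sketch of how such a number is obtained, the code below computes AP as the area under the precision-recall curve after making precision monotonically non-increasing. This is one common definition, not necessarily the exact benchmark protocol: it assumes each detection has already been marked as a true or false positive (the IoU-based matching to ground truth is omitted), and all names are hypothetical.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """AP for one class from scored detections already labeled TP/FP."""
    # Rank detections from the most to the least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = 0
    fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Replace each precision by the best precision at equal or higher recall.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum the area of the resulting step function over recall.
    ap = 0.0
    prev_recall = 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap):
    """mAP is the plain mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)
```

For example, three detections ranked TP, FP, TP against two ground-truth objects give an AP of 5/6: the first half of the recall range is covered at precision 1, the second half at precision 2/3.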
5. Possible direction for future work
I wonder whether some method could improve both the speed and the accuracy of object detection without using region proposals. Although region proposals greatly improve efficiency, they inevitably miss some important information.
I hope this could be a possible direction.