SqueezeNet运用到Faster RCNN进行目标检测+OHEM
2016-12-17 22:01
435 查看
目录
目录一SqueezeNet介绍
MOTIVATION
FIRE MODULE
ARCHITECTURE
EVALUATION
二SqueezeNet与Faster RCNN结合
三SqueezeNetFaster RCNNOHEM
原文链接
一、SqueezeNet介绍
论文提交ICLR 2017论文地址:https://arxiv.org/abs/1602.07360
代码地址:https://github.com/DeepScale/SqueezeNet
注:代码只放出了prototxt文件和训练好的caffemodel,因为整个网络都是基于caffe的,有这两样东西就足够了。
在这里只是简要的介绍文章的内容,具体细节的东西可以自行翻阅论文。
MOTIVATION
在相同的精度下,模型参数更少有3个好处:More efficient distributed training
Less overhead when exporting new models to clients
Feasible FPGA and embedded deployment
即 高效的分布式训练、更容易替换模型、更方便FPGA和嵌入式部署。
鉴于此,提出3种策略:
Replace 3x3 filters with 1x1 filters.
Decrease the number of input channels to 3x3 filters.
Downsample late in the network so that convolution layers have large activation maps.
即
使用1x1的核替换3x3的核,因为1x1核参数是3x3的1/9;
输入通道减少3x3核的数量,因为参数的数量由输入通道数、卷积核数、卷积核的大小决定。因此,减少1x1的核数量还不够,还需要减少输入通道数量,在文中,作者使用squeeze layer来达到这一目的;
后移池化层,得到更大的feature map。作者认为在网络的前段使用大的步长进行池化,后面的feature map将会减小,而大的feature map会有较高的准确率。
FIRE MODULE
由上面的思路,作者提出了Fire Module,结构如下:ARCHITECTURE
关于SqueezeNet的构建细节在文中也有详细的描述
为了3x3的核输出的feature map和1x1的大小相同,padding取1(主要是为了concat)
squeezelayer和expandlayer后面跟ReLU激活函数
Dropout比例为0.5,跟在fire9后面
取消全连接,参考NIN结构
训练过程采用多项式学习率(我用来做检测时改为了step策略)
由于caffe不支持同一个卷积层既有1x1,又有3x3,所以需要concat,将两个分辨率的图在channel维度concat。这在数学上是等价的
EVALUATION
二、SqueezeNet与Faster RCNN结合
这里,我首先尝试的是使用alt-opt,但是很遗憾的是,出来的结果很糟糕,基本不能用,后来改为使用end2end,在最开始的时候,采用的就是faster rcnn官方提供的zfnet end2end训练的solvers,又很不幸的是,在网络运行大概400步后出现:loss = NAN
遇到这个问题,把学习率改为以前的1/10,解决。
直接上prototxt文件,前面都是一样的,只需要改动zfnet中的conv1-con5部分,外加把fc6-fc7改成squeeze中的卷积链接。
prototxt太长,给出每个部分的前面和后面部分:
name: "Alex_Squeeze_v1.1" layer { name: 'input-data' type: 'Python' top: 'data' top: 'im_info' top: 'gt_boxes' python_param { module: 'roi_data_layer.layer' layer: 'RoIDataLayer' param_str: "'num_classes': 4" } } layer { name: "conv1" type: "Convolution" bottom: "data" top: "conv1" convolution_param { num_output: 64 kernel_size: 3 stride: 2 } } . . . layer { name: "drop9" type: "Dropout" bottom: "fire9/concat" top: "fire9/concat" dropout_param { dropout_ratio: 0.5 } } #========= RPN ============ layer { name: "rpn_conv/3x3" type: "Convolution" bottom: "fire9/concat" top: "rpn/output" param { lr_mult: 1.0 } param { lr_mult: 2.0 } convolution_param { num_output: 256 kernel_size: 3 pad: 1 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } . . . layer { name: "drop9" type: "Dropout" bottom: "fire9/concat" top: "fire9/concat" dropout_param { dropout_ratio: 0.5 } } #========= RPN ============ layer { name: "rpn_conv/3x3" type: "Convolution" bottom: "fire9/concat" top: "rpn/output" param { lr_mult: 1.0 } param { lr_mult: 2.0 } convolution_param { num_output: 256 kernel_size: 3 pad: 1 stride: 1 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } . . . layer { name: 'roi-data' type: 'Python' bottom: 'rpn_rois' bottom: 'gt_boxes' top: 'rois' top: 'labels' top: 'bbox_targets' top: 'bbox_inside_weights' top: 'bbox_outside_weights' python_param { module: 'rpn.proposal_target_layer' layer: 'ProposalTargetLayer' param_str: "'num_classes': 4" } } #===================== RCNN ============= layer { name: "roi_pool5" type: "ROIPooling" bottom: "fire9/concat" bottom: "rois" top: "roi_pool5" roi_pooling_param { pooled_w: 7 pooled_h: 7 spatial_scale: 0.0625 # 1/16 } } layer { name: "conv1_last" type: "Convolution" bottom: "roi_pool5" top: "conv1_last" param { lr_mult: 1.0 } param { lr_mult: 1.0 } convolution_param { num_output: 1000 kernel_size: 1 weight_filler { type: "gaussian" mean: 0.0 std: 0.01 } } } layer { name: "relu/conv1_last" type: "ReLU" bottom: "conv1_last" top: "relu/conv1_last" } layer { name: "cls_score" type: "InnerProduct" bottom: "relu/conv1_last" top: "cls_score" param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 5 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "bbox_pred" type: "InnerProduct" bottom: "relu/conv1_last" top: "bbox_pred" param { lr_mult: 1 } param { lr_mult: 2 } inner_product_param { num_output: 20 weight_filler { type: "gaussian" std: 0.001 } bias_filler { type: "constant" value: 0 } } } layer { name: "loss_cls" type: "SoftmaxWithLoss" bottom: "cls_score" bottom: "labels" propagate_down: 1 propagate_down: 0 top: "loss_cls" loss_weight: 1 } layer { name: "loss_bbox" type: "SmoothL1Loss" bottom: "bbox_pred" bottom: "bbox_targets" bottom: "bbox_inside_weights" bottom: "bbox_outside_weights" top: "loss_bbox" loss_weight: 1 }
后面一部分的结构如图:
注意红圈部分,以前的fc换成了squ中的卷积层,这样网络参数大大减少,因为我改动了rpn部分选proposal的比例和数量,共采用改了70种选择,所以最后训练出来的模型为17M,比初始化4.8M大很多,不过也已经很小了。
三、SqueezeNet+Faster RCNN+OHEM
OHEM无非就是多了一个readonly部分,不过加上之后效果会好很多,和上面的方式一致,放出一部分prototxt,其他的课自行补上。从rpn那里开始,前面部分和上面给出的完全一样#====== RoI Proposal ==================== layer { name: "rpn_cls_prob" type: "Softmax" bottom: "rpn_cls_score_reshape" top: "rpn_cls_prob" } layer { name: 'rpn_cls_prob_reshape' type: 'Reshape' bottom: 'rpn_cls_prob' top: 'rpn_cls_prob_reshape' reshape_param { shape { dim: 0 dim: 140 dim: -1 dim: 0 } } } layer { name: 'proposal' type: 'Python' bottom: 'rpn_cls_prob_reshape' bottom: 'rpn_bbox_pred' bottom: 'im_info' top: 'rpn_rois' python_param { module: 'rpn.proposal_layer' layer: 'ProposalLayer' param_str: "'feat_stride': 16" } } layer { name: 'roi-data' type: 'Python' bottom: 'rpn_rois' bottom: 'gt_boxes' top: 'rois' top: 'labels' top: 'bbox_targets' top: 'bbox_inside_weights' top: 'bbox_outside_weights' python_param { module: 'rpn.proposal_target_layer' layer: 'ProposalTargetLayer' param_str: "'num_classes': 4" } } ########################## ## Readonly RoI Network ## ######### Start ########## layer { name: "roi_pool5_readonly" type: "ROIPooling" bottom: "fire9/concat" bottom: "rois" top: "pool5_readonly" propagate_down: false propagate_down: false roi_pooling_param { pooled_w: 6 pooled_h: 6 spatial_scale: 0.0625 # 1/16 } } layer { name: "conv1_last_readonly" type: "Convolution" bottom: "pool5_readonly" top: "conv1_last_readonly" propagate_down: false param { name: "conv1_last_w" } param { name: "conv1_last_b" } convolution_param { num_output: 1000 kernel_size: 1 weight_filler { type: "gaussian" mean: 0.0 std: 0.01 } } } layer { name: "relu/conv1_last_readonly" type: "ReLU" bottom: "conv1_last_readonly" top: "relu/conv1_last_readonly" propagate_down: false } layer { name: "cls_score_readonly" type: "InnerProduct" bottom: "relu/conv1_last_readonly" top: "cls_score_readonly" propagate_down: false param { name: "cls_score_w" } param { name: "cls_score_b" } inner_product_param { num_output: 4 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "bbox_pred_readonly" type: "InnerProduct" bottom: "relu/conv1_last_readonly" top: "bbox_pred_readonly" propagate_down: false param { name: "bbox_pred_w" } param { name: "bbox_pred_b" } inner_product_param { num_output: 16 weight_filler { type: "gaussian" std: 0.001 } bias_filler { type: "constant" value: 0 } } } layer { name: "cls_prob_readonly" type: "Softmax" bottom: "cls_score_readonly" top: "cls_prob_readonly" propagate_down: false } layer { name: "hard_roi_mining" type: "Python" bottom: "cls_prob_readonly" bottom: "bbox_pred_readonly" bottom: "rois" bottom: "labels" bottom: "bbox_targets" bottom: "bbox_inside_weights" bottom: "bbox_outside_weights" top: "rois_hard" top: "labels_hard" top: "bbox_targets_hard" top: "bbox_inside_weights_hard" top: "bbox_outside_weights_hard" propagate_down: false propagate_down: false propagate_down: false propagate_down: false propagate_down: false propagate_down: false propagate_down: false python_param { module: "roi_data_layer.layer" layer: "OHEMDataLayer" param_str: "'num_classes': 4" } } ########## End ########### ## Readonly RoI Network ## ########################## #===================== RCNN ============= layer { name: "roi_pool5" type: "ROIPooling" bottom: "fire9/concat" bottom: "rois_hard" top: "roi_pool5" propagate_down: true propagate_down: false roi_pooling_param { pooled_w: 7 pooled_h: 7 spatial_scale: 0.0625 # 1/16 } } layer { name: "conv1_last" type: "Convolution" bottom: "roi_pool5" top: "conv1_last" param { lr_mult: 1.0 name: "conv1_last_w" } param { lr_mult: 1.0 name: "conv1_last_b" } convolution_param { num_output: 1000 kernel_size: 1 weight_filler { type: "gaussian" mean: 0.0 std: 0.01 } } } layer { name: "relu/conv1_last" type: "ReLU" bottom: "conv1_last" top: "relu/conv1_last" } layer { name: "cls_score" type: "InnerProduct" bottom: "relu/conv1_last" top: "cls_score" param { lr_mult: 1 name: "cls_score_w" } param { lr_mult: 2 name: "cls_score_b" } inner_product_param { num_output: 4 weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 0 } } } layer { name: "bbox_pred" type: "InnerProduct" bottom: "relu/conv1_last" top: "bbox_pred" param { lr_mult: 1 name: "bbox_pred_w" } param { lr_mult: 2 name: "bbox_pred_b" } inner_product_param { num_output: 16 weight_filler { type: "gaussian" std: 0.001 } bias_filler { type: "constant" value: 0 } } } layer { name: "loss_cls" type: "SoftmaxWithLoss" bottom: "cls_score" bottom: "labels_hard" propagate_down: true propagate_down: false top: "loss_cls" loss_weight: 1 } layer { name: "loss_bbox" type: "SmoothL1Loss" bottom: "bbox_pred" bottom: "bbox_targets_hard" bottom: "bbox_inside_weights_hard" bottom: "bbox_outside_weights_hard" top: "loss_bbox" loss_weight: 1 propagate_down: false propagate_down: false propagate_down: false propagate_down: false }
结构图如下:
比前面训练的多一个readonly部分,具体可参考论文:
Training Region-based Object Detectors with Online Hard Example Mining
https://arxiv.org/abs/1604.03540
至此,SqueezeNet+Faster RCNN 框架便介绍完了,运行速度在GPU下大概是ZF的5倍,CPU下大概为2。5倍。
原文链接:
http://blog.csdn.net/u011956147/article/details/53714616相关文章推荐
- CUDA搭建
- 稀疏自动编码器 (Sparse Autoencoder)
- 白化(Whitening):PCA vs. ZCA
- softmax回归
- 卷积神经网络初探
- Prisma修图软件的图片风格转换算法
- TensorFlow人工智能引擎入门教程之九 RNN/LSTM循环神经网络长短期记忆网络使用
- TensorFlow人工智能引擎入门教程之十 最强网络 RSNN深度残差网络 平均准确率96-99%
- TensorFlow人工智能入门教程之十一 最强网络DLSTM 双向长短期记忆网络(阿里小AI实现)
- TensorFlow人工智能入门教程之十三 RCNN 区域卷积网络(视频侦测分析人脸侦测区域检测 )
- TensorFlow人工智能入门教程之十四 自动编码机AutoEncoder 网络
- TensorFlow人工智能引擎入门教程所有目录
- 如何用70行代码实现深度神经网络算法
- 近200篇机器学习&深度学习资料分享(含各种文档,视频,源码等)
- 轻量级深度学习框架 Keras简介
- 基于 10 大编程语言的 30 个深度学习库
- 深度学习 - 环境安装
- 深度学习 - 入门