
Applying SqueezeNet to Faster RCNN for Object Detection + OHEM

2016-12-17 22:01

Contents

I. Introduction to SqueezeNet
MOTIVATION
FIRE MODULE
ARCHITECTURE
EVALUATION
II. Combining SqueezeNet with Faster RCNN
III. SqueezeNet + Faster RCNN + OHEM
Original link

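I. Introduction to SqueezeNet

The paper was submitted to ICLR 2017.

Paper: https://arxiv.org/abs/1602.07360

Code: https://github.com/DeepScale/SqueezeNet

Note: the repository only releases the prototxt files and the trained caffemodel. Since the whole network is built on Caffe, these two are enough.

This post only gives a brief overview of the paper; see the paper itself for the details.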

MOTIVATION

At the same accuracy, a model with fewer parameters has three advantages:

More efficient distributed training

Less overhead when exporting new models to clients

Feasible FPGA and embedded deployment

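That is: more efficient distributed training, easier model replacement on clients, and easier deployment on FPGAs and embedded devices.

With this in mind, the authors propose three strategies: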

Replace 3x3 filters with 1x1 filters.

Decrease the number of input channels to 3x3 filters.

Downsample late in the network so that convolution layers have large activation maps.



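Replace 3x3 kernels with 1x1 kernels, because a 1x1 kernel has 1/9 the parameters of a 3x3 kernel;

Reduce the number of input channels to the 3x3 kernels, because the parameter count of a layer is determined by the number of input channels, the number of kernels, and the kernel size. Reducing the number of 3x3 kernels alone is therefore not enough; the number of input channels has to go down as well. In the paper, the authors use the squeeze layer for this purpose;

Move the pooling layers later in the network to get larger feature maps. The authors argue that pooling with a large stride early in the network shrinks all later feature maps, whereas larger feature maps give higher accuracy.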
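The first two strategies come down to simple arithmetic on the parameter count of a convolution layer (input channels x number of kernels x kernel area). The following is a minimal sketch of that arithmetic; the channel counts (256 and 32) are hypothetical values picked only for illustration and are not taken from the paper or the prototxt.

```python
def conv_params(in_channels, num_filters, kernel_size, bias=False):
    """Learnable parameters of a 2D convolution layer."""
    weights = in_channels * num_filters * kernel_size * kernel_size
    return weights + (num_filters if bias else 0)

# Hypothetical layer sizes, for illustration only.
p3 = conv_params(256, 256, 3)   # plain 3x3 layer
p1 = conv_params(256, 256, 1)   # same layer with 1x1 kernels (strategy 1)

# Strategy 2: squeeze 256 -> 32 channels with 1x1 kernels first,
# then run the 3x3 kernels on only 32 input channels.
squeezed = conv_params(256, 32, 1) + conv_params(32, 256, 3)

print("3x3:", p3, " 1x1:", p1, " ratio:", p3 // p1)
print("squeeze + 3x3:", squeezed)
```

With these made-up sizes the squeezed variant needs roughly one seventh of the weights of the plain 3x3 layer.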

FIRE MODULE

Following these ideas, the authors propose the Fire Module, with the structure below:



ARCHITECTURE



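The paper also describes the construction details of SqueezeNet:

So that the feature map produced by the 3x3 kernels has the same size as the one produced by the 1x1 kernels, padding is set to 1 for the 3x3 kernels (mainly for the concat)

The squeeze layer and expand layer are each followed by a ReLU activation

Dropout with a ratio of 0.5 follows fire9

Fully connected layers are removed, following the NIN design

Training uses a polynomial learning-rate policy (I changed it to the step policy when using the network for detection)

Caffe does not support a single convolution layer that contains both 1x1 and 3x3 kernels, so the expand layer is built from two separate convolution layers whose outputs are concatenated along the channel dimension. This is mathematically equivalent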
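The equivalence claimed in the last point can be checked numerically. The sketch below is not Caffe code: it uses a naive NumPy convolution written just for this check, and the channel counts (16 in, 64 + 64 out) are assumed values. It compares "two layers plus concat" against one 3x3 layer in which the 1x1 kernels are embedded at the center of zero-filled 3x3 kernels.

```python
import numpy as np

def conv2d(x, w, pad):
    """Naive stride-1 cross-correlation (the operation Caffe calls convolution).
    x: (C_in, H, W), w: (C_out, C_in, k, k) -> (C_out, H_out, W_out)."""
    c_out, _, k, _ = w.shape
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)), mode="constant")
    h_out = xp.shape[1] - k + 1
    w_out = xp.shape[2] - k + 1
    out = np.zeros((c_out, h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = xp[:, i:i + k, j:j + k]  # (C_in, k, k)
            out[:, i, j] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8, 8))        # assumed squeeze output: 16 channels
w1 = rng.standard_normal((64, 16, 1, 1))   # expand 1x1 kernels
w3 = rng.standard_normal((64, 16, 3, 3))   # expand 3x3 kernels

# Two separate layers (pad 0 for 1x1, pad 1 for 3x3), then concat on channels.
fire_out = np.concatenate([conv2d(x, w1, pad=0), conv2d(x, w3, pad=1)], axis=0)

# One 3x3 layer: each 1x1 kernel sits at the center of an all-zero 3x3 kernel.
w1_as_3 = np.zeros((64, 16, 3, 3))
w1_as_3[:, :, 1, 1] = w1[:, :, 0, 0]
single = conv2d(x, np.concatenate([w1_as_3, w3], axis=0), pad=1)

print(fire_out.shape)
print(np.allclose(fire_out, single))
```

The padding of 1 on the 3x3 branch is what keeps both branches at 8x8, so the concat is well defined.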

EVALUATION



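II. Combining SqueezeNet with Faster RCNN

I first tried alt-opt (alternating optimization) training, but unfortunately the results were very poor and basically unusable. I then switched to end2end training. At first I used the solver that the official Faster RCNN code provides for ZF net end2end training, but unfortunately, after roughly 400 iterations the network produced: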

loss = NAN


Reducing the learning rate to 1/10 of its previous value solved this problem.
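To make the effect of that change concrete, here is a small sketch of how Caffe derives the per-iteration learning rate from base_lr under the "step" and "poly" policies mentioned earlier. The formulas follow Caffe's documented solver policies; the numeric values (base_lr, stepsize, max_iter) are placeholders and are not taken from the solver actually used.

```python
def caffe_lr(policy, base_lr, it, gamma=0.1, stepsize=50000,
             max_iter=70000, power=1.0):
    """Learning rate at iteration `it` for two of Caffe's lr_policy values."""
    if policy == "step":
        # base_lr * gamma ^ floor(it / stepsize)
        return base_lr * gamma ** (it // stepsize)
    if policy == "poly":
        # base_lr * (1 - it / max_iter) ^ power
        return base_lr * (1.0 - it / max_iter) ** power
    raise ValueError("unsupported policy: " + policy)

# Placeholder values: an original base_lr and the same value divided by 10.
for base_lr in (0.001, 0.0001):
    print(base_lr,
          caffe_lr("step", base_lr, 400),
          caffe_lr("step", base_lr, 60000),
          caffe_lr("poly", base_lr, 35000))
```

Under either policy the whole schedule scales linearly with base_lr, so dividing base_lr by 10 makes every step 10 times smaller, including the early iterations where the NaN appeared.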

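Here is the prototxt directly. The beginning is unchanged; only the conv1-conv5 part of ZF net needs to be replaced, and fc6-fc7 are replaced with the convolution layer from SqueezeNet.

The prototxt is too long to show in full, so only the beginning and the end of each part are given: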

name: "Alex_Squeeze_v1.1"
layer {
name: 'input-data'
type: 'Python'
top: 'data'
top: 'im_info'
top: 'gt_boxes'
python_param {
module: 'roi_data_layer.layer'
layer: 'RoIDataLayer'
param_str: "'num_classes': 4"
}
}

layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
num_output: 64
kernel_size: 3
stride: 2
}
}
.
.
.
layer {
name: "drop9"
type: "Dropout"
bottom: "fire9/concat"
top: "fire9/concat"
dropout_param {
dropout_ratio: 0.5
}
}

#========= RPN ============

layer {
name: "rpn_conv/3x3"
type: "Convolution"
bottom: "fire9/concat"
top: "rpn/output"
param { lr_mult: 1.0 }
param { lr_mult: 2.0 }
convolution_param {
num_output: 256
kernel_size: 3 pad: 1 stride: 1
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
.
.
.
layer {
name: 'roi-data'
type: 'Python'
bottom: 'rpn_rois'
bottom: 'gt_boxes'
top: 'rois'
top: 'labels'
top: 'bbox_targets'
top: 'bbox_inside_weights'
top: 'bbox_outside_weights'
python_param {
module: 'rpn.proposal_target_layer'
layer: 'ProposalTargetLayer'
param_str: "'num_classes': 4"
}
}

#===================== RCNN =============

layer {
name: "roi_pool5"
type: "ROIPooling"
bottom: "fire9/concat"
bottom: "rois"
top: "roi_pool5"
roi_pooling_param {
pooled_w: 7
pooled_h: 7
spatial_scale: 0.0625 # 1/16
}
}

layer {
name: "conv1_last"
type: "Convolution"
bottom: "roi_pool5"
top: "conv1_last"
param { lr_mult: 1.0 }
param { lr_mult: 1.0 }
convolution_param {
num_output: 1000
kernel_size: 1
weight_filler {
type: "gaussian"
mean: 0.0
std: 0.01
}
}
}
layer {
name: "relu/conv1_last"
type: "ReLU"
bottom: "conv1_last"
top: "relu/conv1_last"
}

layer {
name: "cls_score"
type: "InnerProduct"
bottom: "relu/conv1_last"
top: "cls_score"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 4  # one score per class; must match 'num_classes': 4 above
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "bbox_pred"
type: "InnerProduct"
bottom: "relu/conv1_last"
top: "bbox_pred"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 16  # 4 box coordinates per class; must match 'num_classes': 4 above
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss_cls"
type: "SoftmaxWithLoss"
bottom: "cls_score"
bottom: "labels"
propagate_down: 1
propagate_down: 0
top: "loss_cls"
loss_weight: 1
}
layer {
name: "loss_bbox"
type: "SmoothL1Loss"
bottom: "bbox_pred"
bottom: "bbox_targets"
bottom: "bbox_inside_weights"
bottom: "bbox_outside_weights"
top: "loss_bbox"
loss_weight: 1
}
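A few numbers in a listing like this have to agree with each other, and they are easy to get wrong when changing the number of classes or the backbone. The sketch below states the relationships as used in py-faster-rcnn style heads (num_classes includes the background class). The stride list is an assumption about SqueezeNet v1.1 (conv1 plus three stride-2 pooling layers); only conv1 is visible in the excerpt.

```python
def head_sizes(num_classes):
    """Expected num_output of the two RCNN heads; num_classes includes background."""
    return {"cls_score": num_classes, "bbox_pred": 4 * num_classes}

def spatial_scale(strides):
    """ROIPooling spatial_scale = 1 / (product of all backbone strides)."""
    total = 1
    for s in strides:
        total *= s
    return 1.0 / total

print(head_sizes(4))

# Assumed strides before fire9/concat: conv1, pool1, pool3, pool5.
print(spatial_scale([2, 2, 2, 2]))
```

The same product of strides is what the proposal layer expects as 'feat_stride'.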


The structure of the latter part is shown in the figure:



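Note the part circled in red: the former fc layers are replaced with the convolution layer from SqueezeNet, which greatly reduces the number of network parameters. Because I changed the ratios and the number of proposal anchors in the RPN, using 70 anchor choices in total, the final trained model is 17M. That is much larger than the 4.8M model used for initialization, but still very small.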
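The number of anchors fixes the channel counts of the RPN output layers: in py-faster-rcnn style networks there are 2 scores and 4 box offsets per anchor. A rough sketch of that bookkeeping follows; the concrete ratio and scale lists are hypothetical (any combination whose product is 70 gives the same channel counts) and are not the values used for the model above.

```python
def rpn_channels(ratios, scales):
    """Channel counts of the RPN heads for a given anchor configuration."""
    num_anchors = len(ratios) * len(scales)
    return {
        "num_anchors": num_anchors,
        "rpn_cls_score": 2 * num_anchors,   # background / foreground per anchor
        "rpn_bbox_pred": 4 * num_anchors,   # dx, dy, dw, dh per anchor
    }

# Default py-faster-rcnn setup: 3 ratios x 3 scales.
print(rpn_channels([0.5, 1, 2], [8, 16, 32]))

# Hypothetical 7 ratios x 10 scales = 70 anchors.
ratios = [0.25, 0.5, 0.75, 1.0, 1.5, 2.0, 3.0]
scales = [1, 2, 3, 4, 6, 8, 12, 16, 24, 32]
print(rpn_channels(ratios, scales))
```

With 70 anchors the second dimension of the rpn_cls_prob reshape becomes 140.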

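III. SqueezeNet + Faster RCNN + OHEM

OHEM simply adds a readonly part, but the results are much better with it. As above, only part of the prototxt is shown; the rest can be filled in yourself. The listing starts from the RPN, and everything before it is exactly the same as given above.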

#====== RoI Proposal ====================
layer {
name: "rpn_cls_prob"
type: "Softmax"
bottom: "rpn_cls_score_reshape"
top: "rpn_cls_prob"
}
layer {
name: 'rpn_cls_prob_reshape'
type: 'Reshape'
bottom: 'rpn_cls_prob'
top: 'rpn_cls_prob_reshape'
reshape_param { shape { dim: 0 dim: 140 dim: -1 dim: 0 } }
}
layer {
name: 'proposal'
type: 'Python'
bottom: 'rpn_cls_prob_reshape'
bottom: 'rpn_bbox_pred'
bottom: 'im_info'
top: 'rpn_rois'
python_param {
module: 'rpn.proposal_layer'
layer: 'ProposalLayer'
param_str: "'feat_stride': 16"
}
}
layer {
name: 'roi-data'
type: 'Python'
bottom: 'rpn_rois'
bottom: 'gt_boxes'
top: 'rois'
top: 'labels'
top: 'bbox_targets'
top: 'bbox_inside_weights'
top: 'bbox_outside_weights'
python_param {
module: 'rpn.proposal_target_layer'
layer: 'ProposalTargetLayer'
param_str: "'num_classes': 4"
}
}
##########################
## Readonly RoI Network ##
######### Start ##########
layer {
name: "roi_pool5_readonly"
type: "ROIPooling"
bottom: "fire9/concat"
bottom: "rois"
top: "pool5_readonly"
propagate_down: false
propagate_down: false
roi_pooling_param {
pooled_w: 7  # must match roi_pool5 so the shared cls_score/bbox_pred weights have the same shape
pooled_h: 7
spatial_scale: 0.0625 # 1/16
}
}
layer {
name: "conv1_last_readonly"
type: "Convolution"
bottom: "pool5_readonly"
top: "conv1_last_readonly"
propagate_down: false
param {
name: "conv1_last_w"
}
param {
name: "conv1_last_b"
}
convolution_param {
num_output: 1000
kernel_size: 1
weight_filler {
type: "gaussian"
mean: 0.0
std: 0.01
}
}
}
layer {
name: "relu/conv1_last_readonly"
type: "ReLU"
bottom: "conv1_last_readonly"
top: "relu/conv1_last_readonly"
propagate_down: false
}
layer {
name: "cls_score_readonly"
type: "InnerProduct"
bottom: "relu/conv1_last_readonly"
top: "cls_score_readonly"
propagate_down: false
param {
name: "cls_score_w"
}
param {
name: "cls_score_b"
}
inner_product_param {
num_output: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "bbox_pred_readonly"
type: "InnerProduct"
bottom: "relu/conv1_last_readonly"
top: "bbox_pred_readonly"
propagate_down: false
param {
name: "bbox_pred_w"
}
param {
name: "bbox_pred_b"
}
inner_product_param {
num_output: 16
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "cls_prob_readonly"
type: "Softmax"
bottom: "cls_score_readonly"
top: "cls_prob_readonly"
propagate_down: false
}
layer {
name: "hard_roi_mining"
type: "Python"
bottom: "cls_prob_readonly"
bottom: "bbox_pred_readonly"
bottom: "rois"
bottom: "labels"
bottom: "bbox_targets"
bottom: "bbox_inside_weights"
bottom: "bbox_outside_weights"
top: "rois_hard"
top: "labels_hard"
top: "bbox_targets_hard"
top: "bbox_inside_weights_hard"
top: "bbox_outside_weights_hard"
propagate_down: false
propagate_down: false
propagate_down: false
propagate_down: false
propagate_down: false
propagate_down: false
propagate_down: false
python_param {
module: "roi_data_layer.layer"
layer: "OHEMDataLayer"
param_str: "'num_classes': 4"
}
}
########## End ###########
## Readonly RoI Network ##
##########################
#===================== RCNN =============
layer {
name: "roi_pool5"
type: "ROIPooling"
bottom: "fire9/concat"
bottom: "rois_hard"
top: "roi_pool5"
propagate_down: true
propagate_down: false
roi_pooling_param {
pooled_w: 7
pooled_h: 7
spatial_scale: 0.0625 # 1/16
}
}
layer {
name: "conv1_last"
type: "Convolution"
bottom: "roi_pool5"
top: "conv1_last"
param {
lr_mult: 1.0
name: "conv1_last_w"
}
param {
lr_mult: 1.0
name: "conv1_last_b"
}
convolution_param {
num_output: 1000
kernel_size: 1
weight_filler {
type: "gaussian"
mean: 0.0
std: 0.01
}
}
}
layer {
name: "relu/conv1_last"
type: "ReLU"
bottom: "conv1_last"
top: "relu/conv1_last"
}
layer {
name: "cls_score"
type: "InnerProduct"
bottom: "relu/conv1_last"
top: "cls_score"
param {
lr_mult: 1
name: "cls_score_w"
}
param {
lr_mult: 2
name: "cls_score_b"
}
inner_product_param {
num_output: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "bbox_pred"
type: "InnerProduct"
bottom: "relu/conv1_last"
top: "bbox_pred"
param {
lr_mult: 1
name: "bbox_pred_w"
}
param {
lr_mult: 2
name: "bbox_pred_b"
}
inner_product_param {
num_output: 16
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss_cls"
type: "SoftmaxWithLoss"
bottom: "cls_score"
bottom: "labels_hard"
propagate_down: true
propagate_down: false
top: "loss_cls"
loss_weight: 1
}
layer {
name: "loss_bbox"
type: "SmoothL1Loss"
bottom: "bbox_pred"
bottom: "bbox_targets_hard"
bottom: "bbox_inside_weights_hard"
bottom: "bbox_outside_weights_hard"
top: "loss_bbox"
loss_weight: 1
propagate_down: false
propagate_down: false
propagate_down: false
propagate_down: false
}


The structure diagram is as follows:



Compared with the previous training setup there is one extra readonly part. For details, see the paper:

Training Region-based Object Detectors with Online Hard Example Mining

https://arxiv.org/abs/1604.03540
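The core idea of the readonly branch can be sketched in a few lines: score every RoI with the current weights, compute a per-RoI loss, and pass only the highest-loss RoIs to the trainable branch. The NumPy code below is a simplified illustration under stated assumptions, not the OHEMDataLayer implementation: the function names are made up here, and the NMS-based de-duplication of overlapping RoIs described in the paper is left out.

```python
import numpy as np

def smooth_l1(x):
    """Smooth L1 loss, elementwise, with the transition at |x| = 1."""
    ax = np.abs(x)
    return np.where(ax < 1.0, 0.5 * x * x, ax - 0.5)

def select_hard_rois(cls_prob, labels, bbox_pred, bbox_targets,
                     inside_weights, batch_size):
    """Indices of the `batch_size` RoIs with the largest total loss."""
    n = len(labels)
    cls_loss = -np.log(cls_prob[np.arange(n), labels] + 1e-12)
    box_loss = smooth_l1(inside_weights * (bbox_pred - bbox_targets)).sum(axis=1)
    order = np.argsort(-(cls_loss + box_loss))   # highest loss first
    return order[:batch_size]

# Toy example: 4 RoIs, 2 classes, box losses all zero.
cls_prob = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.5, 0.5]])
labels = np.array([0, 0, 1, 1])
zeros = np.zeros((4, 8))
hard = select_hard_rois(cls_prob, labels, zeros, zeros, np.ones((4, 8)), 2)
print(hard)  # -> [1 2]
```

Only the selected indices are used to gather rois_hard, labels_hard and the corresponding box targets, which is why no gradient needs to flow through the readonly layers.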



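That concludes the introduction to the SqueezeNet + Faster RCNN framework. It runs roughly 5 times as fast as ZF on a GPU and roughly 2.5 times as fast on a CPU.

Original link: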

http://blog.csdn.net/u011956147/article/details/53714616