[Training and Testing Notes] SSD: Single Shot Detector for Scene Text Detection

2017-12-01 19:20
This post describes how to use the SSD model for scene text detection, with COCO-Text as the example dataset.

Compilation:

1. Error when compiling with CUDA 8

/usr/include/boost/property_tree/detail/json_parser_read.hpp:257:264: error: ‘type name’ declared as function returning an array

Solution: the installed gcc is too old; upgrading to 5.3 resolves the error.

sudo apt-get install software-properties-common

sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt-get update

sudo apt-get install gcc-5 g++-5

cd /usr/bin

sudo rm gcc

sudo ln -s gcc-5 gcc

sudo rm g++

sudo ln -s g++-5 g++

Recompile, and the error is gone.
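Before recompiling, it can help to confirm which gcc the `gcc` symlink now points to. The following is a minimal sketch, assuming `gcc -dumpversion` is available on the PATH; the helper names `parse_gcc_version` and `gcc_is_new_enough` are hypothetical, and the 5.3 threshold is simply the version mentioned above.

```python
import re
import subprocess


def parse_gcc_version(text):
    """Extract a (major, minor) tuple from `gcc -dumpversion`-style output."""
    match = re.search(r"(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    major = int(match.group(1))
    minor = int(match.group(2) or 0)  # newer gcc may print only the major number
    return (major, minor)


def gcc_is_new_enough(text, minimum=(5, 3)):
    """Return True if the reported version is at least `minimum` (assumed threshold)."""
    return parse_gcc_version(text) >= minimum


if __name__ == "__main__":
    try:
        reported = subprocess.check_output(["gcc", "-dumpversion"]).decode()
    except (OSError, subprocess.CalledProcessError):
        print("gcc not found on PATH")
    else:
        status = "ok" if gcc_is_new_enough(reported) else "too old"
        print(reported.strip(), status)
```

If the check reports "too old" after the symlinks were replaced, the shell may still be resolving a different gcc earlier on the PATH.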

2. make: *** [.build_release/lib/libcaffe.so.1.0.0-rc3] Error

Solution:

sudo apt-get install libopenblas-dev
As before, recompile after the installation and the error is gone.

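Dataset preparation:

Using the COCO-Text dataset

1. Convert the COCO-Text dataset to the pascal_voc dataset format. The conversion is described in part two (preparing the dataset) of the blog post "[Training and Testing Notes] Text-Detection-with-FRCN" and is not repeated here.

2. Rename formatted_dataset to VOC2007 and place it under $HOME/data/VOCdevkit.

3. Create the data in LMDB format: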

cd $CAFFE_ROOT
# Create the trainval.txt, test.txt, and test_name_size.txt in data/VOC0712/
./data/VOC0712/create_list.sh
# You can modify the parameters in create_data.sh if needed.
# It will create lmdb files for trainval and test with encoded original image:
#   - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_trainval_lmdb
#   - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_test_lmdb
# and make soft links at examples/VOC0712/
./data/VOC0712/create_data.sh
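Before running the two scripts, a quick consistency check of the converted dataset can save a failed LMDB build. The sketch below assumes the usual PASCAL VOC layout (`ImageSets/Main/<split>.txt`, `JPEGImages/*.jpg`, `Annotations/*.xml`); the function name `missing_voc_files` is hypothetical and the file extensions are assumptions that may need adjusting for your conversion.

```python
import os
import tempfile


def missing_voc_files(voc_root, split):
    """List files referenced by ImageSets/Main/<split>.txt that do not exist."""
    list_path = os.path.join(voc_root, "ImageSets", "Main", split + ".txt")
    missing = []
    with open(list_path) as handle:
        for line in handle:
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            name = parts[0]
            # Assumed layout: one .jpg image and one .xml annotation per entry.
            for subdir, ext in (("JPEGImages", ".jpg"), ("Annotations", ".xml")):
                path = os.path.join(voc_root, subdir, name + ext)
                if not os.path.isfile(path):
                    missing.append(path)
    return missing


if __name__ == "__main__":
    # Tiny self-contained demo on a temporary directory.
    with tempfile.TemporaryDirectory() as root:
        for sub in (("ImageSets", "Main"), ("JPEGImages",), ("Annotations",)):
            os.makedirs(os.path.join(root, *sub))
        with open(os.path.join(root, "ImageSets", "Main", "test.txt"), "w") as f:
            f.write("000001\n000002\n")
        open(os.path.join(root, "JPEGImages", "000001.jpg"), "w").close()
        open(os.path.join(root, "Annotations", "000001.xml"), "w").close()
        print(missing_voc_files(root, "test"))
```

On a real dataset, call `missing_voc_files` with the VOC2007 directory and the split names `trainval` and `test`; an empty list means every listed sample has both files.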


Notes:

1. Modify the dataset paths in create_list.sh and create_data.sh

create_list.sh:

#root_dir=$HOME/data/VOCdevkit/
root_dir="<your own dataset directory>"
sub_dir=ImageSets/Main
bash_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
for dataset in trainval test
do
  dst_file=$bash_dir/$dataset.txt
  if [ -f $dst_file ]
  then
    rm -f $dst_file
  fi
  #for name in VOC2007 VOC2012
  # Only the VOC2007 folder exists here
  for name in VOC2007
  do
    if [[ $dataset == "test" && $name == "VOC2012" ]]
    then
      continue
    fi
    # ... (the rest of create_list.sh is unchanged)


create_data.sh:

cur_dir=$(cd $( dirname ${BASH_SOURCE[0]} ) && pwd )
root_dir=$cur_dir/../..

cd $root_dir

redo=1
#data_root_dir="$HOME/data/VOCdevkit"
data_root_dir="<your own dataset directory>"
dataset_name="VOC0712"
mapfile="$root_dir/data/$dataset_name/labelmap_voc.prototxt"
anno_type="detection"


2. Modify the dataset classes in labelmap_voc.prototxt
Since there are only two classes here, background and text, change the file to:

item {
  name: "none_of_the_above"
  label: 0
  display_name: "background"
}
item {
  name: "text"
  label: 1
  display_name: "text"
}
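For datasets with a different class list, the label map can be generated rather than edited by hand. This is a minimal sketch that only does string formatting; `make_labelmap` is a hypothetical helper, and it assumes that label 0 is always the background entry as in the file above.

```python
def make_labelmap(class_names):
    """Build labelmap prototxt text; label 0 is reserved for background."""
    entries = [("none_of_the_above", 0, "background")]
    for index, name in enumerate(class_names, start=1):
        entries.append((name, index, name))
    blocks = []
    for name, label, display_name in entries:
        blocks.append(
            'item {\n'
            '  name: "%s"\n'
            '  label: %d\n'
            '  display_name: "%s"\n'
            '}\n' % (name, label, display_name)
        )
    return "".join(blocks)


if __name__ == "__main__":
    # Reproduces the two-class map used for scene text detection.
    print(make_labelmap(["text"]))
```

The number of items produced (background plus the listed classes) is the value that num_classes in the training script has to match.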


Training:

# It will create model definition files and save snapshot models in:
#   - $CAFFE_ROOT/models/VGGNet/VOC0712/SSD_300x300/
# and job file, log file, and the python script in:
#   - $CAFFE_ROOT/jobs/VGGNet/VOC0712/SSD_300x300/
# and save temporary evaluation results in:
#   - $HOME/data/VOCdevkit/results/VOC2007/SSD_300x300/
# It should reach 77.* mAP at 120k iterations.
python examples/ssd/ssd_pascal.py


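Problem 1: the value of num_test_image is wrong
Solution: replace 4952 with the number of test images in your dataset; for example, the COCO-Text test set used here has 840 images.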

#Evaluate on whole test set.
#num_test_image = 4952
num_test_image = 840
test_batch_size = 8
# Ideally num_test_image should be divisible by test_batch_size,
# otherwise mAP will be slightly off the true value.
test_iter = int(math.ceil(float(num_test_image) / test_batch_size))
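The arithmetic behind test_iter can be checked in isolation. The sketch below repeats the formula from the script; the helper names `compute_test_iter` and `images_evaluated` are hypothetical and exist only for illustration.

```python
import math


def compute_test_iter(num_test_image, test_batch_size):
    """Number of test iterations needed to cover every test image once."""
    return int(math.ceil(float(num_test_image) / test_batch_size))


def images_evaluated(num_test_image, test_batch_size):
    """Images actually processed; exceeds num_test_image if not divisible."""
    return compute_test_iter(num_test_image, test_batch_size) * test_batch_size


if __name__ == "__main__":
    print(compute_test_iter(840, 8))   # 105 iterations, exactly 840 images
    print(images_evaluated(840, 8))    # 840
    print(images_evaluated(840, 32))   # 864, i.e. 24 images are seen twice
```

With 840 test images and test_batch_size = 8 the division is exact, so no image is evaluated twice; a batch size of 32 would not divide 840 evenly.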


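Problem 2: loss = nan
Probably because this is a scene text dataset, the loss in the first iterations is extremely large; in my own training, the loss turned into nan at around iteration 40.

Cause: exploding gradients. The gradients become so large that the learning process cannot continue.

General measures: reduce base_lr in solver.prototxt by at least one order of magnitude. If there are several loss layers, find out which one causes the gradient explosion and reduce that layer's loss_weight in train_val.prototxt instead of reducing the global base_lr. Reference: "Reasons why the loss becomes nan when training with Caffe" (使用caffe训练时Loss变为nan的原因)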
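Why an order-of-magnitude reduction of base_lr can turn divergence into convergence is easiest to see on a toy problem. The sketch below is not SSD or Caffe code; it runs plain gradient descent on the one-dimensional function f(w) = 0.5 * curvature * w**2, and all names and values are chosen purely for illustration.

```python
def run_gradient_descent(base_lr, curvature=100.0, start=1.0, steps=20):
    """Minimize f(w) = 0.5 * curvature * w**2 with plain gradient descent."""
    w = start
    for _ in range(steps):
        gradient = curvature * w   # derivative of f at w
        w -= base_lr * gradient    # each step multiplies w by (1 - base_lr * curvature)
    return w


if __name__ == "__main__":
    # base_lr * curvature = 5: every step multiplies w by about -4, so |w| explodes.
    print(abs(run_gradient_descent(0.05)))
    # Ten times smaller: every step multiplies w by 0.5, so w shrinks towards 0.
    print(abs(run_gradient_descent(0.005)))
```

In a real network the same blow-up eventually overflows to inf and then nan, which is what the training log reports.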

Solution:

Reduce base_lr to one tenth of its original value. Edit lines 229 and 232 of examples/ssd/ssd_pascal.py, changing 0.0004 to 0.00004 and 0.00004 to 0.000004.

# If true, use batch norm for all newly added layers.
# Currently only the non batch norm version has been tested.
use_batchnorm = False
lr_mult = 1
# Use different initial learning rate.
if use_batchnorm:
    #base_lr = 0.0004
    base_lr = 0.00004
else:
    # A learning rate for batch_size = 1, num_gpus = 1.
    #base_lr = 0.00004
    base_lr = 0.000004


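PS: lowering the learning rate may make the loss converge very slowly. I considered switching the pretrained model, that is, replacing the officially provided pretrained model (fully convolutional reduced (atrous) VGGNet) with an already trained SSD300* model. However, the dimensions no longer match because num_classes was changed from 21 to 2, so only the official pretrained model can be used. So far I have not found a better remedy than lowering base_lr.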

Problem 3: OpenCV Error: Assertion failed

OpenCV Error: Assertion failed ((scn == 3 || scn == 4) && (depth == CV_8U || depth == CV_32F)) in ipp_cvtColor, file /home/user1/opencv-3.1.0/modules/imgproc/src/color.cpp, line 7646
terminate called after throwing an instance of 'cv::Exception'
what(): /home/user1/opencv-3.1.0/modules/imgproc/src/color.cpp:7646: error: (-215) (scn == 3 || scn == 4) && (depth == CV_8U || depth == CV_32F) in function ipp_cvtColor

*** Aborted at 1482480286 (unix time) try "date -d @1482480286" if you are using GNU date ***
PC: @ 0x7f7e541abcc9 (unknown)
*** SIGABRT (@0x3e900004df8) received by PID 19960 (TID 0x7f7e227fd700) from PID 19960; stack trace: ***
@ 0x7f7e541abd40 (unknown)
@ 0x7f7e541abcc9 (unknown)
@ 0x7f7e541af0d8 (unknown)
@ 0x7f7e54f61535 (unknown)
@ 0x7f7e54f5f6d6 (unknown)
@ 0x7f7e54f5f703 (unknown)
@ 0x7f7e54f5f922 (unknown)
@ 0x7f7e4d12fca0 cv::error()
@ 0x7f7e4d12fe20 cv::error()
@ 0x7f7e4b574c89 cv::ipp_cvtColor()
@ 0x7f7e4b57e4d4 cv::cvtColor()
@ 0x7f7e5600758a caffe::AdjustSaturation()
@ 0x7f7e5600c77a caffe::RandomSaturation()
@ 0x7f7e5600ce96 caffe::ApplyDistort()
@ 0x7f7e561dfeac caffe::DataTransformer<>::DistortImage()
@ 0x7f7e561c7beb caffe::AnnotatedDataLayer<>::load_batch()
@ 0x7f7e560abc29 caffe::BasePrefetchingDataLayer<>::InternalThreadEntry()
@ 0x7f7e55ff39f0 caffe::InternalThread::entry()
@ 0x7f7e4abf9a4a (unknown)
@ 0x7f7e4678a182 start_thread
@ 0x7f7e5426f47d (unknown)
@ 0x0 (unknown)
Aborted (core dumped)


Solution: add 'force_color': True to train_transform_param at line 175 of examples/ssd/ssd_pascal.py.

Reference: OpenCV Error: Assertion failed #353

train_transform_param = {
        'mirror': True,
        'mean_value': [104, 117, 123],
        # added
        'force_color': True,
        'resize_param': {
                'prob': 1,
                'resize_mode': P.Resize.WARP,
                'height': resize_height,
                'width': resize_width,
                'interp_mode': [
                        P.Resize.LINEAR,
                        P.Resize.AREA,
                        P.Resize.NEAREST,
                        P.Resize.CUBIC,
                        P.Resize.LANCZOS4,
                        ],
                },
        # ... (the remaining entries of the dict are unchanged)
        }

Problem 4: Check failed: mean_values_.size() == 1

F1203 16:07:24.865304 12717 data_transformer.cpp:621] Check failed: mean_values_.size() == 1 || mean_values_.size() == img_channels Specify either 1 mean_value or as many as channels: 1
*** Check failure stack trace: ***
@     0x7f6168187daa  (unknown)
@     0x7f6168187ce4  (unknown)
@     0x7f61681876e6  (unknown)
@     0x7f616818a687  (unknown)
@     0x7f61689df73d  caffe::DataTransformer<>::Transform()
@     0x7f61689e0993  caffe::DataTransformer<>::Transform()
@     0x7f61689ebcdb  caffe::DataTransformer<>::Transform()
@     0x7f61689ebd98  caffe::DataTransformer<>::Transform()
@     0x7f61689ebe3e  caffe::DataTransformer<>::Transform()
@     0x7f616887005b  caffe::AnnotatedDataLayer<>::load_batch()
@     0x7f616884f6dc  caffe::BasePrefetchingDataLayer<>::InternalThreadEntry()
@     0x7f61689a1445  caffe::InternalThread::entry()
@     0x7f615e23ba4a  (unknown)
@     0x7f615729c184  start_thread
@     0x7f6166aabbed  (unknown)
@              (nil)  (unknown)
Solution: add 'force_color': True to test_transform_param at line 213 of examples/ssd/ssd_pascal.py.
Reference: Training error for face detection training!
test_transform_param = {
        'mean_value': [104, 117, 123],
        'force_color': True,
        'resize_param': {
                'prob': 1,
                'resize_mode': P.Resize.WARP,
                'height': resize_height,
                'width': resize_width,
                'interp_mode': [P.Resize.LINEAR],
                },
        }
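The failed check says that the number of mean values must be 1 or equal to the number of image channels; the trailing "1" in the log indicates a single-channel (grayscale) image meeting three mean values. The sketch below mimics that rule in plain Python. It is a simplified model, not Caffe's implementation; `decoded_channels` and `check_mean_values` are hypothetical names, and the only Caffe behavior assumed is that force_color makes decoded images have 3 channels.

```python
def decoded_channels(source_channels, force_color):
    """Channels after decoding; force_color always yields a 3-channel image."""
    return 3 if force_color else source_channels


def check_mean_values(mean_values, img_channels):
    """Raise if the mean values cannot be applied to an image with img_channels."""
    if len(mean_values) not in (1, img_channels):
        raise ValueError(
            "Specify either 1 mean_value or as many as channels: %d" % img_channels
        )


if __name__ == "__main__":
    mean_value = [104, 117, 123]
    # Grayscale image decoded with force_color: 3 channels, the check passes.
    check_mean_values(mean_value, decoded_channels(1, force_color=True))
    # Same image without force_color: 1 channel, the check fails.
    try:
        check_mean_values(mean_value, decoded_channels(1, force_color=False))
    except ValueError as error:
        print(error)
```

The same reasoning suggests why a color-conversion step that expects 3 or 4 channels would reject grayscale images during training-time distortion unless force_color is set.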


Problem 5: status == CUDNN_STATUS_SUCCESS

F0616 16:54:55.034394 3070141376 cudnn_conv_layer.cpp:53] Check failed: status == CUDNN_STATUS_SUCCESS (4 vs. 0) CUDNN_STATUS_INTERNAL_ERROR

Cause: not enough memory

Solution: reduce batch_size. For example, I reduced the training batch_size from 32 to 8 here; the change is at lines 338 and 339 of examples/ssd/ssd_pascal.py.

#Divide the mini-batch to different GPUs.
#batch_size = 32
#accum_batch_size = 32
batch_size = 8
accum_batch_size = 8
iter_size = accum_batch_size / batch_size
solver_mode = P.Solver.CPU
device_id = 0
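The relationship between the three values can be sketched as follows. With both values set to 8, as above, iter_size is 1 and each weight update uses 8 images. Keeping accum_batch_size at 32 while lowering only batch_size would be an alternative that this post did not try: Caffe would then accumulate gradients over several smaller passes. The helper name `iter_size_for` is hypothetical.

```python
def iter_size_for(batch_size, accum_batch_size):
    """Forward/backward passes accumulated per weight update."""
    if accum_batch_size % batch_size != 0:
        raise ValueError("accum_batch_size should be a multiple of batch_size")
    # Integer division keeps the result an int on both Python 2 and 3.
    return accum_batch_size // batch_size


if __name__ == "__main__":
    # Setting used in this post: both reduced to 8.
    print(iter_size_for(8, 8))    # 1 pass of 8 images per update
    # Untested alternative: only batch_size reduced.
    print(iter_size_for(8, 32))   # 4 passes of 8 images per update
```

Memory use per pass depends on batch_size, while the number of images per update is batch_size times iter_size.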


Problem 6: Check failed: label_to_name_.find(label) != label_to_name_.end() Cannot find label: 2 in the label map

F1027 detection_output_layer.cu:143] Check failed: label_to_name_.find(label) != label_to_name_.end() Cannot find label: 2 in the label map

This happens because the label map was reduced from 21 classes to 2 while num_classes still has its old value.

Solution:

1. In examples/ssd/ssd_pascal.py, change num_classes at line 269 from 21 to 2.

# MultiBoxLoss parameters.
# num_classes = 21
num_classes = 2
share_location = True
background_label_id=0
train_on_diff_gt = True
normalization_mode = P.Loss.VALID
code_type = P.PriorBox.CENTER_SIZE
ignore_cross_boundary_bbox = False
mining_type = P.MultiBoxLoss.MAX_NEGATIVE
neg_pos_ratio = 3.


2. In examples/ssd/score_ssd_pascal.py, change num_classes at line 277 from 21 to 2.

# MultiBoxLoss parameters.
# num_classes = 21
num_classes = 2
share_location = True
background_label_id=0
train_on_diff_gt = True
normalization_mode = P.Loss.VALID
code_type = P.PriorBox.CENTER_SIZE
ignore_cross_boundary_bbox = False
mining_type = P.MultiBoxLoss.MAX_NEGATIVE
neg_pos_ratio = 3.


After the change in the .py files, num_classes in the train.prototxt, test.prototxt and deploy.prototxt files in the following directories changes accordingly:

caffe/jobs/VGGNet/VOC0712/SSD_300x300

caffe/jobs/VGGNet/VOC0712/SSD_300x300_score

caffe/models/VGGNet/VOC0712/SSD_300x300

caffe/models/VGGNet/VOC0712/SSD_300x300_score
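The mismatch behind Problem 6 can be reproduced with a few lines of Python. This is a simplified model under the assumption that the detection output step looks up every class index below num_classes in the label map; `labels_missing_from_map` is a hypothetical helper, not part of Caffe.

```python
def labels_missing_from_map(num_classes, label_map):
    """Class indices below num_classes that have no entry in the label map."""
    return [label for label in range(num_classes) if label not in label_map]


if __name__ == "__main__":
    label_map = {0: "background", 1: "text"}
    # Old setting: 21 classes, but the map only knows labels 0 and 1.
    print(labels_missing_from_map(21, label_map)[:3])   # [2, 3, 4]
    # Fixed setting: num_classes matches the label map.
    print(labels_missing_from_map(2, label_map))        # []
```

The first missing index with the old setting is 2, which is the label named in the error message.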