Deep Learning in Practice: Training Faster R-CNN on the Caltech Pedestrian Detection Dataset (Preparation)

2017-02-18 21:16

Preface

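Faster R-CNN, proposed by Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun as the successor to Fast R-CNN, is a deep learning framework for object detection that is both faster and achieves higher mAP. Its main improvement over Fast R-CNN is in the region proposal stage: it introduces a Region Proposal Network (RPN) that generates proposals while sharing the full-image convolutional features with the detection network, which makes region proposals nearly cost-free.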

For a detailed introduction to Faster R-CNN, see my previous blog post.

The Faster R-CNN code is open source and comes in two versions: a MATLAB version (faster_rcnn) and a Python version (py-faster-rcnn).

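I mainly use the Python version here. At test time it is about 10% slower than the MATLAB version, because some operations in the Python layers run on the CPU, but the accuracy should be about the same.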

Preparation 1: Building, installing, and testing py-faster-rcnn

Building and installing py-faster-rcnn

Clone the Faster R-CNN repository:

git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git

Be sure to include the --recursive flag. The rest of this post assumes the cloned folder is named py-faster-rcnn.


Build the Cython modules:

cd py-faster-rcnn/lib
make


Build the bundled Caffe and pycaffe:

cd py-faster-rcnn/caffe-fast-rcnn

# Build it the same way you would build Caffe.
# Remember to adjust Makefile.config; installing Caffe itself is not covered here.

# Build
make -j8 && make pycaffe


Here is my Makefile.config; adapt it to your own setup.

## Refer to http://caffe.berkeleyvision.org/installation.html 
# Contributions simplifying and improving our build system are welcome!

# cuDNN acceleration switch (uncomment to build with cuDNN).

USE_CUDNN := 1

# CPU-only switch (uncomment to build without GPU support).

# CPU_ONLY := 1

# uncomment to disable IO dependencies and corresponding data layers

# USE_OPENCV := 0

# USE_LEVELDB := 0

# USE_LMDB := 0

# uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)

# You should not set this flag if you will be reading LMDBs with any

# possibility of simultaneous read and write

# ALLOW_LMDB_NOLOCK := 1

# Uncomment if you're using OpenCV 3

OPENCV_VERSION := 3

# To customize your choice of compiler, uncomment and set the following.

# N.B. the default for Linux is g++ and the default for OSX is clang++

# CUSTOM_CXX := g++

# CUDA directory contains bin/ and lib/ directories that we need.

CUDA_DIR := /usr/local/cuda

# On Ubuntu 14.04, if cuda tools are installed via

# "sudo apt-get install nvidia-cuda-toolkit" then use this instead:

# CUDA_DIR := /usr

# CUDA architecture setting: going with all of them.

# For CUDA < 6.0, comment the *_50 lines for compatibility.

CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \
-gencode arch=compute_20,code=sm_21 \
-gencode arch=compute_30,code=sm_30 \
-gencode arch=compute_35,code=sm_35 \
-gencode arch=compute_50,code=sm_50 \
-gencode arch=compute_50,code=compute_50

# BLAS choice:

# atlas for ATLAS (default)

# mkl for MKL

# open for OpenBlas

BLAS :=mkl

# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.

# Leave commented to accept the defaults for your choice of BLAS

# (which should work)!

# BLAS_INCLUDE := /path/to/your/blas

# BLAS_LIB := /path/to/your/blas

# Homebrew puts openblas in a directory that is not on the standard search path

# BLAS_INCLUDE := $(shell brew --prefix openblas)/include

# BLAS_LIB := $(shell brew --prefix openblas)/lib

# This is required only if you will compile the matlab interface.

# MATLAB directory should contain the mex binary in /bin.

MATLAB_DIR := /usr/local/MATLAB/R2016b

# MATLAB_DIR := /Applications/MATLAB_R2012b.app

# NOTE: this is required only if you will compile the python interface.

# We need to be able to find Python.h and numpy/arrayobject.h.

# PYTHON_INCLUDE := /usr/include/python2.7 \
# /usr/lib/python2.7/dist-packages/numpy/core/include

# Anaconda Python distribution is quite popular. Include path:

# Verify anaconda location, sometimes it's in root.

ANACONDA_HOME := $(HOME)/anaconda
PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
$(ANACONDA_HOME)/include/python2.7 \
$(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include \
/usr/include/python2.7

# Uncomment to use Python 3 (default is Python 2)

# PYTHON_LIBRARIES := boost_python3 python3.5m

# PYTHON_INCLUDE := /usr/include/python3.5m \

# /usr/lib/python3.5/dist-packages/numpy/core/include

# We need to be able to find libpythonX.X.so or .dylib.

# PYTHON_LIB := /usr/lib

PYTHON_LIB := $(ANACONDA_HOME)/lib

# Homebrew installs numpy in a non standard path (keg only)

# PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include

# PYTHON_LIB += $(shell brew --prefix numpy)/lib

# Uncomment to support layers written in Python (will link against Python libs)

WITH_PYTHON_LAYER := 1

# Whatever else you find you need goes here.

# INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include

# LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib

INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/hdf5/serial

# If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies

# INCLUDE_DIRS += $(shell brew --prefix)/include

# LIBRARY_DIRS += $(shell brew --prefix)/lib

# Uncomment to use `pkg-config` to specify OpenCV library paths.

# (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)

# USE_PKG_CONFIG := 1

# N.B. both build and distribute dirs are cleared on `make clean`

BUILD_DIR := build
DISTRIBUTE_DIR := distribute

# Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171 
# DEBUG := 1

# The ID of the GPU that 'make runtest' will use to run unit tests.

TEST_GPUID := 0

# enable pretty build (comment to see full commands)

Q ?= @


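Running the demo
To check that py-faster-rcnn was installed successfully, the authors provide a demo that runs with models pre-trained on the PASCAL VOC2007 dataset. The steps are as follows:

Download the pre-trained Faster R-CNN detectors: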

cd py-faster-rcnn
./data/scripts/fetch_faster_rcnn_models.sh


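This command automatically downloads a file named faster_rcnn_models.tgz. Once extracted, it creates the data/faster_rcnn_models folder, which contains two models:

ZF_faster_rcnn_final.caffemodel: trained with the ZF network

VGG16_faster_rcnn_final.caffemodel: trained with the VGG16 network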
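
Before running the demo, it can help to confirm that both model files are actually in place. The sketch below is only an illustration: the helper name missing_models is hypothetical, and it assumes you pass in the path of your py-faster-rcnn checkout.

```python
import os

# File names as listed above; the folder is relative to the py-faster-rcnn root.
EXPECTED_MODELS = (
    "ZF_faster_rcnn_final.caffemodel",
    "VGG16_faster_rcnn_final.caffemodel",
)


def missing_models(repo_root, model_dir=os.path.join("data", "faster_rcnn_models")):
    """Return the expected model files that are not present under repo_root."""
    base = os.path.join(repo_root, model_dir)
    return [name for name in EXPECTED_MODELS
            if not os.path.isfile(os.path.join(base, name))]
```

An empty list from missing_models("py-faster-rcnn") means both models were downloaded and extracted.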

Run the demo:

cd py-faster-rcnn
./tools/demo.py


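The demo runs detection on 5 images located in the data/demo/ folder. The detection result for one of them looks like this:

[Figure: detection result for one of the demo images]

If none of the steps above produced an error, py-faster-rcnn has been built and installed successfully.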

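Preparation 2: The Caltech dataset
Because part of the Faster R-CNN experiments were run on the PASCAL VOC2007 dataset, training Faster R-CNN on our own dataset first requires understanding the directory layout, images, and annotation format of PASCAL VOC2007. We can then build a dataset in the same format from our own data for Faster R-CNN to train and test on.

Getting the PASCAL VOC2007 dataset
This step is optional. If you need the PASCAL VOC2007 dataset, you can fetch it with the commands below. Our main reason for downloading VOC is to inspect its file structure and file contents so that we can build our own dataset to match.

Create a dedicated place to store datasets, say the $HOME/data folder.

Download the PASCAL VOC2007 training, validation, and test sets: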

cd $HOME/data
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar

After the downloads finish, extract them with:

tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar


This gives the following file structure:

$HOME/data/VOCdevkit/                        # root folder
$HOME/data/VOCdevkit/VOC2007                 # VOC2007 folder
$HOME/data/VOCdevkit/VOC2007/Annotations     # annotation files
$HOME/data/VOCdevkit/VOC2007/ImageSets       # holds train.txt, test.txt, val.txt, etc.
$HOME/data/VOCdevkit/VOC2007/JPEGImages      # image files

# ... plus other folders and subfolders ...


Create a symlink that points to where the VOC dataset is stored:

cd py-faster-rcnn/data
ln -s $HOME/data/VOCdevkit/ VOCdevkit


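Replace $HOME/data/VOCdevkit/ with the path where you keep your VOCdevkit folder.

It is best to use symlinks so that a single copy of the dataset is shared, rather than keeping multiple copies that waste disk space.

The VOC dataset is now set up.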
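
To confirm the symlink resolves to a usable dataset, a small check like the following can be run. This is a minimal sketch: missing_voc_dirs is a hypothetical helper, and the three folder names are simply the ones listed in the file structure above.

```python
import os

# The three folders this post relies on; VOC ships others as well.
REQUIRED_SUBDIRS = ("Annotations", "ImageSets", "JPEGImages")


def missing_voc_dirs(devkit_path, year="2007"):
    """Return the required folders that do not exist under devkit_path/VOC<year>."""
    voc_root = os.path.join(devkit_path, "VOC" + year)
    # os.path.isdir follows symlinks, so a linked VOCdevkit is handled too.
    return [os.path.join(voc_root, sub) for sub in REQUIRED_SUBDIRS
            if not os.path.isdir(os.path.join(voc_root, sub))]
```

For example, missing_voc_dirs("py-faster-rcnn/data/VOCdevkit") should return an empty list once the link is in place.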

A look at the PASCAL VOC dataset
The file structure of the PASCAL VOC dataset is as follows:

└── VOCdevkit
    └── VOC2007
        ├── Annotations
        ├── ImageSets
        │   ├── Layout
        │   ├── Main
        │   └── Segmentation
        ├── JPEGImages
        ├── SegmentationClass
        └── SegmentationObject


Annotations
This folder holds the image annotations (i.e. the ground truth) as .xml files, one .xml file per image. Here is one of them, with comments:

<annotation>
    <folder>VOC2007</folder>  <!-- required: name of the parent folder -->
    <filename>000005.jpg</filename>  <!-- required -->
    <source>  <!-- optional -->
        <database>The VOC2007 Database</database>
        <annotation>PASCAL VOC2007</annotation>
        <image>flickr</image>
        <flickrid>325991873</flickrid>
    </source>
    <owner>  <!-- optional -->
        <flickrid>archintent louisville</flickrid>
        <name>?</name>
    </owner>
    <size>  <!-- image size -->
        <width>500</width>
        <height>375</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>  <!-- used for segmentation -->
    <object>  <!-- object info: class and bbox; one <object> tag per object in the image -->
        <name>chair</name>
        <pose>Rear</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>263</xmin>
            <ymin>211</ymin>
            <xmax>324</xmax>
            <ymax>339</ymax>
        </bndbox>
    </object>
    <object>
        <name>chair</name>
        <pose>Unspecified</pose>
        <truncated>1</truncated>
        <difficult>1</difficult>
        <bndbox>
            <xmin>5</xmin>
            <ymin>244</ymin>
            <xmax>67</xmax>
            <ymax>374</ymax>
        </bndbox>
    </object>
</annotation>
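
To make the format concrete, here is a minimal sketch of reading the object entries from such a file with Python's standard xml.etree.ElementTree module. The function name load_voc_objects is hypothetical and this is not the loader py-faster-rcnn itself uses; it only illustrates which tags carry the class name and the box.

```python
import xml.etree.ElementTree as ET


def load_voc_objects(xml_source):
    """Return a list of (name, difficult, (xmin, ymin, xmax, ymax)) tuples.

    xml_source may be a file path or an open file object.
    """
    root = ET.parse(xml_source).getroot()
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        coords = tuple(int(float(box.findtext(tag)))
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        # Treat a missing <difficult> tag as 0 (an assumption of this sketch).
        difficult = int(obj.findtext("difficult", default="0"))
        objects.append((obj.findtext("name"), difficult, coords))
    return objects
```

Applied to the file above, this would yield two chair entries, one per object tag.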


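Note that in the xml annotation files you prepare yourself, the coordinates in the <xmin> and <ymin> tags of each <object> should be greater than 0 and must never be negative. Otherwise training fails with AssertionError: assert (boxes[:, 2] >= boxes[:, 0]).all(), as shown below:

[Figure: screenshot of the AssertionError raised during training]

So, to train without problems, carefully check that the top-left coordinates in all of your xml files are positive. This bug cost me a day or two; training only went through after I had tracked down every bad coordinate in my annotations.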
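
Checking the files by hand is tedious, so a small scan script helps. The sketch below is one possible approach, not the method I originally used: find_bad_boxes and scan_annotations are hypothetical names, and the "at least 1" threshold assumes that the loader treats VOC coordinates as 1-based and subtracts 1 before storing them in an unsigned integer array, which is a likely reason a zero or negative value ends up tripping the assertion.

```python
import glob
import os
import xml.etree.ElementTree as ET


def find_bad_boxes(xml_path):
    """Return (object name, reason) pairs for suspicious boxes in one file."""
    root = ET.parse(xml_path).getroot()
    size = root.find("size")
    width = int(size.findtext("width")) if size is not None else None
    height = int(size.findtext("height")) if size is not None else None

    problems = []
    for obj in root.findall("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = [
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        ]
        if xmin < 1 or ymin < 1:
            problems.append((name, "top-left corner below 1"))
        if xmax < xmin or ymax < ymin:
            problems.append((name, "max coordinate smaller than min"))
        if width is not None and (xmax > width or ymax > height):
            problems.append((name, "box extends past the image"))
    return problems


def scan_annotations(annotation_dir):
    """Map each problematic .xml file name to its list of problems."""
    report = {}
    for path in sorted(glob.glob(os.path.join(annotation_dir, "*.xml"))):
        problems = find_bad_boxes(path)
        if problems:
            report[os.path.basename(path)] = problems
    return report
```

Running scan_annotations on the Annotations folder and fixing every file it reports should be done before training starts.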

ImageSets
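The ImageSets folder has three subfolders; we only need to look at Main. The files that matter in Main are train.txt, val.txt, test.txt, and trainval.txt. Each one lists the names of the files used for training, validation, or testing, as shown below:

[Figure: contents of one of the image-set files]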
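
As an illustration of how these list files tie the three folders together, here is a minimal sketch. It assumes the VOC2007 convention of one image identifier per line without the file extension (for example 000005); the helper names are hypothetical.

```python
import os


def read_image_ids(voc_root, split):
    """Read ImageSets/Main/<split>.txt and return the identifiers it lists."""
    list_path = os.path.join(voc_root, "ImageSets", "Main", split + ".txt")
    with open(list_path) as f:
        return [line.strip() for line in f if line.strip()]


def image_and_annotation_paths(voc_root, image_id):
    """Return the (.jpg path, .xml path) pair that belongs to one identifier."""
    return (os.path.join(voc_root, "JPEGImages", image_id + ".jpg"),
            os.path.join(voc_root, "Annotations", image_id + ".xml"))
```

For instance, read_image_ids(voc_root, "train") gives the training identifiers, and each one maps to an image and an annotation file of the same base name.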



JPEGImages
The JPEGImages folder holds all of the input images in .jpg format; there is nothing more to say about it.

Building a VOC-style Caltech dataset
Following the analysis of the PASCAL VOC file structure above, we first create a similar file structure:

└── VOCdevkit
    └── VOC2007
        └── Caltech
            ├── Annotations
            ├── ImageSets
            │   └── Main
            └── JPEGImages


I recommend creating the Caltech folder as a symlink under the VOCdevkit folder, because that makes the later changes to the training code easier.
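
The folder skeleton and the symlink can be created in a few lines. This is a sketch under stated assumptions: both function names are hypothetical, the dataset itself is assumed to live outside the repository, and the directory that should contain the link is passed in explicitly.

```python
import os


def make_caltech_skeleton(caltech_root):
    """Create Annotations, ImageSets/Main and JPEGImages under caltech_root."""
    for sub in ("Annotations", os.path.join("ImageSets", "Main"), "JPEGImages"):
        path = os.path.join(caltech_root, sub)
        if not os.path.isdir(path):
            os.makedirs(path)


def link_into_devkit(caltech_root, link_dir, link_name="Caltech"):
    """Create link_dir/link_name as a symlink to caltech_root if it is absent."""
    link_path = os.path.join(link_dir, link_name)
    if not os.path.lexists(link_path):
        os.symlink(os.path.abspath(caltech_root), link_path)
    return link_path
```

Keeping the real data in one place and linking to it means the images are never copied, in line with the symlink advice given for VOC above.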

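As for converting the Caltech dataset from .seq files into individual .jpg images, see here.

The .xml annotation files in Annotations were given to me by a senior labmate. The method mentioned above can also produce them, but its output does not meet the required format.

The train.txt in ImageSets was derived from the .xml files, and test.txt was built by taking one frame out of every 30 from each seq.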
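
The two list files could be generated roughly as follows. This is a hedged sketch rather than the exact script I used: the function names, the frame naming scheme in the usage note, and the offset parameter (which frame of each seq the sampling starts from) are all assumptions.

```python
import os


def ids_from_annotations(annotation_dir):
    """Identifiers for train.txt: the base names of the .xml files."""
    return sorted(os.path.splitext(name)[0]
                  for name in os.listdir(annotation_dir)
                  if name.endswith(".xml"))


def sample_test_ids(frames_by_seq, step=30, offset=0):
    """Identifiers for test.txt: one frame out of every `step` in each seq.

    frames_by_seq maps a seq name to the list of frame identifiers in it.
    """
    selected = []
    for seq in sorted(frames_by_seq):
        frames = sorted(frames_by_seq[seq])
        selected.extend(frames[offset::step])
    return selected


def write_image_set(ids, out_path):
    """Write one identifier per line, as the VOC list files do."""
    with open(out_path, "w") as f:
        for image_id in ids:
            f.write(image_id + "\n")
```

With a hypothetical naming scheme such as set06_V000_0000, calling sample_test_ids with 90 frames of one seq keeps frames 0, 30, and 60; the results are then written to ImageSets/Main/train.txt and test.txt with write_image_set.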

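If you would like any of the Caltech-related files mentioned above, feel free to email me and I will send them to you directly, which can save a lot of dataset preparation time.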
