Object Detection -- PVANET: Deep but Lightweight Neural Networks for Real-time Object Detection
2016-09-05 10:53
https://www.arxiv.org/abs/1608.08021
Demo code: https://github.com/sanghoon/pva-faster-rcnn
This paper tackles multi-category object detection by combining a range of recent techniques, and it reports strong results.
We obtained solid results on well-known object detection benchmarks: 81.8% mAP (mean average precision) on VOC2007 and 82.5% mAP on VOC2012 (2nd place), while taking only 750ms/image on Intel i7-6700K CPU with a single core and 46ms/image on NVIDIA Titan X GPU. Theoretically, our network requires only 12.3% of the computational cost compared to ResNet-101, the winner on VOC2012.
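The overall detection pipeline is: CNN feature extraction + region proposal + RoI classification.
We focus on optimizing feature extraction, because the region proposal part is already fast and takes very little time, and the classification part can be compressed effectively with SVD. The design principle is "less channels with more layers": fewer feature channels, but more layers. The network design uses concatenated ReLU (C.ReLU), Inception, and HyperNet; training uses batch normalization, residual connections, and learning rate scheduling based on plateau detection.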
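To make the SVD remark concrete, here is a minimal arithmetic sketch of how a truncated SVD shrinks a fully connected layer: a weight matrix of shape (n_out, n_in) is replaced by two thinner layers of rank k. The layer sizes and the rank below are hypothetical values chosen for illustration, not figures from these notes.

```python
def fc_params(n_in: int, n_out: int) -> int:
    """Number of weights in a dense layer of shape (n_out, n_in), biases ignored."""
    return n_in * n_out


def truncated_svd_params(n_in: int, n_out: int, rank: int) -> int:
    """Weights after factoring W into two layers: n_in -> rank and rank -> n_out."""
    return rank * (n_in + n_out)


# Hypothetical sizes, for illustration only.
n_in, n_out, rank = 4096, 4096, 512

full = fc_params(n_in, n_out)
compressed = truncated_svd_params(n_in, n_out, rank)
ratio = compressed / full

print(f"full: {full}, compressed: {compressed}, ratio: {ratio:.2f}")
```

The factorization only pays off when rank < n_in * n_out / (n_in + n_out); the accuracy cost depends on how quickly the singular values of the trained weights decay.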
2 Details on Network Design
2.1 C.ReLU: Earlier building blocks in feature generation
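C.ReLU is used mainly in the first few convolutional layers. The convolution computes only half of the output channels; the other half is obtained by negating those outputs and concatenating them, which roughly doubles the speed of these early layers.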
C.ReLU reduces the number of output channels by half, and doubles it by simply concatenating the same outputs with negation, which leads to 2x speed-up of the early stage without losing accuracy.
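The concatenate-with-negation step can be sketched in a few lines of plain Python. This is a simplified illustration, not the authors' implementation: channels are flat lists of floats, and the learnable scale/shift that the paper applies after concatenation is left out.

```python
def crelu(channels):
    """Concatenated ReLU on a list of channels (each a flat list of floats).

    Returns twice as many channels: ReLU(x) followed by ReLU(-x).
    """
    negated = [[-v for v in ch] for ch in channels]
    return [[max(v, 0.0) for v in ch] for ch in channels + negated]


# Two channels computed by a (hypothetical) convolution.
conv_out = [[1.0, -2.0], [-0.5, 3.0]]
out = crelu(conv_out)

print(len(conv_out), "channels in ->", len(out), "channels out")
print(out)
```

The convolution only has to produce half of the channels the next layer consumes, which is where the saving in the early layers comes from.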
2.2 Inception: Remaining building blocks in feature generation
Inception handles both small and large objects well, mainly by controlling the convolution kernel sizes used in its branches.
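One way to see why kernel sizes matter is to compute the receptive field of each branch. The sketch below uses the standard receptive-field recurrence; the three branch layouts are illustrative assumptions and are not copied from the paper's architecture tables.

```python
def receptive_field(layers):
    """Receptive field of one output unit for a stack of (kernel_size, stride) layers."""
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the view by (k - 1) input steps
        jump *= stride             # stride spreads later layers further apart
    return rf


# Illustrative branches of an Inception-style block.
branch_1x1 = [(1, 1)]
branch_3x3 = [(1, 1), (3, 1)]
branch_5x5_as_two_3x3 = [(1, 1), (3, 1), (3, 1)]

for name, branch in [("1x1", branch_1x1),
                     ("3x3", branch_3x3),
                     ("two 3x3", branch_5x5_as_two_3x3)]:
    print(name, "->", receptive_field(branch))
```

A 1x1 branch keeps the receptive field of the previous layer (useful for small objects), while stacked 3x3 convolutions enlarge it (useful for large objects); concatenating the branches gives the next layer both views.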
2.3 HyperNet: Concatenation of multi-scale intermediate outputs
The idea is to concatenate convolutional feature maps taken at different scales, which enables multi-scale object detection.
2.4 Deep network training
Residual structures are added between the Inception layers, a batch normalization layer is placed before every ReLU activation, and the learning rate is adjusted dynamically based on plateau detection.
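Plateau-based learning rate control can be sketched as follows. This is a simplified scheduler under assumed hyperparameters (`factor`, `patience`, and `min_delta` are hypothetical names and values); the notes do not give the paper's exact plateau criterion.

```python
class PlateauScheduler:
    """Lower the learning rate when the loss stops improving."""

    def __init__(self, lr, factor=0.1, patience=3, min_delta=1e-4):
        self.lr = lr
        self.factor = factor        # multiplier applied on a plateau
        self.patience = patience    # non-improving steps tolerated
        self.min_delta = min_delta  # minimum drop that counts as improvement
        self.best = float("inf")
        self.bad_steps = 0

    def step(self, loss):
        """Record one loss value and return the learning rate to use next."""
        if loss < self.best - self.min_delta:
            self.best = loss
            self.bad_steps = 0
        else:
            self.bad_steps += 1
            if self.bad_steps >= self.patience:
                self.lr *= self.factor
                self.bad_steps = 0
        return self.lr


sched = PlateauScheduler(lr=0.1, factor=0.5, patience=2)
lrs = [sched.step(loss) for loss in [1.0, 0.8, 0.8, 0.8, 0.7]]
print(lrs)
```

In practice the loss fed to `step` would usually be a smoothed value (for example a moving average over many iterations) so that mini-batch noise does not trigger a decay.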
3 Faster R-CNN with our feature extraction network
We combine conv3_4 (down-scaled), conv4_4, and conv5_4 (up-scaled) into 512-channel multi-scale output features, which serve as the input to the Faster R-CNN model.
Three intermediate outputs from conv3_4 (with down-scaling), conv4_4, and conv5_4 (with up-scaling) are combined into the 512-channel multi-scale output features.
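The resolution matching behind this concatenation can be sketched with toy feature maps. The operators here are assumptions made for illustration: 2x2 max pooling for down-scaling and nearest-neighbor repetition for up-scaling. The channel counts are tiny placeholder values, not the real layer widths.

```python
def max_pool_2x(fmap):
    """Halve H and W of a [C][H][W] feature map with 2x2 max pooling (H, W even)."""
    return [[[max(ch[2 * i][2 * j], ch[2 * i][2 * j + 1],
                  ch[2 * i + 1][2 * j], ch[2 * i + 1][2 * j + 1])
              for j in range(len(ch[0]) // 2)]
             for i in range(len(ch) // 2)]
            for ch in fmap]


def upsample_2x(fmap):
    """Double H and W of a [C][H][W] feature map by nearest-neighbor repetition."""
    return [[[ch[i // 2][j // 2] for j in range(2 * len(ch[0]))]
             for i in range(2 * len(ch))]
            for ch in fmap]


def concat_channels(*fmaps):
    """Stack feature maps with identical H and W along the channel axis."""
    fused = []
    for fmap in fmaps:
        fused.extend(fmap)
    return fused


def make_map(channels, height, width, value):
    """Constant-valued toy feature map of shape [channels][height][width]."""
    return [[[value] * width for _ in range(height)] for _ in range(channels)]


conv3 = make_map(1, 8, 8, 1.0)  # finest scale, needs down-scaling
conv4 = make_map(2, 4, 4, 2.0)  # reference scale
conv5 = make_map(3, 2, 2, 3.0)  # coarsest scale, needs up-scaling

fused = concat_channels(max_pool_2x(conv3), conv4, upsample_2x(conv5))
shape = (len(fused), len(fused[0]), len(fused[0][0]))
print(shape)
```

All three maps end up at the conv4 resolution, so the region proposal and classification stages see fine and coarse features at every spatial location.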
4 Experimental results