
[Deep Learning Paper Notes][Object Detection] Faster R-CNN: Towards Real-Time Object

2016-11-10 21:14

Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in Neural Information Processing Systems. 2015. (Citations: 444).

1 Motivation

Region proposals are the test-time computational bottleneck in state-of-the-art detection systems.

We solve this issue by inserting a Region Proposal Network (RPN) after the conv5 layer to produce region proposals directly, so no external region proposal method is needed. After the RPN, we use RoI pooling followed by a downstream classifier and bounding-box regressor, just like Fast R-CNN. See Fig.

2 Region Proposal Network
Slide a small window (3 × 3 in our case) over the conv5 feature map. At each window position, we simultaneously predict multiple region proposals covering a wide range of scales and aspect ratios, where the maximum number of proposals per location is denoted as N.

We generate region proposals with a two-layer network. See Fig. The classification head outputs 2N scores that estimate the probability of object versus not object for each proposal. The regression head has 4N outputs encoding the offsets relative to N reference boxes, which we call anchors. By default we use 3 scales and 3 aspect ratios, yielding N = 9 anchors at each sliding position. For a conv5 feature map of size H × W, there are HWN anchors in total.
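
The anchor layout above can be sketched in a few lines of plain Python. The notes do not list concrete values, so the scales (128, 256, 512 pixels), the aspect ratios (0.5, 1, 2), and the feature stride of 16 are assumptions taken from the defaults reported in the paper; the function names are hypothetical.

```python
import math

def make_base_anchors(scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return N = len(scales) * len(ratios) boxes (x1, y1, x2, y2) centered at the origin."""
    anchors = []
    for s in scales:
        for r in ratios:
            # Keep the area at s * s while setting height / width = r.
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            anchors.append((-w / 2, -h / 2, w / 2, h / 2))
    return anchors

def make_all_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Copy the base anchors to every position of an H x W feature map."""
    base = make_base_anchors(scales, ratios)
    all_anchors = []
    for i in range(feat_h):
        for j in range(feat_w):
            # Center of the (i, j) feature-map cell in image coordinates (assumed stride).
            cx = j * stride + stride / 2
            cy = i * stride + stride / 2
            for x1, y1, x2, y2 in base:
                all_anchors.append((x1 + cx, y1 + cy, x2 + cx, y2 + cy))
    return all_anchors

anchors = make_all_anchors(feat_h=38, feat_w=50)
print(len(anchors))  # H * W * N = 38 * 50 * 9 = 17100
```

With N = 9, the classification head would emit 2N = 18 scores and the regression head 4N = 36 offsets per sliding position.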

The approach is translation invariant, both in terms of the anchors and in terms of the functions that compute proposals relative to the anchors. If one translates an object in an image, the proposal should translate with it, and the same function should be able to predict the proposal in either location.
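
To make the translation-invariance argument concrete, here is a minimal sketch of how regression outputs could be decoded into a proposal. It assumes the box parameterization from the paper (center offsets scaled by the anchor size, log-space width and height); the helper names `decode` and `shift` are hypothetical. Applying the same offsets to a shifted anchor gives the same proposal shifted by the same amount.

```python
import math

def decode(anchor, deltas):
    """Turn offsets (tx, ty, tw, th) relative to an anchor (x1, y1, x2, y2) into a box."""
    ax1, ay1, ax2, ay2 = anchor
    aw, ah = ax2 - ax1, ay2 - ay1
    acx, acy = ax1 + aw / 2, ay1 + ah / 2
    tx, ty, tw, th = deltas
    cx = acx + tx * aw          # center shift is measured in anchor widths
    cy = acy + ty * ah          # and anchor heights
    w = aw * math.exp(tw)       # size change is predicted in log space
    h = ah * math.exp(th)
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

def shift(box, dx, dy):
    """Translate a box by (dx, dy)."""
    x1, y1, x2, y2 = box
    return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)

anchor = (0.0, 0.0, 128.0, 128.0)
deltas = (0.25, -0.125, 0.0, 0.0)

p_here = decode(anchor, deltas)
p_there = decode(shift(anchor, 32, 16), deltas)

# Same offsets, shifted anchor: the proposal moves by exactly the same amount.
print(p_here)
print(p_there)
print(shift(p_here, 32, 16))
```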

3 Training Details

We assign a positive label to two kinds of anchors:

• The anchor with the highest IoU overlap with a ground-truth box.

• The anchor that has an IoU overlap higher than 0.7 with any ground-truth box.

We adopt the first condition because in some rare cases the second condition alone may find no positive sample.

We assign a negative label to a non-positive anchor if its IoU ratio is lower than 0.3 for all ground-truth boxes. Anchors that are neither positive nor negative do not contribute to the training objective.
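
The labeling rules can be sketched as follows. This is an illustrative pure-Python version, not the authors' implementation; the function names and the label encoding (1 positive, 0 negative, -1 ignored) are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def label_anchors(anchors, gt_boxes, pos_thr=0.7, neg_thr=0.3):
    """Return one label per anchor: 1 = positive, 0 = negative, -1 = ignored."""
    ious = [[iou(a, g) for g in gt_boxes] for a in anchors]
    labels = [-1] * len(anchors)
    for i, row in enumerate(ious):
        best = max(row, default=0.0)
        if best > pos_thr:        # second condition: IoU > 0.7 with any ground-truth box
            labels[i] = 1
        elif best < neg_thr:      # IoU < 0.3 for all ground-truth boxes
            labels[i] = 0
    # First condition: the best-matching anchor of each ground-truth box is
    # positive, even if its IoU does not exceed pos_thr.
    for j in range(len(gt_boxes)):
        column = [row[j] for row in ious]
        best = max(column, default=0.0)
        if best > 0:
            for i, value in enumerate(column):
                if value == best:
                    labels[i] = 1
    return labels

gt_boxes = [(0, 0, 100, 100), (400, 400, 500, 500)]
anchors = [
    (0, 0, 100, 100),      # IoU 1.0 with the first box
    (0, 0, 100, 50),       # IoU 0.5: neither positive nor negative
    (200, 200, 300, 300),  # no overlap with any box
    (0, 0, 100, 20),       # IoU 0.2
    (400, 400, 500, 460),  # IoU 0.6, but the best match for the second box
]
labels = label_anchors(anchors, gt_boxes)
print(labels)  # [1, -1, 0, 0, 1]
```

The last anchor illustrates the rare case mentioned above: it does not pass the 0.7 threshold, yet it is the best match for its ground-truth box, so the first condition still makes it positive.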

We train the whole network jointly with four losses:

• RPN classification (anchor good/bad).

• RPN regression (anchor → proposal).

• Fast R-CNN classification (over classes).

• Fast R-CNN regression (proposal → box).
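
As a rough sketch of how the four terms could be combined into one training objective: the snippet below assumes log loss for the two classification terms and smooth L1 for the two regression terms (the loss types used in the Fast R-CNN and Faster R-CNN papers), summed with equal weights. The weights, the function names, and the toy numbers are all assumptions for illustration.

```python
import math

def log_loss(probs, target):
    """Negative log-probability assigned to the true class index."""
    return -math.log(probs[target])

def smooth_l1(x):
    """Quadratic near zero, linear for |x| >= 1."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1 else ax - 0.5

def box_loss(pred, target):
    """Smooth L1 summed over the four box offsets."""
    return sum(smooth_l1(p - t) for p, t in zip(pred, target))

def total_loss(rpn_cls, rpn_reg, det_cls, det_reg, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four losses (equal weights are an assumption)."""
    terms = (rpn_cls, rpn_reg, det_cls, det_reg)
    return sum(w * t for w, t in zip(weights, terms))

# Toy values for a single positive anchor and a single RoI.
rpn_cls = log_loss([0.2, 0.8], target=1)                          # object vs. not object
rpn_reg = box_loss([0.1, 0.0, 0.2, -0.1], [0.0, 0.0, 0.0, 0.0])   # anchor -> proposal
det_cls = log_loss([0.1, 0.7, 0.2], target=1)                     # over classes
det_reg = box_loss([2.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0])    # proposal -> box

print(total_loss(rpn_cls, rpn_reg, det_cls, det_reg))
```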

4 Result

See Tab.


Tags: Papers, Computer Vision, CNN, Deep Learning, Object Detection
