
Crowd Density Estimation -- CrowdNet: A Deep Convolutional Network for Dense Crowd Counting

2017-09-28 14:26

CrowdNet: A Deep Convolutional Network for Dense Crowd Counting

Published in the proceedings of the ACM Conference on Multimedia (ACMMM), 2016

http://val.serc.iisc.ernet.in/CrowdNet/

Caffe: https://github.com/davideverona/deep-crowd-counting_crowdnet

To estimate crowd density, this paper combines a deep and a shallow fully convolutional network to cope with large scale variations. The deep network captures high-level semantic information (face/body detectors), while the shallow network captures low-level features (blob detectors).

The network consists of the following parts:

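Deep Network: captures high-level semantic information. It uses an architecture similar to VGG with the fully connected layers removed, which makes the network fully convolutional. The original VGG network has five max-pool layers, each with a stride of 2, so its final feature map is only 1/32 of the input image size. Because a pixel-level crowd density map is needed here, the stride of the fourth max-pool layer is set to 1 and the fifth pooling layer is removed, so the final feature map is 1/8 of the input image size.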
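The 1/32 and 1/8 figures follow directly from multiplying the pooling strides. A minimal sketch of that arithmetic, assuming the convolution layers themselves keep the spatial size (stride 1 with padding); the function name is hypothetical:

```python
import math

def downsampling_factor(pool_strides):
    """Ratio of input size to feature-map size, given the stride of each pooling layer."""
    return math.prod(pool_strides)

vgg_pools = [2, 2, 2, 2, 2]    # original VGG: five max-pools, each with stride 2
crowdnet_pools = [2, 2, 2, 1]  # fourth pool set to stride 1, fifth pool removed

print(downsampling_factor(vgg_pools))       # 32
print(downsampling_factor(crowdnet_pools))  # 8
```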

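Setting the stride of the fourth max-pool layer to 1 causes a receptive-field mismatch in the convolutional layers that follow it. This is handled with the dilated convolution of [4], which doubles the receptive field of those layers, so they operate as if the fourth max-pool layer still had a stride of 2.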
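The effect of the dilation can be checked with a small receptive-field calculation. The sketch below assumes a VGG-16 style layout (3x3 convolutions, 2x2 max-pools); the text only says the deep network is "similar to VGG", so the exact layer list is an assumption, and the helper name is hypothetical.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one output unit.

    layers: list of (kernel, stride, dilation) tuples, in forward order.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent units
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf

CONV = (3, 1, 1)          # 3x3 convolution, stride 1, no dilation
POOL = (2, 2, 1)          # 2x2 max-pool, stride 2
POOL_STRIDE1 = (2, 1, 1)  # fourth max-pool with its stride set to 1
CONV_DILATED = (3, 1, 2)  # 3x3 convolution with dilation 2

# Assumed VGG-16 layers up to the last convolution before the fourth pool.
front = [CONV] * 2 + [POOL] + [CONV] * 2 + [POOL] + [CONV] * 3 + [POOL] + [CONV] * 3

original = front + [POOL] + [CONV] * 3
stride1_plain = front + [POOL_STRIDE1] + [CONV] * 3
stride1_dilated = front + [POOL_STRIDE1] + [CONV_DILATED] * 3

print(receptive_field(original))         # 196
print(receptive_field(stride1_plain))    # 148
print(receptive_field(stride1_dilated))  # 196
```

Under these assumptions, removing the stride alone shrinks the receptive field of the last layers, and dilation 2 restores it to the value those layers had when the fourth pool used a stride of 2.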

Shallow Network

A shallow convolutional network is used mainly to detect the heads of people far from the camera, i.e. for the detection of small head-blobs.

Combination of Deep and Shallow Networks

The outputs of the deep and shallow networks, each at 1/8 of the input image size, are concatenated and passed through a 1x1 convolution layer. The result is then upsampled to the size of the input image using bilinear interpolation to obtain the final crowd density prediction.
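The fusion step can be sketched in NumPy as channel concatenation, a per-pixel weighted sum over channels (which is what a 1x1 convolution with one output channel computes), and bilinear upsampling by 8. The channel counts, weights, and function names below are hypothetical placeholders, not values from the paper, and the interpolation uses one common pixel-center convention that may differ in detail from the Caffe implementation.

```python
import numpy as np

def conv1x1(features, weights, bias=0.0):
    """1x1 convolution to a single channel. features: (C, H, W), weights: (C,)."""
    return np.tensordot(weights, features, axes=([0], [0])) + bias

def bilinear_upsample(x, factor):
    """Upsample a 2-D map by an integer factor with bilinear interpolation."""
    h, w = x.shape
    # Source coordinates of each output pixel center, clamped to the image.
    rows = np.clip((np.arange(h * factor) + 0.5) / factor - 0.5, 0, h - 1)
    cols = np.clip((np.arange(w * factor) + 0.5) / factor - 0.5, 0, w - 1)
    r0 = np.floor(rows).astype(int)
    c0 = np.floor(cols).astype(int)
    r1 = np.minimum(r0 + 1, h - 1)
    c1 = np.minimum(c0 + 1, w - 1)
    fr = (rows - r0)[:, None]  # vertical interpolation weights
    fc = (cols - c0)[None, :]  # horizontal interpolation weights
    top = x[r0][:, c0] * (1 - fc) + x[r0][:, c1] * fc
    bottom = x[r1][:, c0] * (1 - fc) + x[r1][:, c1] * fc
    return top * (1 - fr) + bottom * fr

def fuse(deep, shallow, weights, factor=8):
    """Concatenate both outputs along channels, apply a 1x1 conv, then upsample."""
    fused = np.concatenate([deep, shallow], axis=0)
    return bilinear_upsample(conv1x1(fused, weights), factor)

# Placeholder feature maps for a 224x224 input (28 = 224 / 8).
deep = np.ones((4, 28, 28))
shallow = np.ones((2, 28, 28))
prediction = fuse(deep, shallow, weights=np.full(6, 0.5))
print(prediction.shape)  # (224, 224)
```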

3.2 Ground Truth

The ground truth is generated by simply blurring each head annotation with a Gaussian kernel normalized to sum to one, so that summing the density map gives the crowd count.
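A minimal NumPy sketch of this kind of ground-truth generation follows. The value of `sigma` and the function name are assumptions for illustration; the text does not specify the kernel width. Each blob is normalized over the image, so every annotated head contributes exactly one to the total.

```python
import numpy as np

def density_map(shape, heads, sigma=4.0):
    """Build a density map from head annotations.

    shape: (height, width) of the image.
    heads: iterable of (row, col) head positions inside the image.
    sigma: Gaussian standard deviation in pixels (assumed value).
    """
    h, w = shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    density = np.zeros((h, w), dtype=np.float64)
    for r, c in heads:
        blob = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
        density += blob / blob.sum()  # normalize so this head sums to one
    return density

heads = [(10, 10), (30, 40), (0, 63)]
gt = density_map((64, 64), heads)
print(round(gt.sum(), 6))  # 3.0, the number of annotated heads
```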

3.3 Data Augmentation

Two types of augmentation are primarily used:

1) For scale variations, training samples are drawn at multiple scales.

2) Error-prone samples are trained on more often, by sampling high density patches more often.
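Both augmentations can be sketched with the standard library. The scale range, the weighting rule (probability proportional to head count plus one), and the function names are illustrative assumptions; the text above only states that multiple scales are sampled and that high density patches are drawn more often.

```python
import random

def pyramid_scales(start=0.5, stop=1.2, step=0.1):
    """Scales for a multi-scale image pyramid (assumed range and step)."""
    n = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 2) for i in range(n)]

def sample_patches(patch_counts, n_samples, rng):
    """Draw patch indices with probability proportional to head count.

    Adding 1 keeps empty patches in the pool (an assumed design choice).
    """
    weights = [count + 1.0 for count in patch_counts]
    return rng.choices(range(len(patch_counts)), weights=weights, k=n_samples)

print(pyramid_scales())  # [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2]

# Three hypothetical patches with 5, 50 and 500 annotated heads.
picks = sample_patches([5, 50, 500], n_samples=1000, rng=random.Random(0))
print([picks.count(i) for i in range(3)])
```

The densest patch dominates the drawn indices, which is the intended effect: the network sees crowded regions more often during training.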

4 EXPERIMENTS
