
[Deep Learning] Resource Roundup

2017-05-08 13:40

Reposted from: https://handong1587.github.io/deep_learning/2015/10/09/object-detection.html#t-cnn


Method	VOC2007	VOC2010	VOC2012	ILSVRC 2013	MSCOCO 2015	Speed
OverFeat				24.3%
R-CNN (AlexNet)	58.5%	53.7%	53.3%	31.4%
R-CNN (VGG16)	66.0%
SPP_net(ZF-5)	54.2%(1-model), 60.9%(2-model)			31.84%(1-model), 35.11%(6-model)
DeepID-Net	64.1%			50.3%
NoC	73.3%		68.8%
Fast-RCNN (VGG16)	70.0%	68.8%	68.4%		19.7%(@[0.5-0.95]), 35.9%(@0.5)
MR-CNN	78.2%		73.9%
Faster-RCNN (VGG16)	78.8%		75.9%		21.9%(@[0.5-0.95]), 42.7%(@0.5)	198ms
Faster-RCNN (ResNet-101)	85.6%		83.8%		37.4%(@[0.5-0.95]), 59.0%(@0.5)
SSD300 (VGG16)	77.2%		75.8%		25.1%(@[0.5-0.95]), 43.1%(@0.5)	46 fps
SSD512 (VGG16)	79.8%		78.5%		28.8%(@[0.5-0.95]), 48.5%(@0.5)	19 fps
ION	79.2%		76.4%
CRAFT	75.7%		71.3%	48.5%
OHEM	78.9%		76.3%		25.5%(@[0.5-0.95]), 45.9%(@0.5)
R-FCN (ResNet-50)	77.4%					0.12sec(K40), 0.09sec(Titan X)
R-FCN (ResNet-101)	79.5%					0.17sec(K40), 0.12sec(Titan X)
R-FCN (ResNet-101),multi sc train	83.6%		82.0%		31.5%(@[0.5-0.95]), 53.2%(@0.5)
PVANet 9.0	89.8%		84.2%			750ms(CPU), 46ms(Titan X)
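
The MSCOCO column reports two numbers per method: one averaged over IoU thresholds from 0.5 to 0.95 and one at IoU 0.5 only. The sketch below is a rough illustration of what those thresholds mean for a single predicted box, not a full AP evaluation. The function names `iou` and `matched_thresholds` and the `(x1, y1, x2, y2)` box layout are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds behind "@[0.5-0.95]": 0.50, 0.55, ..., 0.95.
COCO_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred, gt):
    """Thresholds at which `pred` would count as a correct match for `gt`."""
    overlap = iou(pred, gt)
    return [t for t in COCO_THRESHOLDS if overlap >= t]


if __name__ == "__main__":
    gt = (0, 0, 10, 10)
    pred = (0, 0, 8, 9)
    print(iou(pred, gt), matched_thresholds(pred, gt))
```

A box that only passes the 0.5 threshold counts fully toward the "@0.5" figure but contributes at just one of the ten thresholds in the averaged figure, which is one reason the averaged numbers in the table are lower.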

Leaderboard

Detection Results: VOC2012

intro: Competition “comp4” (train on additional data)
homepage: http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=4

Papers

Deep Neural Networks for Object Detection

paper: http://papers.nips.cc/paper/5207-deep-neural-networks-for-object-detection.pdf

OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks

arxiv: http://arxiv.org/abs/1312.6229 github: https://github.com/sermanet/OverFeat code: http://cilvr.nyu.edu/doku.php?id=software:overfeat:start

R-CNN

Rich feature hierarchies for accurate object detection and semantic segmentation

intro: R-CNN
arxiv: http://arxiv.org/abs/1311.2524 supp: http://people.eecs.berkeley.edu/~rbg/papers/r-cnn-cvpr-supp.pdf slides: http://www.image-net.org/challenges/LSVRC/2013/slides/r-cnn-ilsvrc2013-workshop.pdf slides: http://www.cs.berkeley.edu/~rbg/slides/rcnn-cvpr14-slides.pdf github: https://github.com/rbgirshick/rcnn notes: http://zhangliliang.com/2014/07/23/paper-note-rcnn/ caffe-pr(“Make R-CNN the Caffe detection example”): https://github.com/BVLC/caffe/pull/482

MultiBox

Scalable Object Detection using Deep Neural Networks

intro: first MultiBox. Train a CNN to predict Region of Interest.
arxiv: http://arxiv.org/abs/1312.2249 github: https://github.com/google/multibox blog: https://research.googleblog.com/2014/12/high-quality-object-detection-at-scale.html

Scalable, High-Quality Object Detection

intro: second MultiBox
arxiv: http://arxiv.org/abs/1412.1441 github: https://github.com/google/multibox

SPP-Net

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

intro: ECCV 2014 / TPAMI 2015
arxiv: http://arxiv.org/abs/1406.4729 github: https://github.com/ShaoqingRen/SPP_net notes: http://zhangliliang.com/2014/09/13/paper-note-sppnet/

DeepID-Net

DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection

intro: PAMI 2016
intro: an extension of R-CNN. box pre-training, cascade on region proposals, deformation layers and context representations
project page: http://www.ee.cuhk.edu.hk/~wlouyang/projects/imagenetDeepId/index.html arxiv: http://arxiv.org/abs/1412.5661

Object Detectors Emerge in Deep Scene CNNs

arxiv: http://arxiv.org/abs/1412.6856 paper: https://www.robots.ox.ac.uk/~vgg/rg/papers/zhou_iclr15.pdf paper: https://people.csail.mit.edu/khosla/papers/iclr2015_zhou.pdf slides: http://places.csail.mit.edu/slide_iclr2015.pdf

segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection

intro: CVPR 2015
project(code+data): https://www.cs.toronto.edu/~yukun/segdeepm.html arxiv: https://arxiv.org/abs/1502.04275 github: https://github.com/YknZhu/segDeepM

NoC

Object Detection Networks on Convolutional Feature Maps

intro: TPAMI 2015
arxiv: http://arxiv.org/abs/1504.06066

Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction

arxiv: http://arxiv.org/abs/1504.03293 slides: http://www.ytzhang.net/files/publications/2015-cvpr-det-slides.pdf github: https://github.com/YutingZhang/fgs-obj

Fast R-CNN

Fast R-CNN

arxiv: http://arxiv.org/abs/1504.08083 slides: http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-detection.pdf github: https://github.com/rbgirshick/fast-rcnn github(COCO-branch): https://github.com/rbgirshick/fast-rcnn/tree/coco webcam demo: https://github.com/rbgirshick/fast-rcnn/pull/29 notes: http://zhangliliang.com/2015/05/17/paper-note-fast-rcnn/ notes: http://blog.csdn.net/linj_m/article/details/48930179 github(“Fast R-CNN in MXNet”): https://github.com/precedenceguo/mx-rcnn github: https://github.com/mahyarnajibi/fast-rcnn-torch github: https://github.com/apple2373/chainer-simple-fast-rnn github(Tensorflow): https://github.com/zplizzi/tensorflow-fast-rcnn

DeepBox

DeepBox: Learning Objectness with Convolutional Networks

arxiv: http://arxiv.org/abs/1505.02146 github: https://github.com/weichengkuo/DeepBox

MR-CNN

Object detection via a multi-region & semantic segmentation-aware CNN model

intro: ICCV 2015. MR-CNN
arxiv: http://arxiv.org/abs/1505.01749 github: https://github.com/gidariss/mrcnn-object-detection notes: http://zhangliliang.com/2015/05/17/paper-note-ms-cnn/ notes: http://blog.cvmarcher.com/posts/2015/05/17/multi-region-semantic-segmentation-aware-cnn/ my notes: Who can tell me why there are a bunch of duplicated sentences in section 7.2 “Detection error analysis”? :-D

Faster R-CNN

YOLO

You Only Look Once: Unified, Real-Time Object Detection

intro: train with customized data and class numbers/labels. Linux / Windows version for darknet.
blog: http://guanghan.info/blog/en/my-works/train-yolo/ github: https://github.com/Guanghan/darknet

R-CNN minus R

arxiv: http://arxiv.org/abs/1506.06981

AttentionNet

AttentionNet: Aggregating Weak Directions for Accurate Object Detection

intro: ICCV 2015
intro: state-of-the-art performance of 65% (AP) on PASCAL VOC 2007/2012 human detection task
arxiv: http://arxiv.org/abs/1506.07704 slides: https://www.robots.ox.ac.uk/~vgg/rg/slides/AttentionNet.pdf slides: http://image-net.org/challenges/talks/lunit-kaist-slide.pdf

DenseBox

DenseBox: Unifying Landmark Localization with End to End Object Detection

arxiv: http://arxiv.org/abs/1509.04874 demo: http://pan.baidu.com/s/1mgoWWsS KITTI result: http://www.cvlibs.net/datasets/kitti/eval_object.php

SSD

SSD: Single Shot MultiBox Detector

intro: ECCV 2016 Oral
arxiv: http://arxiv.org/abs/1512.02325 paper: http://www.cs.unc.edu/~wliu/papers/ssd.pdf slides: http://www.cs.unc.edu/%7Ewliu/papers/ssd_eccv2016_slide.pdf github: https://github.com/weiliu89/caffe/tree/ssd video: http://weibo.com/p/2304447a2326da963254c963c97fb05dd3a973 github: https://github.com/zhreshold/mxnet-ssd github: https://github.com/zhreshold/mxnet-ssd.cpp github: https://github.com/rykov8/ssd_keras github: https://github.com/balancap/SSD-Tensorflow github: https://github.com/amdegroot/ssd.pytorch

Inside-Outside Net (ION)

Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks

intro: “0.8s per image on a Titan X GPU (excluding proposal generation) without two-stage bounding-box regression and 1.15s per image with it”.
arxiv: http://arxiv.org/abs/1512.04143 slides: http://www.seanbell.ca/tmp/ion-coco-talk-bell2015.pdf coco-leaderboard: http://mscoco.org/dataset/#detections-leaderboard

Adaptive Object Detection Using Adjacency and Zoom Prediction

intro: CVPR 2016. AZ-Net
arxiv: http://arxiv.org/abs/1512.07711 github: https://github.com/luyongxi/az-net youtube: https://www.youtube.com/watch?v=YmFtuNwxaNM

G-CNN

G-CNN: an Iterative Grid Based Object Detector

arxiv: http://arxiv.org/abs/1512.07729

Factors in Finetuning Deep Model for object detection

Factors in Finetuning Deep Model for Object Detection with Long-tail Distribution

intro: CVPR 2016. Ranked 3rd with provided data and 2nd with external data in ILSVRC 2015 object detection
project page: http://www.ee.cuhk.edu.hk/~wlouyang/projects/ImageNetFactors/CVPR16.html arxiv: http://arxiv.org/abs/1601.05150

We don’t need no bounding-boxes: Training object class detectors using only human verification

arxiv: http://arxiv.org/abs/1602.08405

HyperNet

HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection

arxiv: http://arxiv.org/abs/1604.00600

MultiPathNet

A MultiPath Network for Object Detection

intro: BMVC 2016. Facebook AI Research (FAIR)
arxiv: http://arxiv.org/abs/1604.02135 github: https://github.com/facebookresearch/multipathnet

CRAFT

CRAFT Objects from Images

intro: CVPR 2016. Cascade Region-proposal-network And FasT-rcnn. an extension of Faster R-CNN
project page: http://byangderek.github.io/projects/craft.html arxiv: https://arxiv.org/abs/1604.03239 paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Yang_CRAFT_Objects_From_CVPR_2016_paper.pdf github: https://github.com/byangderek/CRAFT

OHEM

Training Region-based Object Detectors with Online Hard Example Mining

intro: CVPR 2016 Oral. Online hard example mining (OHEM)
arxiv: http://arxiv.org/abs/1604.03540 paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Shrivastava_Training_Region-Based_Object_CVPR_2016_paper.pdf github(Official): https://github.com/abhi2610/ohem author page: http://abhinav-shrivastava.info/

Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection

intro: CVPR 2016
arxiv: http://arxiv.org/abs/1604.05766

Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers

intro: scale-dependent pooling (SDP), cascaded rejection classifiers (CRC)
paper: http://www-personal.umich.edu/~wgchoi/SDP-CRC_camready.pdf

R-FCN

R-FCN: Object Detection via Region-based Fully Convolutional Networks

arxiv: http://arxiv.org/abs/1605.06409 github: https://github.com/daijifeng001/R-FCN github: https://github.com/Orpine/py-R-FCN github(PyTorch): https://github.com/PureDiors/pytorch_RFCN github: https://github.com/bharatsingh430/py-R-FCN-multiGPU

Weakly supervised object detection using pseudo-strong labels

arxiv: http://arxiv.org/abs/1607.04731

Recycle deep features for better object detection

arxiv: http://arxiv.org/abs/1607.05066

MS-CNN

A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection

intro: ECCV 2016
intro: 640×480: 15 fps, 960×720: 8 fps
arxiv: http://arxiv.org/abs/1607.07155 github: https://github.com/zhaoweicai/mscnn poster: http://www.eccv2016.org/files/posters/P-2B-38.pdf

Multi-stage Object Detection with Group Recursive Learning

intro: VOC2007: 78.6%, VOC2012: 74.9%
arxiv: http://arxiv.org/abs/1608.05159

Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection

intro: WACV 2017. SubCNN
arxiv: http://arxiv.org/abs/1604.04693 github: https://github.com/yuxng/SubCNN

PVANET

PVANET: Deep but Lightweight Neural Networks for Real-time Object Detection

intro: “less channels with more layers”, concatenated ReLU, Inception, and HyperNet, batch normalization, residual connections
arxiv: http://arxiv.org/abs/1608.08021 github: https://github.com/sanghoon/pva-faster-rcnn leaderboard(PVANet 9.0): http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=4

PVANet: Lightweight Deep Neural Networks for Real-time Object Detection

intro: Presented at NIPS 2016 Workshop on Efficient Methods for Deep Neural Networks (EMDNN). Continuation of arXiv:1608.08021
arxiv: https://arxiv.org/abs/1611.08588

GBD-Net

Gated Bi-directional CNN for Object Detection

intro: The Chinese University of Hong Kong & Sensetime Group Limited
paper: http://link.springer.com/chapter/10.1007/978-3-319-46478-7_22 mirror: https://pan.baidu.com/s/1dFohO7v

Crafting GBD-Net for Object Detection

intro: winner of the ImageNet object detection challenge of 2016. CUImage and CUVideo
intro: gated bi-directional CNN (GBD-Net)
arxiv: https://arxiv.org/abs/1610.02579 github: https://github.com/craftGBD/craftGBD

StuffNet

StuffNet: Using ‘Stuff’ to Improve Object Detection

arxiv: https://arxiv.org/abs/1610.05861

Generalized Haar Filter based Deep Networks for Real-Time Object Detection in Traffic Scene

arxiv: https://arxiv.org/abs/1610.09609

Hierarchical Object Detection with Deep Reinforcement Learning

intro: Deep Reinforcement Learning Workshop (NIPS 2016)
project page: https://imatge-upc.github.io/detection-2016-nipsws/ arxiv: https://arxiv.org/abs/1611.03718 slides: http://www.slideshare.net/xavigiro/hierarchical-object-detection-with-deep-reinforcement-learning github: https://github.com/imatge-upc/detection-2016-nipsws blog: http://jorditorres.org/nips/

Learning to detect and localize many objects from few examples

arxiv: https://arxiv.org/abs/1611.05664

Speed/accuracy trade-offs for modern convolutional object detectors

intro: Google Research
arxiv: https://arxiv.org/abs/1611.10012

SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving

arxiv: https://arxiv.org/abs/1612.01051 github: https://github.com/BichenWuUCB/squeezeDet

Feature Pyramid Network (FPN)

Feature Pyramid Networks for Object Detection

intro: Facebook AI Research
arxiv: https://arxiv.org/abs/1612.03144

Action-Driven Object Detection with Top-Down Visual Attentions

arxiv: https://arxiv.org/abs/1612.06704

Beyond Skip Connections: Top-Down Modulation for Object Detection

intro: CMU & UC Berkeley & Google Research
arxiv: https://arxiv.org/abs/1612.06851

YOLOv2

YOLO9000: Better, Faster, Stronger

arxiv: https://arxiv.org/abs/1612.08242 code: http://pjreddie.com/yolo9000/ github(Chainer): https://github.com/leetenki/YOLOv2 github(Keras): https://github.com/allanzelener/YAD2K github(PyTorch): https://github.com/longcw/yolo2-pytorch github(Tensorflow): https://github.com/hizhangp/yolo_tensorflow github(Windows): https://github.com/AlexeyAB/darknet github: https://github.com/choasUp/caffe-yolo9000

Yolo_mark: GUI for marking bounded boxes of objects in images for training Yolo v2

github: https://github.com/AlexeyAB/Yolo_mark

DSSD

DSSD : Deconvolutional Single Shot Detector

intro: UNC Chapel Hill & Amazon Inc
arxiv: https://arxiv.org/abs/1701.06659

Wide-Residual-Inception Networks for Real-time Object Detection

intro: Inha University
arxiv: https://arxiv.org/abs/1702.01243

Attentional Network for Visual Object Detection

intro: University of Maryland & Mitsubishi Electric Research Laboratories
arxiv: https://arxiv.org/abs/1702.01478

CC-Net

Learning Chained Deep Features and Classifiers for Cascade in Object Detection

intro: chained cascade network (CC-Net). 81.1% mAP on PASCAL VOC 2007
arxiv: https://arxiv.org/abs/1702.07054

DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling

https://arxiv.org/abs/1703.10295

A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection

intro: CVPR 2017
paper: http://abhinavsh.info/papers/pdfs/adversarial_object_detection.pdf github(Caffe): https://github.com/xiaolonw/adversarial-frcnn

Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries

intro: CVPR 2017
arxiv: https://arxiv.org/abs/1704.03944

Spatial Memory for Context Reasoning in Object Detection

arxiv: https://arxiv.org/abs/1704.04224

Improving Object Detection With One Line of Code

intro: University of Maryland
keywords: Soft-NMS
arxiv: https://arxiv.org/abs/1704.04503 github: https://github.com/bharatsingh430/soft-nms

Accurate Single Stage Detector Using Recurrent Rolling Convolution

intro: CVPR 2017
arxiv: https://arxiv.org/abs/1704.05776 github: https://github.com/xiaohaoChen/rrc_detection

Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection

https://arxiv.org/abs/1704.05775
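
Soft-NMS is the keyword given above for "Improving Object Detection With One Line of Code". As a rough illustration of the idea, the sketch below decays the scores of boxes that overlap an already selected box instead of deleting them outright, using the Gaussian weighting. It is a plain-Python sketch and not the code from the linked repository. The names `soft_nms`, `sigma` and `score_thresh`, their default values, and the `(x1, y1, x2, y2)` box layout are assumptions for this example.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS sketch.

    Returns (index, decayed_score) pairs in the order boxes are selected.
    `sigma` and `score_thresh` are illustrative defaults, not tuned values.
    """
    scores = list(scores)              # work on a copy
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        # Select the highest-scoring box still in play.
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        kept.append((best, scores[best]))
        survivors = []
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            # Decay rather than discard: more overlap means a stronger penalty.
            scores[i] *= math.exp(-(overlap ** 2) / sigma)
            if scores[i] >= score_thresh:
                survivors.append(i)
        remaining = survivors
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    for index, score in soft_nms(boxes, scores):
        print(index, round(score, 4))
```

With hard NMS and a typical overlap threshold, the second box in the usage example would be dropped. Here it survives with a reduced score and is ranked below the non-overlapping third box.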

Detection From Video

Learning Object Class Detectors from Weakly Annotated Video

intro: CVPR 2012
paper: https://www.vision.ee.ethz.ch/publications/papers/proceedings/eth_biwi_00905.pdf

Analysing domain shift factors between videos and images for object detection

arxiv: https://arxiv.org/abs/1501.01186

Video Object Recognition

slides: http://vision.princeton.edu/courses/COS598/2015sp/slides/VideoRecog/Video%20Object%20Recognition.pptx

Deep Learning for Saliency Prediction in Natural Video

intro: Submitted on 12 Jan 2016
keywords: Deep learning, saliency map, optical flow, convolution network, contrast features
paper: https://hal.archives-ouvertes.fr/hal-01251614/document

T-CNN

T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos

intro: Winning solution in ILSVRC2015 Object Detection from Video(VID) Task
arxiv: http://arxiv.org/abs/1604.02532 github: https://github.com/myfavouritekk/T-CNN

Object Detection from Video Tubelets with Convolutional Neural Networks

intro: CVPR 2016 Spotlight paper
arxiv: https://arxiv.org/abs/1604.04053 paper: http://www.ee.cuhk.edu.hk/~wlouyang/Papers/KangVideoDet_CVPR16.pdf github: https://github.com/myfavouritekk/vdetlib

Object Detection in Videos with Tubelets and Multi-context Cues

intro: SenseTime Group
slides: http://www.ee.cuhk.edu.hk/~xgwang/CUvideo.pdf slides: http://image-net.org/challenges/talks/Object%20Detection%20in%20Videos%20with%20Tubelets%20and%20Multi-context%20Cues%20-%20Final.pdf

Context Matters: Refining Object Detection in Video with Recurrent Neural Networks

intro: BMVC 2016
keywords: pseudo-labeler
arxiv: http://arxiv.org/abs/1607.04648 paper: http://vision.cornell.edu/se3/wp-content/uploads/2016/07/video_object_detection_BMVC.pdf

CNN Based Object Detection in Large Video Images

intro: WangTao @ iQIYI (爱奇艺)
keywords: object retrieval, object detection, scene classification
slides: http://on-demand.gputechconf.com/gtc/2016/presentation/s6362-wang-tao-cnn-based-object-detection-large-video-images.pdf

Object Detection in Videos with Tubelet Proposal Networks

arxiv: https://arxiv.org/abs/1702.06355

Flow-Guided Feature Aggregation for Video Object Detection

intro: MSRA
arxiv: https://arxiv.org/abs/1703.10025

Video Object Detection using Faster R-CNN

blog: http://andrewliao11.github.io/object_detection/faster_rcnn/ github: https://github.com/andrewliao11/py-faster-rcnn-imagenet

Object Detection in 3D

Vote3Deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks

arxiv: https://arxiv.org/abs/1609.06666

Object Detection on RGB-D

Learning Rich Features from RGB-D Images for Object Detection and Segmentation

arxiv: http://arxiv.org/abs/1407.5736

Differential Geometry Boosts Convolutional Neural Networks for Object Detection

intro: CVPR 2016
paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016_workshops/w23/html/Wang_Differential_Geometry_Boosts_CVPR_2016_paper.html

A Self-supervised Learning System for Object Detection using Physics Simulation and Multi-view Pose Estimation

https://arxiv.org/abs/1703.03347

Salient Object Detection

This task involves predicting the salient regions of an image given by human eye fixations.

Best Deep Saliency Detection Models (CVPR 2016 & 2015)

http://i.cs.hku.hk/~yzyu/vision.html

Large-scale optimization of hierarchical features for saliency prediction in natural images

paper: http://coxlab.org/pdfs/cvpr2014_vig_saliency.pdf

Predicting Eye Fixations using Convolutional Neural Networks

paper: http://www.escience.cn/system/file?fileId=72648

Saliency Detection by Multi-Context Deep Learning

paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Zhao_Saliency_Detection_by_2015_CVPR_paper.pdf

DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection

arxiv: http://arxiv.org/abs/1510.05484

SuperCNN: A Superpixelwise Convolutional Neural Network for Salient Object Detection

paper: www.shengfenghe.com/supercnn-a-superpixelwise-convolutional-neural-network-for-salient-object-detection.html

Shallow and Deep Convolutional Networks for Saliency Prediction

arxiv: http://arxiv.org/abs/1603.00845 github: https://github.com/imatge-upc/saliency-2016-cvpr

Recurrent Attentional Networks for Saliency Detection

intro: CVPR 2016. recurrent attentional convolutional-deconvolution network (RACDNN)
arxiv: http://arxiv.org/abs/1604.03227

Two-Stream Convolutional Networks for Dynamic Saliency Prediction

arxiv: http://arxiv.org/abs/1607.04730

Unconstrained Salient Object Detection

Unconstrained Salient Object Detection via Proposal Subset Optimization

intro: CVPR 2016
project page: http://cs-people.bu.edu/jmzhang/sod.html paper: http://cs-people.bu.edu/jmzhang/SOD/CVPR16SOD_camera_ready.pdf github: https://github.com/jimmie33/SOD caffe model zoo: https://github.com/BVLC/caffe/wiki/Model-Zoo#cnn-object-proposal-models-for-salient-object-detection

DHSNet: Deep Hierarchical Saliency Network for Salient Object Detection

paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Liu_DHSNet_Deep_Hierarchical_CVPR_2016_paper.pdf

Salient Object Subitizing

intro: CVPR 2015
intro: predicting the existence and the number of salient objects in an image using holistic cues
project page: http://cs-people.bu.edu/jmzhang/sos.html arxiv: http://arxiv.org/abs/1607.07525 paper: http://cs-people.bu.edu/jmzhang/SOS/SOS_preprint.pdf caffe model zoo: https://github.com/BVLC/caffe/wiki/Model-Zoo#cnn-models-for-salient-object-subitizing

Deeply-Supervised Recurrent Convolutional Neural Network for Saliency Detection

intro: ACMMM 2016. deeply-supervised recurrent convolutional neural network (DSRCNN)
arxiv: http://arxiv.org/abs/1608.05177

Saliency Detection via Combining Region-Level and Pixel-Level Predictions with CNNs

intro: ECCV 2016
arxiv: http://arxiv.org/abs/1608.05186

Edge Preserving and Multi-Scale Contextual Neural Network for Salient Object Detection

arxiv: http://arxiv.org/abs/1608.08029

A Deep Multi-Level Network for Saliency Prediction

arxiv: http://arxiv.org/abs/1609.01064

Visual Saliency Detection Based on Multiscale Deep CNN Features

intro: IEEE Transactions on Image Processing
arxiv: http://arxiv.org/abs/1609.02077

A Deep Spatial Contextual Long-term Recurrent Convolutional Network for Saliency Detection

intro: DSCLRCN
arxiv: https://arxiv.org/abs/1610.01708

Deeply supervised salient object detection with short connections

arxiv: https://arxiv.org/abs/1611.04849

Weakly Supervised Top-down Salient Object Detection

intro: Nanyang Technological University
arxiv: https://arxiv.org/abs/1611.05345

SalGAN: Visual Saliency Prediction with Generative Adversarial Networks

project page: https://imatge-upc.github.io/saliency-salgan-2017/ arxiv: https://arxiv.org/abs/1701.01081

Visual Saliency Prediction Using a Mixture of Deep Neural Networks

arxiv: https://arxiv.org/abs/1702.00372

A Fast and Compact Salient Score Regression Network Based on Fully Convolutional Network

arxiv: https://arxiv.org/abs/1702.00615

Saliency Detection by Forward and Backward Cues in Deep-CNNs

https://arxiv.org/abs/1703.00152

Supervised Adversarial Networks for Image Saliency Detection

https://arxiv.org/abs/1704.07242

Saliency Detection in Video

Deep Learning For Video Saliency Detection

arxiv: https://arxiv.org/abs/1702.00871

Visual Relationship Detection

Visual Relationship Detection with Language Priors

intro: ECCV 2016 oral
paper: https://cs.stanford.edu/people/ranjaykrishna/vrd/vrd.pdf github: https://github.com/Prof-Lu-Cewu/Visual-Relationship-Detection

ViP-CNN: A Visual Phrase Reasoning Convolutional Neural Network for Visual Relationship Detection

intro: Visual Phrase reasoning Convolutional Neural Network (ViP-CNN), Visual Phrase Reasoning Structure (VPRS)
arxiv: https://arxiv.org/abs/1702.07191

Visual Translation Embedding Network for Visual Relation Detection

arxiv: https://www.arxiv.org/abs/1702.08319

Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection

intro: CVPR 2017 spotlight paper
arxiv: https://arxiv.org/abs/1703.03054

Detecting Visual Relationships with Deep Relational Networks

intro: CVPR 2017 oral. The Chinese University of Hong Kong
arxiv: https://arxiv.org/abs/1704.03114

Specific Object Detection

Face Detection

Multi-view Face Detection Using Deep Convolutional Neural Networks

intro: Yahoo
arxiv: http://arxiv.org/abs/1502.02766 github: https://github.com/guoyilin/FaceDetection_CNN

From Facial Parts Responses to Face Detection: A Deep Learning Approach

project page: http://personal.ie.cuhk.edu.hk/~ys014/projects/Faceness/Faceness.html

Compact Convolutional Neural Network Cascade for Face Detection

arxiv: http://arxiv.org/abs/1508.01292 github: https://github.com/Bkmz21/FD-Evaluation

Face Detection with End-to-End Integration of a ConvNet and a 3D Model

intro: ECCV 2016
arxiv: https://arxiv.org/abs/1606.00850 github(MXNet): https://github.com/tfwu/FaceDetection-ConvNet-3D

CMS-RCNN: Contextual Multi-Scale Region-based CNN for Unconstrained Face Detection

intro: CMU
arxiv: https://arxiv.org/abs/1606.05413

Finding Tiny Faces

intro: CMU
project page: http://www.cs.cmu.edu/~peiyunh/tiny/index.html arxiv: https://arxiv.org/abs/1612.04402 github: https://github.com/peiyunh/tiny

Towards a Deep Learning Framework for Unconstrained Face Detection

intro: overlap with CMS-RCNN
arxiv: https://arxiv.org/abs/1612.05322

Supervised Transformer Network for Efficient Face Detection

arxiv: http://arxiv.org/abs/1607.05477

UnitBox

UnitBox: An Advanced Object Detection Network

intro: ACM MM 2016
arxiv: http://arxiv.org/abs/1608.01471

Bootstrapping Face Detection with Hard Negative Examples

author: Wan Shaohua (万韶华) @ Xiaomi
intro: Faster R-CNN, hard negative mining. state-of-the-art on the FDDB dataset
arxiv: http://arxiv.org/abs/1608.02236

Grid Loss: Detecting Occluded Faces

intro: ECCV 2016
arxiv: https://arxiv.org/abs/1609.00129 paper: http://lrs.icg.tugraz.at/pubs/opitz_eccv_16.pdf poster: http://www.eccv2016.org/files/posters/P-2A-34.pdf

A Multi-Scale Cascade Fully Convolutional Network Face Detector

intro: ICPR 2016
arxiv: http://arxiv.org/abs/1609.03536

MTCNN

Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks

Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Neural Networks

project page: https://kpzhang93.github.io/MTCNN_face_detection_alignment/index.html arxiv: https://arxiv.org/abs/1604.02878 github(Matlab): https://github.com/kpzhang93/MTCNN_face_detection_alignment github: https://github.com/pangyupo/mxnet_mtcnn_face_detection github: https://github.com/DaFuCoding/MTCNN_Caffe github(MXNet): https://github.com/Seanlinx/mtcnn github: https://github.com/Pi-DeepLearning/RaspberryPi-FaceDetection-MTCNN-Caffe-With-Motion github(Caffe): https://github.com/foreverYoungGitHub/MTCNN github: https://github.com/CongWeilin/mtcnn-caffe

Face Detection using Deep Learning: An Improved Faster RCNN Approach

intro: DeepIR Inc
arxiv: https://arxiv.org/abs/1701.08289

Faceness-Net: Face Detection through Deep Facial Part Responses

intro: An extended version of ICCV 2015 paper
arxiv: https://arxiv.org/abs/1701.08393

Multi-Path Region-Based Convolutional Neural Network for Accurate Detection of Unconstrained “Hard Faces”

intro: CVPR 2017. MP-RCNN, MP-RPN
arxiv: https://arxiv.org/abs/1703.09145

End-To-End Face Detection and Recognition

https://arxiv.org/abs/1703.10818

Facial Point / Landmark Detection

Deep Convolutional Network Cascade for Facial Point Detection

homepage: http://mmlab.ie.cuhk.edu.hk/archive/CNN_FacePoint.htm paper: http://www.ee.cuhk.edu.hk/~xgwang/papers/sunWTcvpr13.pdf github: https://github.com/luoyetx/deep-landmark

Facial Landmark Detection by Deep Multi-task Learning

intro: ECCV 2014
project page: http://mmlab.ie.cuhk.edu.hk/projects/TCDCN.html paper: http://personal.ie.cuhk.edu.hk/~ccloy/files/eccv_2014_deepfacealign.pdf github(Matlab): https://github.com/zhzhanp/TCDCN-face-alignment

A Recurrent Encoder-Decoder Network for Sequential Face Alignment

intro: ECCV 2016
arxiv: https://arxiv.org/abs/1608.05477

Detecting facial landmarks in the video based on a hybrid framework

arxiv: http://arxiv.org/abs/1609.06441

Deep Constrained Local Models for Facial Landmark Detection

arxiv: https://arxiv.org/abs/1611.08657

Effective face landmark localization via single deep network

arxiv: https://arxiv.org/abs/1702.02719

A Convolution Tree with Deconvolution Branches: Exploiting Geometric Relationships for Single Shot Keypoint Detection

https://arxiv.org/abs/1704.01880

People Detection

End-to-end people detection in crowded scenes

arxiv: http://arxiv.org/abs/1506.04878 github: https://github.com/Russell91/reinspect ipn: http://nbviewer.ipython.org/github/Russell91/ReInspect/blob/master/evaluation_reinspect.ipynb

Detecting People in Artwork with CNNs

intro: ECCV 2016 Workshops
arxiv: https://arxiv.org/abs/1610.08871

Deep Multi-camera People Detection

arxiv: https://arxiv.org/abs/1702.04593

Person Head Detection

Context-aware CNNs for person head detection

arxiv: http://arxiv.org/abs/1511.07917 github: https://github.com/aosokin/cnn_head_detection

Pedestrian Detection

Pedestrian Detection aided by Deep Learning Semantic Tasks

intro: CVPR 2015
project page: http://mmlab.ie.cuhk.edu.hk/projects/TA-CNN/ paper: http://arxiv.org/abs/1412.0069

Deep Learning Strong Parts for Pedestrian Detection

intro: ICCV 2015. CUHK. DeepParts
intro: Achieving 11.89% average miss rate on Caltech Pedestrian Dataset
paper: http://personal.ie.cuhk.edu.hk/~pluo/pdf/tianLWTiccv15.pdf

Deep convolutional neural networks for pedestrian detection

arxiv: http://arxiv.org/abs/1510.03608 github: https://github.com/DenisTome/DeepPed

Scale-aware Fast R-CNN for Pedestrian Detection

arxiv: https://arxiv.org/abs/1510.08160

New algorithm improves speed and accuracy of pedestrian detection

blog: http://www.eurekalert.org/pub_releases/2016-02/uoc--nai020516.php

Pushing the Limits of Deep CNNs for Pedestrian Detection

intro: “set a new record on the Caltech pedestrian dataset, lowering the log-average miss rate from 11.7% to 8.9%”
arxiv: http://arxiv.org/abs/1603.04525

A Real-Time Deep Learning Pedestrian Detector for Robot Navigation

arxiv: http://arxiv.org/abs/1607.04436

A Real-Time Pedestrian Detector using Deep Learning for Human-Aware Navigation

arxiv: http://arxiv.org/abs/1607.04441

Is Faster R-CNN Doing Well for Pedestrian Detection?

intro: ECCV 2016
arxiv: http://arxiv.org/abs/1607.07032 github: https://github.com/zhangliliang/RPN_BF/tree/RPN-pedestrian

Reduced Memory Region Based Deep Convolutional Neural Network Detection

intro: IEEE 2016 ICCE-Berlin
arxiv: http://arxiv.org/abs/1609.02500

Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection

arxiv: https://arxiv.org/abs/1610.03466

Multispectral Deep Neural Networks for Pedestrian Detection

intro: BMVC 2016 oral
arxiv: https://arxiv.org/abs/1611.02644

Expecting the Unexpected: Training Detectors for Unusual Pedestrians with Adversarial Imposters

intro: CVPR 2017
project page: http://ml.cs.tsinghua.edu.cn:5000/publications/synunity/ arxiv: https://arxiv.org/abs/1703.06283 github(Tensorflow): https://github.com/huangshiyu13/RPNplus
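
Several entries above quote a (log-)average miss rate on the Caltech pedestrian dataset. As commonly described for that benchmark, the number summarizes a miss-rate versus false-positives-per-image (FPPI) curve by sampling it at nine FPPI values evenly spaced in log space between 0.01 and 1 and averaging in the log domain. The sketch below follows that description under stated assumptions and is not the official evaluation code. The function name `log_average_miss_rate` is hypothetical, and using a miss rate of 1.0 where the curve has not started yet is a convention chosen for this example.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Summarize a miss-rate/FPPI curve as a single log-average value.

    `fppi` must be sorted in ascending order, with `miss_rate[i]` the miss
    rate reached at `fppi[i]`.
    """
    # Reference FPPI values evenly spaced in log space over [1e-2, 1e0].
    refs = [10 ** (-2 + 2 * k / (num_points - 1)) for k in range(num_points)]
    samples = []
    for ref in refs:
        value = 1.0  # assumption: nothing is detected before the curve starts
        for f, m in zip(fppi, miss_rate):
            if f <= ref:
                value = m  # last curve point at or below the reference FPPI
            else:
                break
        # Guard against log(0) for a perfect detector.
        samples.append(max(value, 1e-10))
    # Averaging logs and exponentiating gives a geometric mean.
    return math.exp(sum(math.log(s) for s in samples) / len(samples))


if __name__ == "__main__":
    curve_fppi = [0.001, 0.05, 0.5, 5.0]
    curve_miss = [0.60, 0.30, 0.15, 0.05]
    print(round(log_average_miss_rate(curve_fppi, curve_miss), 4))
```

Because the average is taken over logarithms, improvements at low FPPI weigh as much as improvements at high FPPI, which suits applications where false alarms are costly.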

Vehicle Detection

DAVE: A Unified Framework for Fast Vehicle Detection and Annotation

intro: ECCV 2016
arxiv: http://arxiv.org/abs/1607.04564

Evolving Boxes for fast Vehicle Detection

arxiv: https://arxiv.org/abs/1702.00254

Traffic-Sign Detection

Traffic-Sign Detection and Classification in the Wild

project page(code+dataset): http://cg.cs.tsinghua.edu.cn/traffic-sign/ paper: http://120.52.73.11/www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Zhu_Traffic-Sign_Detection_and_CVPR_2016_paper.pdf code & model: http://cg.cs.tsinghua.edu.cn/traffic-sign/data_model_code/newdata0411.zip

Boundary / Edge / Contour Detection

Holistically-Nested Edge Detection

intro: ICCV 2015, Marr Prize
paper: http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Xie_Holistically-Nested_Edge_Detection_ICCV_2015_paper.pdf arxiv: http://arxiv.org/abs/1504.06375 github: https://github.com/s9xie/hed

Unsupervised Learning of Edges

intro: CVPR 2016. Facebook AI Research
arxiv: http://arxiv.org/abs/1511.04166
zn-blog: http://www.leiphone.com/news/201607/b1trsg9j6GSMnjOP.html

Pushing the Boundaries of Boundary Detection using Deep Learning

arxiv: http://arxiv.org/abs/1511.07386

Convolutional Oriented Boundaries

intro: ECCV 2016
arxiv: http://arxiv.org/abs/1608.02755

Convolutional Oriented Boundaries: From Image Segmentation to High-Level Tasks

project page: http://www.vision.ee.ethz.ch/~cvlsegmentation/
arxiv: https://arxiv.org/abs/1701.04658
github: https://github.com/kmaninis/COB

Richer Convolutional Features for Edge Detection

intro: richer convolutional features (RCF)
arxiv: https://arxiv.org/abs/1612.02103

Skeleton Detection

Object Skeleton Extraction in Natural Images by Fusing Scale-associated Deep Side Outputs

arxiv: http://arxiv.org/abs/1603.09446
github: https://github.com/zeakey/DeepSkeleton

DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images

arxiv: http://arxiv.org/abs/1609.03659

SRN: Side-output Residual Network for Object Symmetry Detection in the Wild

intro: CVPR 2017
arxiv: https://arxiv.org/abs/1703.02243
github: https://github.com/KevinKecc/SRN

Fruit Detection

Deep Fruit Detection in Orchards

arxiv: https://arxiv.org/abs/1610.03677

Image Segmentation for Fruit Detection and Yield Estimation in Apple Orchards

intro: The Journal of Field Robotics in May 2016
project page: http://confluence.acfr.usyd.edu.au/display/AGPub/
arxiv: https://arxiv.org/abs/1610.08120

Part Detection

Objects as context for part detection

arxiv: https://arxiv.org/abs/1703.09529

Others

Deep Deformation Network for Object Landmark Localization

arxiv: http://arxiv.org/abs/1605.01014

Fashion Landmark Detection in the Wild

intro: ECCV 2016
project page: http://personal.ie.cuhk.edu.hk/~lz013/projects/FashionLandmarks.html
arxiv: http://arxiv.org/abs/1608.03049
github(Caffe): https://github.com/liuziwei7/fashion-landmarks

Deep Learning for Fast and Accurate Fashion Item Detection

intro: Kuznech Inc.
intro: MultiBox and Fast R-CNN
paper: https://kddfashion2016.mybluemix.net/kddfashion_finalSubmissions/Deep%20Learning%20for%20Fast%20and%20Accurate%20Fashion%20Item%20Detection.pdf

OSMDeepOD - OSM and Deep Learning based Object Detection from Aerial Imagery (formerly known as “OSM-Crosswalk-Detection”)

github: https://github.com/geometalab/OSMDeepOD

Selfie Detection by Synergy-Constraint Based Convolutional Neural Network

intro: IEEE SITIS 2016
arxiv: https://arxiv.org/abs/1611.04357

Associative Embedding: End-to-End Learning for Joint Detection and Grouping

arxiv: https://arxiv.org/abs/1611.05424

Deep Cuboid Detection: Beyond 2D Bounding Boxes

intro: CMU & Magic Leap
arxiv: https://arxiv.org/abs/1611.10010

Automatic Model Based Dataset Generation for Fast and Accurate Crop and Weeds Detection

arxiv: https://arxiv.org/abs/1612.03019

Deep Learning Logo Detection with Data Expansion by Synthesising Context

arxiv: https://arxiv.org/abs/1612.09322

Pixel-wise Ear Detection with Convolutional Encoder-Decoder Networks

arxiv: https://arxiv.org/abs/1702.00307

Automatic Handgun Detection Alarm in Videos Using Deep Learning

arxiv: https://arxiv.org/abs/1702.05147
results: https://github.com/SihamTabik/Pistol-Detection-in-Videos

Object Proposal

DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers

arxiv: http://arxiv.org/abs/1510.04445
github: https://github.com/aghodrati/deepproposal

Scale-aware Pixel-wise Object Proposal Networks

intro: IEEE Transactions on Image Processing
arxiv: http://arxiv.org/abs/1601.04798

Attend Refine Repeat: Active Box Proposal Generation via In-Out Localization

intro: BMVC 2016. AttractioNet
arxiv: https://arxiv.org/abs/1606.04446
github: https://github.com/gidariss/AttractioNet

Learning to Segment Object Proposals via Recursive Neural Networks

arxiv: https://arxiv.org/abs/1612.01057

Learning Detection with Diverse Proposals

intro: CVPR 2017
keywords: differentiable Determinantal Point Process (DPP) layer, Learning Detection with Diverse Proposals (LDDP)
arxiv: https://arxiv.org/abs/1704.03533

ScaleNet: Guiding Object Proposal Generation in Supermarkets and Beyond

keywords: product detection
arxiv: https://arxiv.org/abs/1704.06752

Improving Small Object Proposals for Company Logo Detection

intro: ICMR 2017
arxiv: https://arxiv.org/abs/1704.08881

Localization

Beyond Bounding Boxes: Precise Localization of Objects in Images

intro: PhD Thesis
homepage: http://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-193.html
phd-thesis: http://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-193.pdf
github(“SDS using hypercolumns”): https://github.com/bharath272/sds

Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning

arxiv: http://arxiv.org/abs/1503.00949

Weakly Supervised Object Localization Using Size Estimates

arxiv: http://arxiv.org/abs/1608.04314

Active Object Localization with Deep Reinforcement Learning

intro: ICCV 2015
keywords: Markov Decision Process
arxiv: https://arxiv.org/abs/1511.06015

Localizing objects using referring expressions

intro: ECCV 2016
keywords: LSTM, multiple instance learning (MIL)
paper: http://www.umiacs.umd.edu/~varun/files/refexp-ECCV16.pdf
github: https://github.com/varun-nagaraja/referring-expressions

LocNet: Improving Localization Accuracy for Object Detection

arxiv: http://arxiv.org/abs/1511.07763
github: https://github.com/gidariss/LocNet

Learning Deep Features for Discriminative Localization

homepage: http://cnnlocalization.csail.mit.edu/
arxiv: http://arxiv.org/abs/1512.04150
github(Tensorflow): https://github.com/jazzsaxmafia/Weakly_detector
github: https://github.com/metalbubble/CAM
github: https://github.com/tdeboissiere/VGG16CAM-keras

ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Localization

intro: ECCV 2016
project page: http://www.di.ens.fr/willow/research/contextlocnet/
arxiv: http://arxiv.org/abs/1609.04331
github: https://github.com/vadimkantorov/contextlocnet

Tutorials / Talks

Convolutional Feature Maps: Elements of efficient (and accurate) CNN-based object detection

slides: http://research.microsoft.com/en-us/um/people/kahe/iccv15tutorial/iccv2015_tutorial_convolutional_feature_maps_kaiminghe.pdf

Towards Good Practices for Recognition & Detection

intro: Hikvision Research Institute. Supervised Data Augmentation (SDA)
slides: http://image-net.org/challenges/talks/2016/Hikvision_at_ImageNet_2016.pdf

Projects

TensorBox: a simple framework for training neural networks to detect objects in images

intro: “The basic model implements the simple and robust GoogLeNet-OverFeat algorithm. We additionally provide an implementation of the ReInspect algorithm”
github: https://github.com/Russell91/TensorBox

Object detection in torch: Implementation of some object detection frameworks in torch

github: https://github.com/fmassa/object-detection.torch

Using DIGITS to train an Object Detection network

github: https://github.com/NVIDIA/DIGITS/blob/master/examples/object-detection/README.md

FCN-MultiBox Detector

intro: Full convolution MultiBox Detector (like SSD) implemented in Torch.
github: https://github.com/teaonly/FMD.torch

KittiBox: A car detection model implemented in Tensorflow.

keywords: MultiNet
intro: KittiBox is a collection of scripts to train our model FastBox on the Kitti Object Detection Dataset
github: https://github.com/MarvinTeichmann/KittiBox

Tools

BeaverDam: Video annotation tool for deep learning training labels

github: https://github.com/antingshen/BeaverDam

Blogs

Convolutional Neural Networks for Object Detection

blog: http://rnd.azoft.com/convolutional-neural-networks-object-detection/

Introducing automatic object detection to visual search (Pinterest)

keywords: Faster R-CNN
blog: https://engineering.pinterest.com/blog/introducing-automatic-object-detection-visual-search
demo: https://engineering.pinterest.com/sites/engineering/files/Visual%20Search%20V1%20-%20Video.mp4
review: https://news.developer.nvidia.com/pinterest-introduces-the-future-of-visual-search/?mkt_tok=eyJpIjoiTnpaa01UWXpPRE0xTURFMiIsInQiOiJJRjcybjkwTmtmallORUhLOFFFODBDclFqUlB3SWlRVXJXb1MrQ013TDRIMGxLQWlBczFIeWg0TFRUdnN2UHY2ZWFiXC9QQVwvQzBHM3B0UzBZblpOSmUyU1FcLzNPWXI4cml2VERwTTJsOFwvOEk9In0%3D

Deep Learning for Object Detection with DIGITS

blog: https://devblogs.nvidia.com/parallelforall/deep-learning-object-detection-digits/

Analyzing The Papers Behind Facebook’s Computer Vision Approach

keywords: DeepMask, SharpMask, MultiPathNet
blog: https://adeshpande3.github.io/adeshpande3.github.io/Analyzing-the-Papers-Behind-Facebook’s-Computer-Vision-Approach/

Easily Create High Quality Object Detectors with Deep Learning

intro: dlib v19.2
blog: http://blog.dlib.net/2016/10/easily-create-high-quality-object.html

How to Train a Deep-Learned Object Detection Model in the Microsoft Cognitive Toolkit

blog: https://blogs.technet.microsoft.com/machinelearning/2016/10/25/how-to-train-a-deep-learned-object-detection-model-in-cntk/
github: https://github.com/Microsoft/CNTK/tree/master/Examples/Image/Detection/FastRCNN

Object Detection in Satellite Imagery, a Low Overhead Approach

part 1: https://medium.com/the-downlinq/object-detection-in-satellite-imagery-a-low-overhead-approach-part-i-cbd96154a1b7#.2csh4iwx9
part 2: https://medium.com/the-downlinq/object-detection-in-satellite-imagery-a-low-overhead-approach-part-ii-893f40122f92#.f9b7dgf64

You Only Look Twice - Multi-Scale Object Detection in Satellite Imagery With Convolutional Neural Networks

part 1: https://medium.com/the-downlinq/you-only-look-twice-multi-scale-object-detection-in-satellite-imagery-with-convolutional-neural-38dad1cf7571#.fmmi2o3of
part 2: https://medium.com/the-downlinq/you-only-look-twice-multi-scale-object-detection-in-satellite-imagery-with-convolutional-neural-34f72f659588#.nwzarsz1t

Faster R-CNN Pedestrian and Car Detection

blog: https://bigsnarf.wordpress.com/2016/11/07/faster-r-cnn-pedestrian-and-car-detection/
ipn: https://gist.github.com/bigsnarfdude/2f7b2144065f6056892a98495644d3e0#file-demo_faster_rcnn_notebook-ipynb
github: https://github.com/bigsnarfdude/Faster-RCNN_TF

Small U-Net for vehicle detection

blog: https://medium.com/@vivek.yadav/small-u-net-for-vehicle-detection-9eec216f9fd6#.md4u80kad

Region of interest pooling explained

blog: https://deepsense.io/region-of-interest-pooling-explained/
github: https://github.com/deepsense-io/roi-pooling
