
[Deep Learning Paper Notes][Image Classification] A Reading Guide to Image Classification Papers

2016-10-11 08:44
[ImageNet]

• Over 15M labeled high resolution images.

• Roughly 22k categories.
• Collected from web and labeled by Amazon Mechanical Turk.

[ILSVRC (ImageNet Large-Scale Visual Recognition Challenge)]

• Annual competition of image classification at large scale.

• 1.2M training images, 50k validation images, and 150k testing images.

• 1k categories.

• Resolution of each image varies.

• Classification: make 5 guesses about the image label (top-5 error).
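
The top-5 criterion above can be sketched in a few lines of Python. This is a minimal illustration, not the official ILSVRC evaluation code: the function name `top_k_error` and the toy scores are assumptions made for the example.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of images whose true label is not among the k highest-scoring classes."""
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


# Toy data (hypothetical): 3 images, 6 classes.
scores = [
    [0.1, 0.9, 0.3, 0.2, 0.5, 0.4],
    [0.8, 0.1, 0.2, 0.3, 0.4, 0.5],
    [0.6, 0.5, 0.4, 0.3, 0.2, 0.1],
]
labels = [1, 4, 5]

top5 = top_k_error(scores, labels, k=5)  # only the third image misses
top1 = top_k_error(scores, labels, k=1)  # only the first image is right
```

With five guesses per image, a prediction counts as correct whenever the true label appears anywhere in the five, so the top-5 error is never higher than the top-1 error.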

[Architectures] See Tab. 1.

AlexNet

• Deeper, bigger than LeNet.

• Featured conv layers stacked directly on top of each other (previously it was common for every conv layer to be immediately followed by a pool layer).

• First prominent use of ReLU in a large-scale CNN.

• Heavy data augmentation.
• Dropout.

ZFNet

• Improvement on AlexNet by tweaking the architecture hyperparameters.

• conv1: change from (11 × 11, s4) to (7 × 7, s2).

• conv3,4,5: instead of 384, 384, 256 filters use 512, 1024, 512.
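
To see what the conv1 change means for the feature map, the standard output-size formula (W - F + 2P) / S + 1 can be evaluated for both settings. The sketch below assumes a 227 × 227 input and no padding; both values are illustrative assumptions, not taken from the notes above.

```python
def conv_output_size(input_size, filter_size, stride, padding=0):
    """Spatial output size of a conv layer: (W - F + 2P) / S + 1."""
    span = input_size + 2 * padding - filter_size
    if span % stride != 0:
        raise ValueError("filter does not tile the input evenly with this stride")
    return span // stride + 1


INPUT_SIZE = 227  # assumed input resolution, for illustration only

alexnet_conv1 = conv_output_size(INPUT_SIZE, filter_size=11, stride=4)  # 55
zfnet_conv1 = conv_output_size(INPUT_SIZE, filter_size=7, stride=2)     # 111
```

Under these assumptions the smaller filter and stride roughly double the spatial resolution of the first feature map (55 versus 111 per side), so less fine-grained detail is discarded in the first layer.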

GoogLeNet

• Inception module that dramatically reduces the number of parameters in the network (about 4M, compared with about 60M for AlexNet).

• Use global average pooling instead of fc.
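
A rough sketch of why global average pooling saves parameters: it collapses each channel to a single number, so the classifier that follows sees C inputs rather than H × W × C. The 7 × 7 × 1024 feature-map shape and the 1000-way linear classifier below are assumptions chosen for illustration.

```python
import numpy as np


def global_average_pool(feature_maps):
    """Average every channel over its spatial positions: (H, W, C) -> (C,)."""
    return feature_maps.mean(axis=(0, 1))


# Assumed shapes, for illustration only.
h, w, c, num_classes = 7, 7, 1024, 1000

features = np.random.default_rng(0).standard_normal((h, w, c))
pooled = global_average_pool(features)  # shape (1024,)

# Weights of a linear classifier on the flattened map vs. on the pooled vector.
fc_weights_on_flattened = h * w * c * num_classes   # 50,176,000
classifier_weights_after_gap = c * num_classes      # 1,024,000
```

The pooling step itself has no learnable parameters; in this sketch the saving comes entirely from the 49-fold smaller input to the final classifier.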

VGGNet

• Depth of the network is a critical component for good performance.

• 3 × 3 conv and 2 × 2 pool only.

• More parameters (138M).
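
The 138M figure can be reproduced by counting weights and biases layer by layer. The sketch below assumes the 16-layer VGG variant (13 conv layers of 3 × 3 filters followed by three fc layers) on a 224 × 224 input; the layer widths are written out by hand rather than read from any library.

```python
def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus biases of one conv layer."""
    return kernel * kernel * in_channels * out_channels + out_channels


def fc_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


# Output channels of the 13 conv layers (assumed 16-layer configuration).
conv_channels = [64, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512, 512]

conv_total = 0
in_channels = 3  # RGB input
for out_channels in conv_channels:
    conv_total += conv_params(in_channels, out_channels)
    in_channels = out_channels

# Five 2 x 2 pools shrink 224 to 7, so the first fc layer sees 7 * 7 * 512 inputs.
fc_sizes = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]
fc_total = sum(fc_params(i, o) for i, o in fc_sizes)

total_params = conv_total + fc_total  # 138,357,544
```

Under these assumptions the conv layers account for about 14.7M parameters and the fc layers for about 123.6M, so most of the parameter count sits in the fully connected layers, with the first one alone holding over 100M.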

ResNet

• Skip connections.

• Heavy use of BN.
• Xavier/2 initialization (He et al.): weights drawn with variance 2 / fan_in.
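
The skip connection and the initialization can be sketched together in NumPy. This is a simplified illustration, not the architecture from the paper: dense layers stand in for the conv layers, BN is omitted, and all names (`he_normal`, `residual_block`) are made up for the example.

```python
import numpy as np


def he_normal(fan_in, fan_out, rng):
    """Gaussian weights with variance 2 / fan_in (the "Xavier/2" scheme)."""
    return rng.standard_normal((fan_in, fan_out)) * np.sqrt(2.0 / fan_in)


def relu(x):
    return np.maximum(x, 0.0)


def residual_block(x, w1, w2):
    """y = relu(x + F(x)) with F(x) = relu(x @ w1) @ w2 (dense stand-in for convs)."""
    residual = relu(x @ w1) @ w2
    return relu(x + residual)  # the skip connection adds the input back


rng = np.random.default_rng(0)
dim = 64
x = rng.standard_normal((8, dim))  # batch of 8 hypothetical feature vectors

y = residual_block(x, he_normal(dim, dim, rng), he_normal(dim, dim, rng))

# With all-zero weights the residual branch vanishes and the block passes
# its input straight through (up to the final ReLU).
zeros = np.zeros((dim, dim))
identity_out = residual_block(x, zeros, zeros)
```

Because the block only has to learn a correction to its input, a layer that is not yet useful can fall back to the identity, which is one common explanation for why very deep stacks of such blocks remain trainable.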
