
Caffe: Building Neural Networks and Setting Parameters

2017-05-23 13:59

Reposted from: http://www.cnblogs.com/denny402/p/5070928.html

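To run Caffe, you first need to create a model, such as the commonly used LeNet or AlexNet. A model consists of multiple layers, and each layer in turn has many parameters. All of these parameters are defined in the file caffe.proto. To use Caffe proficiently, the most important skill is learning to write the configuration files (prototxt).

There are many types of layers, such as Data, Convolution, and Pooling. Data flows between layers in the form of Blobs.

Today we start with the data layer.

The data layer is the bottom layer of every model and serves as the model's entry point. It not only provides the data input but can also convert data from Blobs into other formats for saving as output. Data preprocessing (such as mean subtraction, scaling, cropping, and mirroring) is usually configured through parameters in this layer as well.

Data can come from efficient databases (such as LevelDB and LMDB) or directly from memory. If efficiency is not a major concern, data can also come from HDF5 files or image files on disk.

All data layers share a set of common parameters. Here is an example first: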

layer {
name: "cifar"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mean_file: "examples/cifar10/mean.binaryproto"
}
data_param {
source: "examples/cifar10/cifar10_train_lmdb"
batch_size: 100
backend: LMDB
}
}

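name: the name of the layer; any name may be chosen.

type: the layer type. Data means the data comes from LevelDB or LMDB. The data layer type differs depending on the data source (described in detail below). For practice we usually use LevelDB or LMDB data, so the layer type is set to Data.

top or bottom: each layer takes its input through bottom and produces its output through top. A layer with only top and no bottom has only output and no input, and vice versa. Multiple top or bottom entries mean the layer has multiple input or output blobs.

data and label: a data layer has at least one top named data. If there is a second top, it is usually named label. This (data, label) pairing is required for classification models.

include: the layers of a model usually differ between training and testing. Whether a layer belongs to the training phase or the testing phase is specified with include. If there is no include parameter, the layer is present in both the training model and the testing model.

Transformations: data preprocessing, which can map the data into a defined range. For example, setting scale to 0.00390625, which is exactly 1/256 (close to 1/255), rescales input values from the range 0-255 to approximately 0-1.

Other preprocessing is also configured here: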

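transform_param {
scale: 0.00390625
mean_file: "examples/cifar10/mean.binaryproto"
# use a mean file to perform mean subtraction
mirror: 1  # 1 enables mirroring, 0 disables it; true and false may also be used
# crop a 227*227 patch: cropped at a random position during training, from the center during testing
crop_size: 227
}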
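The arithmetic behind these settings can be sketched in plain Python. This is only an illustration, not Caffe's own code: the function names are hypothetical, the 256-pixel image size is an assumed example value, and the sketch assumes the mean is subtracted before the result is multiplied by scale.

```python
# Sketch of the arithmetic configured by transform_param (illustrative only).
# Assumption: the mean is subtracted first, then the result is scaled.

def transform_pixel(value, mean=0.0, scale=1.0):
    """Return (value - mean) * scale for a single pixel value."""
    return (value - mean) * scale


def center_crop_offset(size, crop_size):
    """Top/left offset of a centered crop along one axis."""
    if crop_size > size:
        raise ValueError("crop_size must not exceed the image size")
    return (size - crop_size) // 2


def mirror_row(row):
    """Horizontally flip one row of pixels."""
    return row[::-1]


scale = 0.00390625
print(scale == 1 / 256)                    # True
print(transform_pixel(255, scale=scale))   # 0.99609375
print(center_crop_offset(256, 227))        # 14 (256 is an assumed image size)
print(mirror_row([1, 2, 3]))               # [3, 2, 1]
```

With scale set to 1/256, the largest input value 255 maps to just under 1, which is why the range is only approximately 0-1.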

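The data_param section that follows is set differently depending on the data source.

1. Data from a database (such as LevelDB or LMDB)

Layer type: Data

Required parameters:

source: the name of the directory containing the database, e.g. examples/mnist/mnist_train_lmdb

batch_size: the number of samples processed at a time, e.g. 64

Optional parameters:

rand_skip: skip some inputs at the start. This is usually useful for asynchronous SGD.

backend: selects LevelDB or LMDB; the default is LevelDB.

Example: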

layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
scale: 0.00390625
}
data_param {
source: "examples/mnist/mnist_train_lmdb"
batch_size: 64
backend: LMDB
}
}
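To make the meaning of batch_size concrete, the following small sketch computes how many iterations are needed to pass over a dataset once. The dataset size of 60000 is an assumption used only for illustration (the text above does not state it), and the function name is hypothetical.

```python
import math


def iterations_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once (last batch may be partial)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(num_samples / batch_size)


# 60000 is an assumed training-set size, not a value taken from the text.
print(iterations_per_epoch(60000, 64))   # 938
print(iterations_per_epoch(60000, 100))  # 600
```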

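2. Data from memory

Layer type: MemoryData

Required parameters:

batch_size: the number of samples processed at a time, e.g. 2

channels: the number of channels

height: the height

width: the width

Example: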

layer {
top: "data"
top: "label"
name: "memory_data"
type: "MemoryData"
memory_data_param{
batch_size: 2
height: 100
width: 100
channels: 1
}
transform_param {
scale: 0.0078125
mean_file: "mean.proto"
mirror: false
}
}
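The four required parameters together fix the shape of the data blob. The sketch below works out that shape and its size for the values in the example; the helper names are hypothetical, and the 4 bytes per element is an assumption (single-precision floats).

```python
def blob_shape(batch_size, channels, height, width):
    """Shape of the data blob as (N, C, H, W)."""
    return (batch_size, channels, height, width)


def blob_count(shape):
    """Total number of elements in a blob of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count


shape = blob_shape(batch_size=2, channels=1, height=100, width=100)
print(shape)                   # (2, 1, 100, 100)
print(blob_count(shape))       # 20000
print(blob_count(shape) * 4)   # 80000 bytes, assuming 4-byte floats
print(0.0078125 == 1 / 128)    # True: the scale in the example is 1/128
```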

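3. Data from HDF5

Layer type: HDF5Data

Required parameters:

source: the name of the file to read (a text file listing the HDF5 files, as in the example below)

batch_size: the number of samples processed at a time

Example: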

layer {
name: "data"
type: "HDF5Data"
top: "data"
top: "label"
hdf5_data_param {
source: "examples/hdf5_classification/data/train.txt"
batch_size: 10
}
}

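4. Data from images

Layer type: ImageData

Required parameters:

source: the name of a text file in which each line gives an image file name and its label

batch_size: the number of samples processed at a time, i.e. the number of images

Optional parameters:

rand_skip: skip some inputs at the start. This is usually useful for asynchronous SGD.

shuffle: randomly shuffle the order; the default is false

new_height, new_width: if set, the images are resized to this size

Example: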

layer {
name: "data"
type: "ImageData"
top: "data"
top: "label"
transform_param {
mirror: false
crop_size: 227
mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
}
image_data_param {
source: "examples/_temp/file_list.txt"
batch_size: 50
new_height: 256
new_width: 256
}
}
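The source file for ImageData is a plain text list with one image per line. A minimal sketch of reading such a list is shown below; it assumes the file name and the integer label are separated by whitespace, and the image paths in the sample are made up for illustration.

```python
def parse_image_list(text):
    """Parse lines of the form '<image path> <integer label>' into (path, label) pairs."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # Split on the last run of whitespace so paths containing spaces still work.
        path, label = line.rsplit(None, 1)
        entries.append((path, int(label)))
    return entries


# Hypothetical file contents, for illustration only.
sample = "images/cat.jpg 0\nimages/dog.jpg 1\n"
print(parse_image_list(sample))  # [('images/cat.jpg', 0), ('images/dog.jpg', 1)]
```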

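5. Data from windows (here, windows are regions of an image, not the operating system)

Layer type: WindowData

Required parameters:

source: the name of a text file

batch_size: the number of samples processed at a time, i.e. the number of images

Example: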

layer {
name: "data"
type: "WindowData"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: true
crop_size: 227
mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
}
window_data_param {
source: "examples/finetune_pascal_detection/window_file_2007_trainval.txt"
batch_size: 128
fg_threshold: 0.5
bg_threshold: 0.5
fg_fraction: 0.25
context_pad: 16
crop_mode: "warp"
}
}

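Reposted from: http://www.cnblogs.com/lutingting/p/5240629.html

In Caffe, the network structure is given in a prototxt file and consists of a series of layers. Commonly used layers include data loading layers, convolution layers, pooling layers, nonlinearity layers, inner product layers, normalization layers, and loss layers. This part mainly covers the convolution layer.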


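1. Convolution layer overview

Below is a small example of a convolution layer definition (in a .prototxt file):

layer {

name: "conv1"  # name of this layer
type: "Convolution"  # type of this layer; possible types include Convolution, ...
bottom: "data"  # name of the input blob of this layer
top: "conv1"  # name of the output blob of this layer

# parameters for the weights and bias of this layer
param {
lr_mult: 1  # learning rate multiplier for the weights
}
param {
lr_mult: 2  # learning rate multiplier for the bias
}

# parameters for the convolution operation of this layer
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"  # weight initialization method
}
bias_filler {
type: "constant"  # bias initialization method
}
}

}

Note: in Caffe's original proto file, the convolution layer parameters, ConvolutionParameter, are defined as follows: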

message ConvolutionParameter {
optional uint32 num_output = 1; // The number of outputs for the layer
optional bool bias_term = 2 [default = true]; // whether to have bias terms

// Pad, kernel size, and stride are all given as a single value for equal dimensions in all spatial dimensions, or once per spatial dimension.
repeated uint32 pad = 3; // The padding size; defaults to 0
repeated uint32 kernel_size = 4; // The kernel size
repeated uint32 stride = 6; // The stride; defaults to 1
// Factor used to dilate the kernel, (implicitly) zero-filling the resulting holes. (Kernel dilation is sometimes referred to by its use in the algorithme à trous from Holschneider et al. 1987.)
repeated uint32 dilation = 18; // The dilation; defaults to 1

// For 2D convolution only, the *_h and *_w versions may also be used to specify both spatial dimensions.
optional uint32 pad_h = 9 [default = 0]; // The padding height (2D only)
optional uint32 pad_w = 10 [default = 0]; // The padding width (2D only)
optional uint32 kernel_h = 11; // The kernel height (2D only)
optional uint32 kernel_w = 12; // The kernel width (2D only)
optional uint32 stride_h = 13; // The stride height (2D only)
optional uint32 stride_w = 14; // The stride width (2D only)

optional uint32 group = 5 [default = 1]; // The group size for group conv

optional FillerParameter weight_filler = 7; // The filler for the weight
optional FillerParameter bias_filler = 8; // The filler for the bias
enum Engine {
DEFAULT = 0;
CAFFE = 1;
CUDNN = 2;
}
optional Engine engine = 15 [default = DEFAULT];

// The axis to interpret as "channels" when performing convolution.
// Preceding dimensions are treated as independent inputs;
// succeeding dimensions are treated as "spatial".
// With (N, C, H, W) inputs, and axis == 1 (the default), we perform
// N independent 2D convolutions, sliding C-channel (or (C/g)-channels, for
// groups g>1) filters across the spatial axes (H, W) of the input.
// With (N, C, D, H, W) inputs, and axis == 1, we perform
// N independent 3D convolutions, sliding (C/g)-channels
// filters across the spatial axes (D, H, W) of the input.
optional int32 axis = 16 [default = 1];

// Whether to force use of the general ND convolution, even if a specific
// implementation for blobs of the appropriate number of spatial dimensions
// is available. (Currently, there is only a 2D-specific convolution
// implementation; for input blobs with num_axes != 2, this option is
// ignored and the ND implementation will be used.)
optional bool force_nd_im2col = 17 [default = false];
}

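2. Convolution layer parameters

Next, each parameter of the convolution layer is explained in turn.

(By definition, the learned parameters of a convolution layer are the filter values and the bias values; all other parameters are hyperparameters that must be given when the model is defined.)

lr_mult: learning rate multiplier

Placed inside param {}

This multiplier controls the learning rate. During training, the parameters of the layer use this multiplier times the base_lr value from the solver.prototxt configuration file as their learning rate,

that is, learning rate = lr_mult * base_lr

If the layer has two lr_mult entries in the network definition, the first is the learning rate multiplier for the filter weights and the second is the multiplier for the bias (typically, the bias multiplier is twice the weight multiplier).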
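The relation learning rate = lr_mult * base_lr can be checked with a few lines of Python. The base_lr value of 0.01 is an assumed example, not a value from the text, and the function name is hypothetical.

```python
def effective_lr(base_lr, lr_mult):
    """Learning rate actually applied to a parameter blob."""
    return base_lr * lr_mult


base_lr = 0.01  # assumed solver setting, for illustration only
weight_lr = effective_lr(base_lr, lr_mult=1)  # first param block: weights
bias_lr = effective_lr(base_lr, lr_mult=2)    # second param block: bias
print(weight_lr)  # 0.01
print(bias_lr)    # 0.02
```

Setting lr_mult to 0 in this relation gives a learning rate of 0, so the corresponding parameters would not be updated.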

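convolution_param: the other convolution layer parameters

Placed inside convolution_param {}

This section sets the remaining parameters of the convolution layer. Some are required and some are optional (because their default values can be used directly).

Required parameters

num_output: the number of filters in the convolution layer

kernel_size: the size of the filters. Used on its own, this parameter gives the filter equal height and width; in the 2D case the two may also differ, in which case kernel_h and kernel_w are used instead.

Other optional parameters

stride: the stride of the filter; the default is 1. stride_h and stride_w can be used to set the two directions separately.

pad: the amount of padding applied to the input image; the default is 0, i.e. no padding. (Note that padding may introduce some useless information, which may be less suitable when the input image is small.) If set to 2, two pixels are added on each of the four sides; with a 5*5 kernel and a stride of 1, the output then has the same size as the input. pad_h and pad_w can be used to set the two directions separately.

weight_filler: the weight initialization method, used as follows

weight_filler {

type: "xavier"  # xavier is one initialization algorithm; "gaussian" can also be used; the default is "constant", i.e. all zeros

}
bias_filler: the bias initialization method

bias_filler {

type: "xavier"  # xavier is one initialization algorithm; "gaussian" can also be used; the default is "constant", i.e. all zeros

}

bias_term: whether to use a bias term; the default is true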
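How kernel_size, stride, and pad determine the output size, and how num_output determines the number of learned values, can be sketched as follows. The function names are hypothetical, and the 28-pixel, single-channel input is an assumed example.

```python
def conv_output_size(in_size, kernel_size, stride=1, pad=0, dilation=1):
    """Output length along one spatial axis of a convolution."""
    kernel_extent = dilation * (kernel_size - 1) + 1
    return (in_size + 2 * pad - kernel_extent) // stride + 1


def conv_param_count(in_channels, num_output, kernel_h, kernel_w,
                     group=1, bias_term=True):
    """Number of learned values: filter weights plus optional biases."""
    weights = num_output * (in_channels // group) * kernel_h * kernel_w
    biases = num_output if bias_term else 0
    return weights + biases


# Assumed input: 28 pixels wide, 1 channel (illustrative values only).
print(conv_output_size(28, kernel_size=5))           # 24
print(conv_output_size(28, kernel_size=5, pad=2))    # 28, size preserved
print(conv_param_count(1, 20, 5, 5))                 # 520
print(conv_param_count(1, 20, 5, 5, bias_term=False))  # 500
```

The second call shows the case described for pad: a 5*5 kernel with pad 2 and stride 1 leaves the spatial size unchanged.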

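1. Pooling layer overview

Below is a small example of a pooling layer definition (in a .prototxt file):

layer {
name: "pool1"  # name of this layer
type: "Pooling"  # type of this layer
bottom: "norm1"  # input blob of this layer
top: "pool1"  # output blob of this layer

# parameters of this layer
pooling_param {
pool: MAX  # pooling method; the default is MAX, and AVE or STOCHASTIC may also be used
kernel_size: 3  # pooling kernel size; required
stride: 2  # pooling stride; the default is 1 (overlapping windows), but it is usually set to 2
}

}

Note: in Caffe's original proto file, the pooling layer parameters, PoolingParameter, are defined as follows: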

message PoolingParameter {
enum PoolMethod {
MAX = 0;
AVE = 1;
STOCHASTIC = 2;
}
optional PoolMethod pool = 1 [default = MAX]; // The pooling method
// Pad, kernel size, and stride are all given as a single value for equal
// dimensions in height and width or as Y, X pairs.
optional uint32 pad = 4 [default = 0]; // The padding size (equal in Y, X)
optional uint32 pad_h = 9 [default = 0]; // The padding height
optional uint32 pad_w = 10 [default = 0]; // The padding width
optional uint32 kernel_size = 2; // The kernel size (square)
optional uint32 kernel_h = 5; // The kernel height
optional uint32 kernel_w = 6; // The kernel width
optional uint32 stride = 3 [default = 1]; // The stride (equal in Y, X)
optional uint32 stride_h = 7; // The stride height
optional uint32 stride_w = 8; // The stride width
enum Engine {
DEFAULT = 0;
CAFFE = 1;
CUDNN = 2;
}
optional Engine engine = 11 [default = DEFAULT];
// If global_pooling then it will pool over the size of the bottom by doing
// kernel_h = bottom->height and kernel_w = bottom->width
optional bool global_pooling = 12 [default = false];
}
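The effect of kernel_size, stride, and pad on the pooled size can be sketched in Python. This sketch assumes the output size is rounded up (so a final partial window is kept) and that a window starting entirely inside the padding is dropped; the function names are hypothetical and the input sizes are example values.

```python
import math


def pool_output_size(in_size, kernel_size, stride=1, pad=0):
    """Pooled length along one axis, rounding up to keep a final partial window."""
    out = math.ceil((in_size + 2 * pad - kernel_size) / stride) + 1
    # With padding, drop a last window that would start outside the padded input.
    if pad > 0 and (out - 1) * stride >= in_size + pad:
        out -= 1
    return out


def max_pool_1d(values, kernel_size, stride):
    """MAX pooling over a 1D list; assumes stride <= kernel_size and no padding."""
    out_size = pool_output_size(len(values), kernel_size, stride)
    pooled = []
    for i in range(out_size):
        start = i * stride
        window = values[start:start + kernel_size]  # clipped at the end of the input
        pooled.append(max(window))
    return pooled


print(pool_output_size(55, kernel_size=3, stride=2))  # 27
print(pool_output_size(5, kernel_size=2, stride=2))   # 3
print(max_pool_1d([1, 3, 2, 5, 4], kernel_size=2, stride=2))  # [3, 5, 4]
```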

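1. Activation layer overview

Below is a small example of an activation layer definition (in a .prototxt file):

layer {
name: "relu1"  # name of this layer
type: "ReLU"  # activation function type
bottom: "conv1"  # input blob of this layer
top: "conv1"  # output blob of this layer
}

Note: an activation is an element-wise operation, so it can be computed in place to save memory. This is done by giving the bottom blob and the top blob the same name.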

2. Available activation function types

type: "Sigmoid": $f(x) = 1 / (1 + e^{-x})$

type: "ReLU": $f(x) = \max(x, 0)$

type: "AbsVal": $f(x) = |x|$

type: "TanH": $f(x) = (e^{x} - e^{-x}) / (e^{x} + e^{-x})$

type: "BNLL": $f(x) = \log(1 + e^{x})$

type: "Power": $f(x) = (\text{shift} + \text{scale} \cdot x)^{\text{power}}$
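For reference, the six activation functions can be written as scalar Python functions. This is a plain sketch for checking values by hand, not Caffe's implementation and not numerically hardened for very large inputs. BNLL is taken here as log(1 + exp(x)) and Power as (shift + scale * x) ^ power; the default arguments of power_fn are chosen so that it is the identity.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    return max(x, 0.0)


def absval(x):
    return abs(x)


def tanh(x):
    return math.tanh(x)


def bnll(x):
    # Binomial normal log likelihood: log(1 + exp(x))
    return math.log(1.0 + math.exp(x))


def power_fn(x, power=1.0, scale=1.0, shift=0.0):
    return (shift + scale * x) ** power


print(sigmoid(0.0))                                  # 0.5
print(relu(-2.0), relu(3.0))                         # 0.0 3.0
print(absval(-1.5))                                  # 1.5
print(power_fn(3.0, power=2.0, scale=2.0, shift=1.0))  # 49.0
```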

Reference: Caffe tutorial

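This part mainly covers the fully connected layer.

This layer computes the inner product of its input with a weight matrix and adds a bias; unlike the activation layers, it is not an element-wise operation.

1. Fully connected layer overview

Below is a small example of a fully connected layer definition (in a .prototxt file):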

layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 0.1
}
}
}

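2. Fully connected layer parameters

Next, each parameter of the fully connected layer is explained in turn.

(By definition, the learned parameters of a fully connected layer are the weights and the bias; all other parameters are hyperparameters that must be given when the model is defined.)

Note: a fully connected layer is in fact also a kind of convolution layer, one whose kernel is the same size as its input.

lr_mult: learning rate multiplier

Placed inside param {}

This multiplier controls the learning rate. During training, the parameters of the layer use this multiplier times the base_lr value from the solver.prototxt configuration file as their learning rate,

that is, learning rate = lr_mult * base_lr

If the layer has two lr_mult entries in the network definition, the first is the learning rate multiplier for the weights and the second is the multiplier for the bias (typically, the bias multiplier is twice the weight multiplier).

inner_product_param: the other inner product layer parameters

Placed inside inner_product_param {}

This section sets the remaining parameters of the inner product layer. Some are required and some are optional (because their default values can be used directly).

Required parameters

num_output: the number of filters, i.e. the number of outputs of the layer

Other optional parameters

weight_filler: the weight initialization method, used as follows

weight_filler {
type: "xavier"  # xavier is one initialization algorithm; "gaussian" can also be used; the default is "constant", i.e. all zeros
}

bias_filler: the bias initialization method

bias_filler {

type: "xavier"  # xavier is one initialization algorithm; "gaussian" can also be used; the default is "constant", i.e. all zeros

}
bias_term: whether to use a bias term; the default is true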
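What an inner product layer computes for a single input vector can be sketched in a few lines of Python. The function names are hypothetical and the small matrices are made-up values; the input size of 9216 in the last line is an assumption about the shape of pool5, which the text does not state.

```python
def inner_product(x, weights, bias=None):
    """Compute y[i] = sum_j weights[i][j] * x[j] (+ bias[i]) for one input vector."""
    outputs = []
    for i, row in enumerate(weights):
        if len(row) != len(x):
            raise ValueError("each weight row must match the input length")
        total = sum(w * v for w, v in zip(row, x))
        if bias is not None:
            total += bias[i]
        outputs.append(total)
    return outputs


def inner_product_param_count(num_input, num_output, bias_term=True):
    """Number of learned values: one weight per input/output pair plus biases."""
    return num_input * num_output + (num_output if bias_term else 0)


x = [1.0, 2.0, 3.0]
weights = [[1.0, 0.0, -1.0],   # num_output = 2 rows
           [0.5, 0.5, 0.5]]
bias = [1.0, 0.0]
print(inner_product(x, weights, bias))          # [-1.0, 3.0]
print(inner_product_param_count(9216, 4096))    # 37752832 (9216 inputs is assumed)
```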

Reference: Caffe tutorial

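This part mainly covers the loss layer.

1. Loss layer overview

Below is a small example of a loss layer definition (in a .prototxt file):

layer {
name: "loss"
type: "SoftmaxWithLoss"  # type of the loss function
bottom: "pred"  # input blob of the loss function: the predictions of the network
bottom: "label"  # the other input blob of the loss function: the ground-truth labels of the dataset
top: "loss"  # output blob of the loss layer: the loss value of the classifier
}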

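2. Loss function types

Roughly speaking, a loss function measures the error between the estimated values and the true values. Caffe includes the commonly used loss functions; at present the main ones are: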

【Loss drives learning by comparing an output to a target and assigning cost to minimize. The loss itself is computed by the forward pass and the gradient w.r.t. to the loss is computed by the backward pass.】

(1) Softmax: the main choice for multi-class image classification
Layer type: SoftmaxWithLoss

(2) Sum-of-Squares / Euclidean: mainly used for linear regression
Layer type: EuclideanLoss

(3) Hinge / Margin: mainly used for SVM classifiers
Layer type: HingeLoss

(4) Sigmoid Cross-Entropy
Layer type: SigmoidCrossEntropyLoss

(5) Infogain
Layer type: InfogainLoss
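As an illustration of the first two loss types, the sketch below computes a softmax loss for one sample and a Euclidean loss for a small batch. It is a simplified sketch, not Caffe's implementation: the function names are hypothetical, the softmax loss is taken as the negative log of the probability assigned to the true class, and the Euclidean loss is assumed to be the sum of squared differences divided by twice the batch size.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities; subtracting the max avoids overflow."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_with_loss(scores, label):
    """Negative log-probability of the true class for one sample."""
    return -math.log(softmax(scores)[label])


def euclidean_loss(preds, targets):
    """Sum of squared differences over the batch, divided by 2 * batch size."""
    n = len(preds)
    total = 0.0
    for pred, target in zip(preds, targets):
        total += sum((p - t) ** 2 for p, t in zip(pred, target))
    return total / (2.0 * n)


print(softmax([0.0, 0.0]))                          # [0.5, 0.5]
print(euclidean_loss([[1.0, 2.0]], [[0.0, 0.0]]))   # 2.5
# A confident, correct prediction gives a smaller loss than a wrong one.
print(softmax_with_loss([5.0, 0.0], 0) < softmax_with_loss([5.0, 0.0], 1))  # True
```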

Other related references: http://blog.csdn.net/Sun7_She/article/category/3267005/1
