
Caffe Source Code Walkthrough: Understanding How Caffe Layers Work

2017-08-21 19:54
Caffe is a widely used deep learning framework. I have been reading its source code recently, and in this post I'd like to share my own understanding of how layers work.

It takes a bit of patience to read through, but I think the analysis is reasonably clear and there isn't much code.

To carry out the forward and backward passes of a neural network, Caffe needs two ingredients: data and algorithms.

Data first: Caffe defines the Blob class to hold the data used during training. I won't go into it here; maybe in a future post.

Now the algorithms. Deep learning uses many kinds of layers, such as convolution, pooling, and ReLU, so how do you implement the algorithms for all of these layer types? (This post is organized bottom-up.)

Caffe's answer is C++ object-oriented dynamic polymorphism. It first defines a base class, Layer, in layer.hpp,

which includes the following headers:

#include <algorithm>
#include <string>
#include <vector>

#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/layer_factory.hpp"
#include "caffe/proto/caffe.pb.h"
#include "caffe/util/math_functions.hpp"


Notice that blob.hpp is included as well, which means this is where the algorithm side lives.

Let's look at the members of the Layer class (only the important ones):

inline Dtype Forward(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top);

inline void Backward(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down,
    const vector<Blob<Dtype>*>& bottom);


First come two functions: one for the forward pass and one for the backward pass.

Those are the declarations; now the definitions:

template <typename Dtype>
inline Dtype Layer<Dtype>::Forward(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  // Lock during forward to ensure sequential forward
  Lock();
  Dtype loss = 0;
  Reshape(bottom, top);
  switch (Caffe::mode()) {
  case Caffe::CPU:
    Forward_cpu(bottom, top);
    for (int top_id = 0; top_id < top.size(); ++top_id) {
      if (!this->loss(top_id)) { continue; }
      const int count = top[top_id]->count();
      const Dtype* data = top[top_id]->cpu_data();
      const Dtype* loss_weights = top[top_id]->cpu_diff();
      loss += caffe_cpu_dot(count, data, loss_weights);
    }
    break;
  case Caffe::GPU:
    Forward_gpu(bottom, top);
#ifndef CPU_ONLY
    for (int top_id = 0; top_id < top.size(); ++top_id) {
      if (!this->loss(top_id)) { continue; }
      const int count = top[top_id]->count();
      const Dtype* data = top[top_id]->gpu_data();
      const Dtype* loss_weights = top[top_id]->gpu_diff();
      Dtype blob_loss = 0;
      caffe_gpu_dot(count, data, loss_weights, &blob_loss);
      loss += blob_loss;
    }
#endif
    break;
  default:
    LOG(FATAL) << "Unknown caffe mode.";
  }
  Unlock();
  return loss;
}
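
A side note on the loss accumulation in the CPU branch above (my own explanation, not from the post): a top blob counts as a loss output when this->loss(top_id) is non-zero, and Caffe stores that blob's loss weight in its diff, so the caffe_cpu_dot call is simply a weighted sum. In plain code it amounts to something like:

// Illustrative only: what caffe_cpu_dot(count, data, loss_weights) computes.
float weighted_loss(const float* data, const float* loss_weights, int count) {
  float loss = 0;
  for (int i = 0; i < count; ++i) {
    loss += data[i] * loss_weights[i];  // blob value times its loss weight
  }
  return loss;
}

For a typical loss layer the top blob holds a single scalar, so this reduces to loss_weight * loss_value.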

template <typename Dtype>
inline void Layer<Dtype>::Backward(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down,
    const vector<Blob<Dtype>*>& bottom) {
  switch (Caffe::mode()) {
  case Caffe::CPU:
    Backward_cpu(top, propagate_down, bottom);
    break;
  case Caffe::GPU:
    Backward_gpu(top, propagate_down, bottom);
    break;
  default:
    LOG(FATAL) << "Unknown caffe mode.";
  }
}


As you can see, Forward dispatches to Forward_cpu or Forward_gpu depending on Caffe::mode(); likewise, Backward dispatches to Backward_cpu or Backward_gpu.
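
To make the dispatch concrete, here is a minimal usage sketch (my own, not from the original post) of driving a single layer by hand, roughly the way Caffe's unit tests do. The choice of InnerProductLayer and the header path are assumptions for illustration and may differ between Caffe versions:

#include <vector>
#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/layers/inner_product_layer.hpp"  // path varies by Caffe version
using namespace caffe;

int main() {
  Caffe::set_mode(Caffe::CPU);  // makes Forward() take the Caffe::CPU branch
  LayerParameter param;
  param.mutable_inner_product_param()->set_num_output(10);
  InnerProductLayer<float> layer(param);
  Blob<float> in(2, 5, 1, 1);   // a batch of 2 samples with 5 values each
  Blob<float> out;
  std::vector<Blob<float>*> bottom(1, &in), top(1, &out);
  layer.SetUp(bottom, top);     // allocates the weights and reshapes "out"
  // The base-class Forward() below is what dispatches to Forward_cpu().
  float loss = layer.Forward(bottom, top);  // 0 here: not a loss layer
  return 0;
}

Switching the mode to Caffe::GPU before the call would route the same Forward() through Forward_gpu() instead.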



Next, let's look at the CPU and GPU versions of the forward and backward functions; their declarations are also in layer.hpp:

/** @brief Using the CPU device, compute the layer output. */
virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) = 0;
/**
 * @brief Using the GPU device, compute the layer output.
 *        Fall back to Forward_cpu() if unavailable.
 */
virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  // LOG(WARNING) << "Using CPU code as backup.";
  return Forward_cpu(bottom, top);
}

/**
 * @brief Using the CPU device, compute the gradients for any parameters and
 *        for the bottom blobs if propagate_down is true.
 */
virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down,
    const vector<Blob<Dtype>*>& bottom) = 0;
/**
 * @brief Using the GPU device, compute the gradients for any parameters and
 *        for the bottom blobs if propagate_down is true.
 *        Fall back to Backward_cpu() if unavailable.
 */
virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down,
    const vector<Blob<Dtype>*>& bottom) {
  // LOG(WARNING) << "Using CPU code as backup.";
  Backward_cpu(top, propagate_down, bottom);
}


If you see virtual and something clicks, good: these are in fact pure virtual functions! If virtual means nothing to you, please go review C++ object-oriented programming first. Also note that the default Forward_gpu and Backward_gpu simply fall back to the CPU versions, so a layer that only provides CPU code still works when Caffe runs in GPU mode.

This is the key to how different layers implement their own forward and backward propagation. Let's take the convolution layer as an example. Remember that dynamic polymorphism rests on virtual functions and inheritance.

So the convolution layer must inherit from Layer. See for yourself (conv_layer.hpp and base_conv_layer.hpp):

#include <vector>

#include "caffe/blob.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"
#include "caffe/util/im2col.hpp"

namespace caffe {

/**
* @brief Abstract base class that factors out the BLAS code common to
*        ConvolutionLayer and DeconvolutionLayer.
*/
template <typename Dtype>
class BaseConvolutionLayer : public Layer<Dtype> {


// (from conv_layer.hpp) ConvolutionLayer derives from BaseConvolutionLayer;
// its constructor simply forwards the layer parameters up the chain:
explicit ConvolutionLayer(const LayerParameter& param)
    : BaseConvolutionLayer<Dtype>(param) {}


A small aside: the convolution implementation relies on a particular trick (im2col); see this post for the details (cited here):
http://blog.csdn.net/mounty_fsc/article/details/51290446
Because of that, Caffe first defines a BaseConvolutionLayer class, which inherits from Layer, and ConvolutionLayer then inherits from BaseConvolutionLayer.

With the inheritance in place, we can get to the point: to make dynamic polymorphism work, the subclass has to override Forward_cpu, Forward_gpu, and friends. Here are the declarations in conv_layer.hpp:

protected:
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);
  virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);
  virtual inline bool reverse_dimensions() { return false; }
  virtual void compute_output_shape();
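
A quick aside on compute_output_shape(), also declared above: it is where the convolution layer works out its spatial output size. The sketch below is mine, not Caffe's exact code (which loops over all spatial axes), but it shows the standard arithmetic involved:

// Standard convolution output-size formula (illustrative sketch):
//   out = (in + 2*pad - dilated_kernel) / stride + 1,
// where dilated_kernel = dilation * (kernel - 1) + 1.
inline int conv_out_size(int in, int kernel, int pad, int stride, int dilation) {
  const int dilated_kernel = dilation * (kernel - 1) + 1;
  return (in + 2 * pad - dilated_kernel) / stride + 1;
}
// Example: in = 227, kernel = 11, pad = 0, stride = 4, dilation = 1  ->  55.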


Those four functions are declared first; now their definitions (in conv_layer.cpp):

template <typename Dtype>
void ConvolutionLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  const Dtype* weight = this->blobs_[0]->cpu_data();
  for (int i = 0; i < bottom.size(); ++i) {
    const Dtype* bottom_data = bottom[i]->cpu_data();
    Dtype* top_data = top[i]->mutable_cpu_data();
    for (int n = 0; n < this->num_; ++n) {
      this->forward_cpu_gemm(bottom_data + n * this->bottom_dim_, weight,
          top_data + n * this->top_dim_);
      if (this->bias_term_) {
        const Dtype* bias = this->blobs_[1]->cpu_data();
        this->forward_cpu_bias(top_data + n * this->top_dim_, bias);
      }
    }
  }
}

template <typename Dtype>
void ConvolutionLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
  const Dtype* weight = this->blobs_[0]->cpu_data();
  Dtype* weight_diff = this->blobs_[0]->mutable_cpu_diff();
  for (int i = 0; i < top.size(); ++i) {
    const Dtype* top_diff = top[i]->cpu_diff();
    const Dtype* bottom_data = bottom[i]->cpu_data();
    Dtype* bottom_diff = bottom[i]->mutable_cpu_diff();
    // Bias gradient, if necessary.
    if (this->bias_term_ && this->param_propagate_down_[1]) {
      Dtype* bias_diff = this->blobs_[1]->mutable_cpu_diff();
      for (int n = 0; n < this->num_; ++n) {
        this->backward_cpu_bias(bias_diff, top_diff + n * this->top_dim_);
      }
    }
    if (this->param_propagate_down_[0] || propagate_down[i]) {
      for (int n = 0; n < this->num_; ++n) {
        // gradient w.r.t. weight. Note that we will accumulate diffs.
        if (this->param_propagate_down_[0]) {
          this->weight_cpu_gemm(bottom_data + n * this->bottom_dim_,
              top_diff + n * this->top_dim_, weight_diff);
        }
        // gradient w.r.t. bottom data, if necessary.
        if (propagate_down[i]) {
          this->backward_cpu_gemm(top_diff + n * this->top_dim_, weight,
              bottom_diff + n * this->bottom_dim_);
        }
      }
    }
  }
}


These functions in turn call forward_cpu_gemm and backward_cpu_gemm; we won't go into their implementation details here (a rough sketch of the im2col idea behind them follows below). The point is that the convolution layer defines its own Forward_cpu/gpu and Backward_cpu/gpu, and it is in these functions that each layer type implements its forward and backward algorithms. Every other layer type likewise defines its own versions. At this point the infrastructure is in place, but dynamic polymorphism only comes into play at the call site (if I remember correctly, dynamic polymorphism means the function that actually runs is chosen at run time).
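
As promised, here is a rough sketch of the im2col idea behind forward_cpu_gemm (my own illustration, heavily simplified: one input channel, one filter, stride 1, no padding; the post linked earlier covers the real thing). Each k*k input patch is copied into one column of a matrix, and the convolution then becomes a single matrix multiplication:

// Minimal im2col sketch (illustrative only, not Caffe's code).
#include <vector>

void conv_via_im2col(const std::vector<float>& img, int h, int w,
                     const std::vector<float>& kernel, int k,
                     std::vector<float>* out) {
  const int out_h = h - k + 1, out_w = w - k + 1;
  // im2col: "col" is a (k*k) x (out_h*out_w) matrix of unrolled patches.
  std::vector<float> col(k * k * out_h * out_w);
  for (int ky = 0; ky < k; ++ky)
    for (int kx = 0; kx < k; ++kx)
      for (int y = 0; y < out_h; ++y)
        for (int x = 0; x < out_w; ++x)
          col[((ky * k + kx) * out_h + y) * out_w + x] =
              img[(y + ky) * w + (x + kx)];
  // "gemm": kernel (1 x k*k) times col (k*k x out_h*out_w) -> out.
  out->assign(out_h * out_w, 0.f);
  for (int r = 0; r < k * k; ++r)
    for (int c = 0; c < out_h * out_w; ++c)
      (*out)[c] += kernel[r] * col[r * out_h * out_w + c];
}

Caffe does essentially this, but with a BLAS gemm call over all channels and filters, which is why the convolution forward pass above is mostly a call to forward_cpu_gemm.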

Now let's look at Caffe's test() function (in caffe.cpp):

int test() {
  CHECK_GT(FLAGS_model.size(), 0) << "Need a model definition to score.";
  CHECK_GT(FLAGS_weights.size(), 0) << "Need model weights to score.";

  // Set device id and mode
  vector<int> gpus;
  get_gpus(&gpus);
  if (gpus.size() != 0) {
    LOG(INFO) << "Use GPU with device ID " << gpus[0];
#ifndef CPU_ONLY
    cudaDeviceProp device_prop;
    cudaGetDeviceProperties(&device_prop, gpus[0]);
    LOG(INFO) << "GPU device name: " << device_prop.name;
#endif
    Caffe::SetDevice(gpus[0]);
    Caffe::set_mode(Caffe::GPU);
  } else {
    LOG(INFO) << "Use CPU.";
    Caffe::set_mode(Caffe::CPU);
  }
  // Instantiate the caffe net.
  Net<float> caffe_net(FLAGS_model, caffe::TEST);
  caffe_net.CopyTrainedLayersFrom(FLAGS_weights);
  LOG(INFO) << "Running for " << FLAGS_iterations << " iterations.";

  vector<int> test_score_output_id;
  vector<float> test_score;
  float loss = 0;
  for (int i = 0; i < FLAGS_iterations; ++i) {
    float iter_loss;
    const vector<Blob<float>*>& result =
        caffe_net.Forward(&iter_loss);
    loss += iter_loss;
    int idx = 0;
    for (int j = 0; j < result.size(); ++j) {
      const float* result_vec = result[j]->cpu_data();
      for (int k = 0; k < result[j]->count(); ++k, ++idx) {
        const float score = result_vec[k];
        if (i == 0) {
          test_score.push_back(score);
          test_score_output_id.push_back(j);
        } else {
          test_score[idx] += score;
        }
        const std::string& output_name = caffe_net.blob_names()[
            caffe_net.output_blob_indices()[j]];
        LOG(INFO) << "Batch " << i << ", " << output_name << " = " << score;
      }
    }
  }
  loss /= FLAGS_iterations;
  LOG(INFO) << "Loss: " << loss;
  for (int i = 0; i < test_score.size(); ++i) {
    const std::string& output_name = caffe_net.blob_names()[
        caffe_net.output_blob_indices()[test_score_output_id[i]]];
    const float loss_weight = caffe_net.blob_loss_weights()[
        caffe_net.output_blob_indices()[test_score_output_id[i]]];
    std::ostringstream loss_msg_stream;
    const float mean_score = test_score[i] / FLAGS_iterations;
    if (loss_weight) {
      loss_msg_stream << " (* " << loss_weight
                      << " = " << loss_weight * mean_score << " loss)";
    }
    LOG(INFO) << output_name << " = " << mean_score << loss_msg_stream.str();
  }

  return 0;
}


Spot the Forward call in that code: the object it is called on, caffe_net, is an instance of the Net class. Jump straight to its definition (in net.cpp):

template <typename Dtype>
const vector<Blob<Dtype>*>& Net<Dtype>::Forward(Dtype* loss) {
  if (loss != NULL) {
    *loss = ForwardFromTo(0, layers_.size() - 1);
  } else {
    ForwardFromTo(0, layers_.size() - 1);
  }
  return net_output_blobs_;
}
It calls ForwardFromTo; let's look at that definition:

template <typename Dtype>
Dtype Net<Dtype>::ForwardFromTo(int start, int end) {
  CHECK_GE(start, 0);
  CHECK_LT(end, layers_.size());
  Dtype loss = 0;
  for (int i = start; i <= end; ++i) {
    // LOG(ERROR) << "Forwarding " << layer_names_[i];
    Dtype layer_loss = layers_[i]->Forward(bottom_vecs_[i], top_vecs_[i]);
    loss += layer_loss;
    if (debug_info_) { ForwardDebugInfo(i); }
  }
  return loss;
}
Does this look familiar? Take another look at the line layers_[i]->Forward(bottom_vecs_[i], top_vecs_[i]). What is layers_[i]? Exactly: it is a pointer to a Layer!

Now think back: a deep neural network, a Net, is made up of many layers, so having each layer's computation call its own algorithm is exactly what we want. As described above, for any given layer the Forward() function is called first; it is defined in the Layer base class and in turn calls Forward_cpu and friends to do the real work, and those calls resolve to whatever each derived layer has defined. So for a whole Net, every layer ends up running its own algorithm, which gives the correct forward and backward computation. The train() function works the same way; you can trace through the code yourself.
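
To distill the whole mechanism into something that compiles on its own, here is a toy sketch (my own, with everything Caffe-specific stripped away): a base class whose non-virtual Forward() dispatches to a virtual Forward_cpu(), two derived layers that override it, and a "net" that just loops over base-class pointers the way ForwardFromTo does:

#include <cstdio>
#include <memory>
#include <vector>

class ToyLayer {
 public:
  virtual ~ToyLayer() {}
  // Non-virtual wrapper, like Layer<Dtype>::Forward(): shared bookkeeping
  // would live here; the real work is delegated to the virtual function.
  void Forward() { Forward_cpu(); }
 protected:
  virtual void Forward_cpu() = 0;  // each layer type supplies its own algorithm
};

class ToyConvLayer : public ToyLayer {
 protected:
  void Forward_cpu() override { std::printf("convolution forward\n"); }
};

class ToyReluLayer : public ToyLayer {
 protected:
  void Forward_cpu() override { std::printf("relu forward\n"); }
};

int main() {
  // The "net" only ever sees base-class pointers, like Net<Dtype>::layers_.
  std::vector<std::shared_ptr<ToyLayer> > layers;
  layers.push_back(std::make_shared<ToyConvLayer>());
  layers.push_back(std::make_shared<ToyReluLayer>());
  for (std::size_t i = 0; i < layers.size(); ++i) {
    layers[i]->Forward();  // resolved at run time: conv first, then relu
  }
  return 0;
}

This is exactly the shape of what we traced through Caffe: Net::ForwardFromTo loops over layers_, calls the base-class Forward, and the virtual call lands in whichever Forward_cpu (or Forward_gpu) the concrete layer defined.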

That's the end!

Reading source code is not easy. Start from the .h files, understand the overall structure of the project, who references whom and what gets implemented; once the structure is clear, you can try to work out the concrete implementations yourself (that is where the real experts come in).

This write-up is a bit rough, so it takes some patience to read.