Caffe Source Code Analysis 4: Data_layer
2016-12-31 21:33
Reposted from: http://home.cnblogs.com/louyihang-loves-baiyan/
The data layer sits at the very bottom of the network. Its main job is to load data into blobs and feed them into the net. data_layer contains several related classes:
BaseDataLayer
BasePrefetchingDataLayer
DataLayer
DummyDataLayer
HDF5DataLayer
HDF5OutputLayer
ImageDataLayer
MemoryDataLayer
WindowDataLayer
Batch
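Let me first explain how these classes differ.
Layer is the base class, as mentioned in an earlier post. DataLayer inherits from BasePrefetchingDataLayer, BasePrefetchingDataLayer inherits from BaseDataLayer, and BaseDataLayer is a subclass of Layer (see the figure above). There are also two HDF5-related classes: HDF5DataLayer reads data stored in the HDF5 format, and HDF5OutputLayer writes data out in that format.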
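To make the inheritance chain concrete, here is a minimal sketch of it. All class names ending in `Sketch` are hypothetical stand-ins, not the real Caffe classes, and blobs are reduced to a count of top outputs. It assumes, as I understand the Caffe code, that the base class performs the common setup and then calls the subclass hook `DataLayerSetUp`, and that the prefetching layer starts its thread afterwards.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for caffe::Layer.
class LayerSketch {
 public:
  virtual ~LayerSketch() {}
  virtual void LayerSetUp(int num_top) = 0;
};

// Stand-in for BaseDataLayer: common setup, then the subclass hook.
class BaseDataLayerSketch : public LayerSketch {
 public:
  BaseDataLayerSketch() : output_labels_(false) {}
  void LayerSetUp(int num_top) override {
    assert(num_top >= 1);
    output_labels_ = (num_top > 1);  // a second top blob carries the labels
    DataLayerSetUp(num_top);         // subclasses size their own blobs here
  }
  bool output_labels() const { return output_labels_; }
  const std::vector<std::string>& log() const { return log_; }

 protected:
  virtual void DataLayerSetUp(int num_top) = 0;
  bool output_labels_;
  std::vector<std::string> log_;  // records the call order for illustration
};

// Stand-in for BasePrefetchingDataLayer: adds the prefetch step.
class BasePrefetchingDataLayerSketch : public BaseDataLayerSketch {
 public:
  void LayerSetUp(int num_top) override {
    BaseDataLayerSketch::LayerSetUp(num_top);
    log_.push_back("start prefetch thread");
  }
};

// Stand-in for DataLayer: only implements the hook.
class DataLayerSketch : public BasePrefetchingDataLayerSketch {
 protected:
  void DataLayerSetUp(int /*num_top*/) override {
    log_.push_back("DataLayer: reshape top blobs");
  }
};
```

Calling `LayerSetUp(2)` on a `DataLayerSketch` first runs the subclass hook and then the prefetch step, which is why the concrete layers below only need to implement `DataLayerSetUp`.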
Note that the data_layer header itself includes quite a few other headers:
    #include <string>
    #include <utility>
    #include <vector>

    #include "hdf5.h"

    #include "caffe/blob.hpp"
    #include "caffe/common.hpp"
    #include "caffe/data_reader.hpp"
    #include "caffe/data_transformer.hpp"
    #include "caffe/filler.hpp"
    #include "caffe/internal_thread.hpp"
    #include "caffe/layer.hpp"
    #include "caffe/proto/caffe.pb.h"
    #include "caffe/util/blocking_queue.hpp"
    #include "caffe/util/db.hpp"
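hdf5 is the self-describing data format mentioned earlier, used mainly for recording scientific data.
There are also a few data-related headers, such as data_reader.hpp and data_transformer.hpp.
data_reader is responsible for reading data and passing it on to the data layer. A single reading thread is created per source, even when multiple solvers are running in parallel, for example in multi-GPU training. This guarantees that the database is read sequentially.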
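The "one reader per source" idea can be sketched as follows. This is a simplified illustration, not the Caffe implementation: it leaves out the actual thread and the queues, and only shows how several readers opened on the same source can share one reading state, so the records are consumed in order and each consumer sees a different subset. `DataReaderSketch`, `SourceBodySketch` and the source name `train_db` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical per-source reading state: one cursor over the records.
struct SourceBodySketch {
  explicit SourceBodySketch(const std::vector<int>& records)
      : records_(records), cursor_(0) {}
  // Returns the next record and wraps around at the end of the source.
  int Next() {
    assert(!records_.empty());
    const int value = records_[cursor_];
    cursor_ = (cursor_ + 1) % records_.size();
    return value;
  }
  std::vector<int> records_;
  std::size_t cursor_;
};

// Readers created for the same source name share a single body.
class DataReaderSketch {
 public:
  DataReaderSketch(const std::string& source,
                   const std::vector<int>& records) {
    std::map<std::string, std::weak_ptr<SourceBodySketch> >& bodies = Bodies();
    body_ = bodies[source].lock();
    if (!body_) {  // first reader for this source creates the shared state
      body_ = std::make_shared<SourceBodySketch>(records);
      bodies[source] = body_;
    }
  }
  int Next() { return body_->Next(); }

 private:
  // Registry of live bodies, keyed by source name.
  static std::map<std::string, std::weak_ptr<SourceBodySketch> >& Bodies() {
    static std::map<std::string, std::weak_ptr<SourceBodySketch> > bodies;
    return bodies;
  }
  std::shared_ptr<SourceBodySketch> body_;
};
```

With two readers on `train_db`, alternating calls to `Next()` walk through the records once in order instead of each reader restarting from the beginning.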
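The DataTransformer class in data_transformer.hpp applies preprocessing to the input data, such as scaling, mirroring and mean subtraction. It also supports some randomized operations.
Its core functions are listed below. There are 5 commonly used Transform overloads. The second parameter is the same in all of them, namely the destination blob, while the input varies by use case: it can be a blob, an OpenCV Mat, or a Datum as defined in the proto file (or a vector of Mats or Datums).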
    void Transform(const Datum& datum, Blob<Dtype>* transformed_blob);
    void Transform(const vector<Datum> & datum_vector, Blob<Dtype>* transformed_blob);
    void Transform(const vector<cv::Mat> & mat_vector, Blob<Dtype>* transformed_blob);
    void Transform(const cv::Mat& cv_img, Blob<Dtype>* transformed_blob);
    void Transform(Blob<Dtype>* input_blob, Blob<Dtype>* transformed_blob);
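As a rough illustration of what such a Transform does per pixel, the sketch below applies mean subtraction, scaling and an optional horizontal mirror to a single image stored in channel, height, width order. `TransformSketch` is a hypothetical helper operating on a plain `std::vector<float>`. Cropping, per-channel means and the random choice of mirroring are left out, and the order "subtract the mean, then scale" is my understanding of Caffe's behavior rather than something shown in this post.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: (pixel - mean) * scale, with optional horizontal flip.
// The image is a flat buffer in C x H x W order.
std::vector<float> TransformSketch(const std::vector<float>& input,
                                   int channels, int height, int width,
                                   float scale, float mean_value,
                                   bool mirror) {
  assert(static_cast<int>(input.size()) == channels * height * width);
  std::vector<float> output(input.size());
  for (int c = 0; c < channels; ++c) {
    for (int h = 0; h < height; ++h) {
      for (int w = 0; w < width; ++w) {
        const int src = (c * height + h) * width + w;
        // When mirroring, column w is written to column (width - 1 - w).
        const int dst_w = mirror ? (width - 1 - w) : w;
        const int dst = (c * height + h) * width + dst_w;
        output[dst] = (input[src] - mean_value) * scale;
      }
    }
  }
  return output;
}
```

For a 1 x 2 x 2 image `{1, 2, 3, 4}` with `scale = 0.5`, `mean_value = 1` and mirroring on, each row is flipped and the result is `{0.5, 0, 1.5, 1}`.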
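TransformationParameter holds the transformation settings that must be passed to this class's constructor. The settings are defined in the proto file, excerpted below. There are 7 of them in total: scale, mirror, crop_size, mean_file, mean_value, force_color and force_gray.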
    message TransformationParameter {
      optional float scale = 1 [default = 1];
      optional bool mirror = 2 [default = false];
      optional uint32 crop_size = 3 [default = 0];
      optional string mean_file = 4;
      repeated float mean_value = 5;
      optional bool force_color = 6 [default = false];
      optional bool force_gray = 7 [default = false];
    }
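The fields and their defaults can be mirrored in a plain struct, which also makes it easy to show which combinations do not make sense together. This is a sketch under assumptions: `TransformParamSketch` and `ValidateSketch` are hypothetical names, and the three rules checked here (mean_file and mean_value not both set, mean_value having either one entry or one per channel, force_color and force_gray not both set) reflect my understanding of the checks Caffe performs, not text from this post.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical plain-struct mirror of TransformationParameter,
// with the same defaults as the proto definition.
struct TransformParamSketch {
  float scale = 1.0f;
  bool mirror = false;
  unsigned int crop_size = 0;
  std::string mean_file;
  std::vector<float> mean_value;
  bool force_color = false;
  bool force_gray = false;
};

// Returns an empty string if the settings are consistent,
// otherwise a short description of the first problem found.
std::string ValidateSketch(const TransformParamSketch& p, int channels) {
  assert(channels > 0);
  if (!p.mean_file.empty() && !p.mean_value.empty()) {
    return "mean_file and mean_value cannot both be set";
  }
  if (!p.mean_value.empty() && p.mean_value.size() != 1 &&
      static_cast<int>(p.mean_value.size()) != channels) {
    return "mean_value needs 1 entry or one entry per channel";
  }
  if (p.force_color && p.force_gray) {
    return "force_color and force_gray cannot both be set";
  }
  return "";
}
```

A default-constructed `TransformParamSketch` validates cleanly, which matches the proto defaults meaning "no transformation".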
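Within data_layer, the classes at the end of the inheritance chain are ImageDataLayer, DataLayer, WindowDataLayer and MemoryDataLayer. The HDF5 and Dummy layers are not analyzed here for now.
The most important part of each class is its layer setup. Let's first look at DataLayer's DataLayerSetUp: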
    template <typename Dtype>
    void DataLayer<Dtype>::DataLayerSetUp(const vector<Blob<Dtype>*>& bottom,
        const vector<Blob<Dtype>*>& top) {
      const int batch_size = this->layer_param_.data_param().batch_size();
      // Peek at a datum and use it to initialize the top blob.
      Datum& datum = *(reader_.full().peek());

      // Use data_transformer to infer the expected blob shape from the datum.
      vector<int> top_shape = this->data_transformer_->InferBlobShape(datum);
      this->transformed_data_.Reshape(top_shape);
      // Reshape top[0] first, then the prefetch buffers, using the batch size.
      top_shape[0] = batch_size;
      top[0]->Reshape(top_shape);
      for (int i = 0; i < this->PREFETCH_COUNT; ++i) {
        this->prefetch_[i].data_.Reshape(top_shape);
      }
      LOG(INFO) << "output data size: " << top[0]->num() << ","
          << top[0]->channels() << "," << top[0]->height() << ","
          << top[0]->width();
      // Reshape the label blob in the same way.
      if (this->output_labels_) {
        vector<int> label_shape(1, batch_size);
        top[1]->Reshape(label_shape);
        for (int i = 0; i < this->PREFETCH_COUNT; ++i) {
          this->prefetch_[i].label_.Reshape(label_shape);
        }
      }
    }
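The shape logic in this setup function can be sketched without any Caffe types. The helpers below are hypothetical. They assume that the inferred shape of a single sample is 1 x channels x height x width, with height and width replaced by crop_size when cropping is enabled, which is how I understand InferBlobShape to behave. The batch shape then only differs in the first dimension.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: shape of one transformed sample (N = 1).
std::vector<int> InferShapeSketch(int channels, int height, int width,
                                  int crop_size) {
  assert(channels > 0 && height > 0 && width > 0 && crop_size >= 0);
  std::vector<int> shape(4);
  shape[0] = 1;
  shape[1] = channels;
  shape[2] = crop_size > 0 ? crop_size : height;  // cropping overrides H
  shape[3] = crop_size > 0 ? crop_size : width;   // cropping overrides W
  return shape;
}

// Hypothetical helper: same shape with the batch size in dimension 0,
// as done by "top_shape[0] = batch_size" in DataLayerSetUp.
std::vector<int> BatchShapeSketch(std::vector<int> sample_shape,
                                  int batch_size) {
  assert(batch_size > 0 && !sample_shape.empty());
  sample_shape[0] = batch_size;
  return sample_shape;
}
```

For a 3 x 256 x 256 datum with `crop_size = 227` and a batch size of 64, the single-sample shape is `{1, 3, 227, 227}` and the top blob shape is `{64, 3, 227, 227}`. The label blob is simply a vector of length 64.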
Next, MemoryDataLayer's DataLayerSetUp:
    template <typename Dtype>
    void MemoryDataLayer<Dtype>::DataLayerSetUp(const vector<Blob<Dtype>*>& bottom,
        const vector<Blob<Dtype>*>& top) {
      // Set the layer parameters directly from memory_data_param.
      batch_size_ = this->layer_param_.memory_data_param().batch_size();
      channels_ = this->layer_param_.memory_data_param().channels();
      height_ = this->layer_param_.memory_data_param().height();
      width_ = this->layer_param_.memory_data_param().width();
      size_ = channels_ * height_ * width_;
      CHECK_GT(batch_size_ * size_, 0) <<
          "batch_size, channels, height, and width must be specified and"
          " positive in memory_data_param";
      // As in DataLayer, set up top[0] first, then reshape the label blob.
      vector<int> label_shape(1, batch_size_);
      top[0]->Reshape(batch_size_, channels_, height_, width_);
      top[1]->Reshape(label_shape);
      added_data_.Reshape(batch_size_, channels_, height_, width_);
      added_label_.Reshape(label_shape);
      data_ = NULL;
      labels_ = NULL;
      added_data_.cpu_data();
      added_label_.cpu_data();
    }
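The `size_` value computed above is the number of elements in one sample, so batch k of a user-supplied buffer starts at element `k * batch_size_ * size_`. The sketch below illustrates that indexing with a hypothetical `MemoryBatchCursorSketch` class. The requirement that the number of samples be a multiple of the batch size, and the wrap-around after the last batch, are assumptions based on my reading of MemoryDataLayer and are not shown in the excerpt above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical cursor over n samples stored contiguously in user memory.
class MemoryBatchCursorSketch {
 public:
  MemoryBatchCursorSketch(int batch_size, int channels, int height, int width)
      : batch_size_(batch_size),
        size_(channels * height * width),  // elements per sample
        n_(0),
        pos_(0) {
    assert(batch_size_ * size_ > 0);  // same sanity check as the setup code
  }

  // Point the cursor at a buffer holding n samples.
  void Reset(int n) {
    assert(n > 0 && n % batch_size_ == 0);  // assumed: whole batches only
    n_ = n;
    pos_ = 0;
  }

  // Returns the element offset of the next batch, then advances and wraps.
  std::size_t NextBatchOffset() {
    assert(n_ > 0);
    const std::size_t offset = static_cast<std::size_t>(pos_) * size_;
    pos_ = (pos_ + batch_size_) % n_;
    return offset;
  }

 private:
  int batch_size_;
  int size_;
  int n_;
  int pos_;
};
```

With a batch size of 2, samples of 1 x 2 x 2 and 4 samples in memory, the offsets are 0, 8 and then 0 again.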
Then ImageDataLayer and its DataLayerSetUp function:
    template <typename Dtype>
    void ImageDataLayer<Dtype>::DataLayerSetUp(const vector<Blob<Dtype>*>& bottom,
        const vector<Blob<Dtype>*>& top) {
      const int new_height = this->layer_param_.image_data_param().new_height();
      const int new_width  = this->layer_param_.image_data_param().new_width();
      const bool is_color  = this->layer_param_.image_data_param().is_color();
      string root_folder = this->layer_param_.image_data_param().root_folder();

      CHECK((new_height == 0 && new_width == 0) ||
          (new_height > 0 && new_width > 0)) << "Current implementation requires "
          "new_height and new_width to be set at the same time.";
      // Read the list of image files and their labels.
      const string& source = this->layer_param_.image_data_param().source();
      LOG(INFO) << "Opening file " << source;
      std::ifstream infile(source.c_str());
      string filename;
      int label;
      while (infile >> filename >> label) {
        lines_.push_back(std::make_pair(filename, label));
      }

      if (this->layer_param_.image_data_param().shuffle()) {
        // randomly shuffle data
        LOG(INFO) << "Shuffling data";
        const unsigned int prefetch_rng_seed = caffe_rng_rand();
        prefetch_rng_.reset(new Caffe::RNG(prefetch_rng_seed));
        ShuffleImages();
      }
      LOG(INFO) << "A total of " << lines_.size() << " images.";

      lines_id_ = 0;
      // Check whether some images should be randomly skipped at the start.
      if (this->layer_param_.image_data_param().rand_skip()) {
        unsigned int skip = caffe_rng_rand() %
            this->layer_param_.image_data_param().rand_skip();
        LOG(INFO) << "Skipping first " << skip << " data points.";
        CHECK_GT(lines_.size(), skip) << "Not enough points to skip";
        lines_id_ = skip;
      }
      // Read one image with OpenCV and use it to initialize the top blob.
      cv::Mat cv_img = ReadImageToCVMat(root_folder + lines_[lines_id_].first,
                                        new_height, new_width, is_color);
      CHECK(cv_img.data) << "Could not load " << lines_[lines_id_].first;
      // Same step as above: use the transformer to infer the shape.
      vector<int> top_shape = this->data_transformer_->InferBlobShape(cv_img);
      this->transformed_data_.Reshape(top_shape);
      // The rest is much the same as before: initialize top[0].
      const int batch_size = this->layer_param_.image_data_param().batch_size();
      CHECK_GT(batch_size, 0) << "Positive batch size required";
      top_shape[0] = batch_size;
      for (int i = 0; i < this->PREFETCH_COUNT; ++i) {
        this->prefetch_[i].data_.Reshape(top_shape);
      }
      top[0]->Reshape(top_shape);

      LOG(INFO) << "output data size: " << top[0]->num() << ","
          << top[0]->channels() << "," << top[0]->height() << ","
          << top[0]->width();
      // Reshape the label blob.
      vector<int> label_shape(1, batch_size);
      top[1]->Reshape(label_shape);
      for (int i = 0; i < this->PREFETCH_COUNT; ++i) {
        this->prefetch_[i].label_.Reshape(label_shape);
      }
    }
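The list-reading part of this function is easy to try in isolation. The sketch below parses "filename label" pairs the same way the `while (infile >> filename >> label)` loop does, but from a string instead of a file, and computes the random-skip start index from a supplied random value. `ReadImageListSketch` and `SkipSketch` are hypothetical helpers, and the file names in the usage note are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: parse whitespace-separated "filename label" pairs.
std::vector<std::pair<std::string, int> > ReadImageListSketch(
    const std::string& text) {
  std::istringstream in(text);
  std::vector<std::pair<std::string, int> > lines;
  std::string filename;
  int label;
  while (in >> filename >> label) {
    lines.push_back(std::make_pair(filename, label));
  }
  return lines;
}

// Hypothetical helper: index of the first image after the random skip.
// rand_value plays the role of caffe_rng_rand(); rand_skip == 0 disables it.
std::size_t SkipSketch(std::size_t num_lines, unsigned int rand_value,
                       unsigned int rand_skip) {
  const std::size_t skip = rand_skip > 0 ? rand_value % rand_skip : 0;
  assert(num_lines > skip);  // mirrors the "Not enough points to skip" check
  return skip;
}
```

Given the text `"cat.jpg 0\ndog.jpg 1\nbird.jpg 2\n"`, the list has three entries, and `SkipSketch(3, 7, 3)` starts at index 1 because 7 % 3 is 1.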
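Finally, WindowDataLayer's DataLayerSetUp. This function is fairly long, so I only cover the main part. The images used by ImageDataLayer above are effectively already cropped, meaning the target object more or less fills the whole frame. A window file, in contrast, is used with the original images, which contain both background and objects. The window file format is as follows: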
window_file format
repeated:
# image_index
img_path (abs path)
channels
height
width
num_windows
class_index overlap x1 y1 x2 y2
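A small parser makes the format easier to follow. This is a sketch only: the struct and function names are hypothetical, it reads from a string rather than a file, and it assumes that class_index and the box coordinates are integers and overlap is a float. Each record starts with `#` and the image index, followed by the path, the three image dimensions, the number of windows, and then one line per window.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// One window: class label, overlap with ground truth, and box corners.
struct WindowSketch {
  int class_index;
  float overlap;
  int x1, y1, x2, y2;
};

// One image record with all of its windows.
struct WindowImageSketch {
  int image_index;
  std::string path;
  int channels, height, width;
  std::vector<WindowSketch> windows;
};

// Hypothetical parser for the window_file format listed above.
std::vector<WindowImageSketch> ParseWindowFileSketch(const std::string& text) {
  std::istringstream in(text);
  std::vector<WindowImageSketch> images;
  std::string hashtag;
  WindowImageSketch img;
  while (in >> hashtag >> img.image_index) {
    assert(hashtag == "#");  // every record starts with "# image_index"
    int num_windows = 0;
    in >> img.path >> img.channels >> img.height >> img.width >> num_windows;
    img.windows.clear();
    for (int i = 0; i < num_windows; ++i) {
      WindowSketch w;
      in >> w.class_index >> w.overlap >> w.x1 >> w.y1 >> w.x2 >> w.y2;
      img.windows.push_back(w);
    }
    assert(!in.fail());  // a truncated record is treated as an error
    images.push_back(img);
  }
  return images;
}
```

For example, a record for a 3 x 480 x 640 image with two windows would list `2` as num_windows, followed by lines such as `1 0.8 10 20 110 220` (an object window) and `0 0.1 5 5 50 50` (a mostly background window).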