Caffe Python Interface in Practice: Annotated Source of the Official net_surgery Tutorial
2017-12-03 10:58
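This post is one of a series of notes annotating the source code of the official documentation.
Note 1: This post annotates the source of an ipynb file under caffe_root/examples/. The aim is to help beginners learn faster through comments on the source. Note 2: The English text of the official tutorial is deliberately left as-is. Beginners should get used to reading tutorials in the original English; it improves the ability to read technical literature and is an essential skill for a senior programmer.
Note 3: It is recommended to run the program in a Jupyter notebook while reading the source comments.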
Net Surgery
Caffe networks can be transformed to your particular needs by editing the model parameters. The data, diffs, and parameters of a net are all exposed in pycaffe. Roll up your sleeves for net surgery with pycaffe!
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# Make sure that caffe is on the python path:
caffe_root = '../'  # this file is expected to be in {caffe_root}/examples
import sys
sys.path.insert(0, caffe_root + 'python')

import caffe

# configure plotting
plt.rcParams['figure.figsize'] = (10, 10)
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'
Designer Filters
To show how to load, manipulate, and save parameters we’ll design our own filters into a simple network that’s only a single convolution layer. This net has two blobs, data for the input and conv for the convolution output, and one parameter, conv, for the convolution filter weights and biases.
# Load the net, list its data and params, and filter an example image.
caffe.set_mode_cpu()
# No pretrained weights are loaded here; the conv layer has 3 filters (num_output = 3).
net = caffe.Net('net_surgery/conv.prototxt', caffe.TEST)
print("blobs {}\nparams {}".format(net.blobs.keys(), net.params.keys()))

# load image and prepare as a single input batch for Caffe
# load_image returns a float H x W x C array scaled to [0, 1]; with color=False
# there is a single channel, and squeeze() removes the length-1 axis.
im = np.array(caffe.io.load_image('images/cat_gray.jpg', color=False)).squeeze()
plt.title("original image")
plt.imshow(im)
plt.axis('off')

im_input = im[np.newaxis, np.newaxis, :, :]  # 1 x 1 x H x W
net.blobs['data'].reshape(*im_input.shape)  # reshape the data blob to the 4-D input shape
net.blobs['data'].data[...] = im_input
The convolution weights are initialized from Gaussian noise while the biases are initialized to zero. These random filters give output somewhat like edge detections.
# helper show filter outputs
def show_filters(net):
    # Forward pass: the loaded image is filtered by the current filters
    # (Gaussian-initialized until the weights are edited).
    net.forward()
    plt.figure()
    filt_min, filt_max = net.blobs['conv'].data.min(), net.blobs['conv'].data.max()
    for i in range(3):  # the conv layer has 3 filters, so the output has 3 channels
        plt.subplot(1,4,i+2)
        plt.title("filter #{} output".format(i))
        # Each output channel is a single-channel image shown with the gray cmap,
        # so no axis transpose is needed. data[0, i] is channel i of image 0 in
        # the batch; vmin and vmax put all three plots on the same color scale.
        plt.imshow(net.blobs['conv'].data[0, i], vmin=filt_min, vmax=filt_max)
        plt.tight_layout()  # keep the subplots compact and non-overlapping
        plt.axis('off')

# filter the image with initial
show_filters(net)
Raising the bias of a filter will correspondingly raise its output:
# pick first filter output
# image 0 of the batch, output channel 0 (the response of filter 0);
# conv0 refers to the blob's data, so it reflects the next forward pass
conv0 = net.blobs['conv'].data[0, 0]
print("pre-surgery output mean {:.2f}".format(conv0.mean()))
# set first filter bias to 1
# params['conv'][1] is the bias blob ([0] is the weights); data[0] is the bias of filter 0
net.params['conv'][1].data[0] = 1.
net.forward()
print("post-surgery output mean {:.2f}".format(conv0.mean()))
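The shift in the mean follows from how a convolution output is formed: each value is a weighted sum over an input patch plus the filter's bias, so adding 1 to the bias adds 1 to every output value. Below is a minimal pure-Python sketch of that arithmetic. The 1-D signal, the kernel, and the helper names are made up for illustration and are not part of Caffe.

```python
def conv1d_valid(signal, kernel, bias):
    """Slide kernel over signal (no padding, stride 1) and add bias to each sum."""
    k = len(kernel)
    return [
        sum(s * w for s, w in zip(signal[i:i + k], kernel)) + bias
        for i in range(len(signal) - k + 1)
    ]


def mean(values):
    return sum(values) / float(len(values))


signal = [0.0, 1.0, 2.0, 1.0, 0.0, 3.0]  # hypothetical input
kernel = [0.5, -1.0, 0.5]                # hypothetical filter weights

before = conv1d_valid(signal, kernel, bias=0.0)
after = conv1d_valid(signal, kernel, bias=1.0)
shift = mean(after) - mean(before)
print("output mean shifts by {:.2f}".format(shift))
```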
Altering the filter weights is more exciting since we can assign any kernel like Gaussian blur, the Sobel operator for edges, and so on. The following surgery turns the 0th filter into a Gaussian blur and the 1st and 2nd filters into the horizontal and vertical gradient parts of the Sobel operator.
See how the 0th output is blurred, the 1st picks up horizontal edges, and the 2nd picks up vertical edges.
# weight shape is (num_output, input channels, kernel height, kernel width),
# so shape[2:] is the spatial size of each filter
ksize = net.params['conv'][0].data.shape[2:]
# make Gaussian blur
sigma = 1.
y, x = np.mgrid[-ksize[0]//2 + 1:ksize[0]//2 + 1, -ksize[1]//2 + 1:ksize[1]//2 + 1]
g = np.exp(-((x**2 + y**2)/(2.0*sigma**2)))
gaussian = (g / g.sum()).astype(np.float32)  # normalized 2-D Gaussian kernel with standard deviation 1
net.params['conv'][0].data[0] = gaussian
# make Sobel operator for edge detection
net.params['conv'][0].data[1:] = 0.
sobel = np.array((-1, -2, -1, 0, 0, 0, 1, 2, 1), dtype=np.float32).reshape((3,3))
# In data[1, 0, 1:-1, 1:-1] the 0 selects the single input channel, and
# 1:-1 selects the central 3 x 3 region of the 5 x 5 kernel.
net.params['conv'][0].data[1, 0, 1:-1, 1:-1] = sobel  # horizontal
net.params['conv'][0].data[2, 0, 1:-1, 1:-1] = sobel.T  # vertical
show_filters(net)
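To see what these kernels do without loading Caffe, the following NumPy sketch rebuilds the 5 × 5 Gaussian and the 3 × 3 Sobel kernel and applies the Sobel pair to a synthetic image containing one horizontal edge. The helper correlate_valid and the test image step are hypothetical names introduced here, and the sketch assumes the kernel is applied without flipping, with no padding and stride 1.

```python
import numpy as np


def correlate_valid(image, kernel):
    """Slide kernel over image without flipping it (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out


# 5 x 5 Gaussian, built the same way as in the cell above
ksize = (5, 5)
sigma = 1.0
y, x = np.mgrid[-ksize[0]//2 + 1:ksize[0]//2 + 1, -ksize[1]//2 + 1:ksize[1]//2 + 1]
g = np.exp(-((x**2 + y**2) / (2.0 * sigma**2)))
gaussian = (g / g.sum()).astype(np.float32)

sobel = np.array((-1, -2, -1, 0, 0, 0, 1, 2, 1), dtype=np.float32).reshape((3, 3))

# Synthetic image: dark top half, bright bottom half, i.e. one horizontal edge
step = np.zeros((8, 8), dtype=np.float32)
step[4:, :] = 1.0

horizontal = correlate_valid(step, sobel)    # responds along the edge rows
vertical = correlate_valid(step, sobel.T)    # no left-to-right change, so no response
print("gaussian sum {:.3f}".format(float(gaussian.sum())))
print("max horizontal response {:.1f}".format(float(horizontal.max())))
print("any vertical response: {}".format(bool(vertical.any())))
```

The Gaussian weights sum to 1, so blurring keeps the overall brightness, while the Sobel weights sum to 0, so flat regions give no response.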
With net surgery, parameters can be transplanted across nets, regularized by custom per-parameter operations, and transformed according to your schemes.
Casting a Classifier into a Fully Convolutional Network
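Converting all of CaffeNet's fully connected layers into convolution layers yields a fully convolutional CaffeNet, which outputs a confidence response map for each class.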
Let’s take the standard Caffe Reference ImageNet model “CaffeNet” and transform it into a fully convolutional net for efficient, dense inference on large inputs. This model generates a classification map that covers a given input size instead of a single classification. In particular an 8 × 8 classification map on a 451 × 451 input gives 64x the output in only 3x the time. The computation exploits a natural efficiency of convolutional network (convnet) structure by amortizing the computation of overlapping receptive fields.
To do so we translate the InnerProduct matrix multiplication layers of CaffeNet into Convolutional layers. This is the only change: the other layer types are agnostic to spatial size. Convolution is translation-invariant, activations are elementwise operations, and so on. The fc6 inner product when carried out as convolution by fc6-conv turns into a 6 × 6 filter with stride 1 on pool5. Back in image space this gives a classification for each 227 × 227 box with stride 32 in pixels. Remember the equation for output map / receptive field size, output = (input - kernel_size) / stride + 1, and work out the indexing details for a clear understanding.
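The indexing can be worked out with a few lines of pure Python. The sketch below applies the formula layer by layer; the kernel, stride, and padding values are assumed from the standard CaffeNet definition and are not listed in this post, and the formula is extended with a padding term (input + 2 * pad) for the padded convolution layers. Layers that keep the spatial size (ReLU, LRN) are left out.

```python
def output_size(input_size, kernel_size, stride=1, pad=0):
    """output = (input + 2 * pad - kernel_size) / stride + 1"""
    return (input_size + 2 * pad - kernel_size) // stride + 1


# (name, kernel_size, stride, pad), assumed from the standard CaffeNet definition
LAYERS = [
    ("conv1", 11, 4, 0),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool5", 3, 2, 0),
    ("fc6-conv", 6, 1, 0),
]


def trace(input_size):
    """Return the spatial size after each layer for a square input."""
    sizes = {}
    size = input_size
    for name, kernel_size, stride, pad in LAYERS:
        size = output_size(size, kernel_size, stride, pad)
        sizes[name] = size
    return sizes


for n in (227, 451):
    sizes = trace(n)
    print("{0} x {0} input: pool5 is {1} x {1}, fc6-conv is {2} x {2}".format(
        n, sizes["pool5"], sizes["fc6-conv"]))

# Same result in image space: 227 x 227 boxes moved with a stride of 32 pixels
print("boxes per side on a 451 input: {}".format(output_size(451, 227, 32)))
```

Under these assumptions a 227 × 227 input reduces to a 6 × 6 pool5 and a single fc6-conv output, while a 451 × 451 input gives a 13 × 13 pool5 and the 8 × 8 map. The stride of 32 is the product of the strides 4, 2, 2, and 2 along the way.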
# The data layer of the fully convolutional net is declared as dim: 1 dim: 3 dim: 451 dim: 451
!diff net_surgery/bvlc_caffenet_full_conv.prototxt ../models/bvlc_reference_caffenet/deploy.prototxt
The only differences needed in the architecture are to change the fully connected classifier inner product layers into convolutional layers with the right filter size (6 x 6, since the reference model classifiers take the 36 elements of pool5 as input) and stride 1 for dense classification. Note that the layers are renamed so that Caffe does not try to blindly load the old parameters when it maps layer names to the pretrained model.
# Load the original network and extract the fully connected layers' parameters.
# This is the original CaffeNet model, which still contains fully connected layers.
net = caffe.Net('../models/bvlc_reference_caffenet/deploy.prototxt',
                '../models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel',
                caffe.TEST)
params = ['fc6', 'fc7', 'fc8']
# fc_params = {name: (weights, biases)}
fc_params = {pr: (net.params[pr][0].data, net.params[pr][1].data) for pr in params}

for fc in params:
    print('{} weights are {} dimensional and biases are {} dimensional'.format(fc, fc_params[fc][0].shape, fc_params[fc][1].shape))
Consider the shapes of the inner product parameters. The weight dimensions are the output and input sizes while the bias dimension is the output size.
# Load the fully convolutional network to transplant the parameters.
# This is CaffeNet with every fully connected layer replaced by a convolution layer.
# Its data layer is dim: 1 dim: 3 dim: 451 dim: 451, and
# output = (input - kernel_size) / stride + 1.
net_full_conv = caffe.Net('net_surgery/bvlc_caffenet_full_conv.prototxt',
                          '../models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel',
                          caffe.TEST)
params_full_conv = ['fc6-conv', 'fc7-conv', 'fc8-conv']
# conv_params = {name: (weights, biases)}
conv_params = {pr: (net_full_conv.params[pr][0].data, net_full_conv.params[pr][1].data) for pr in params_full_conv}

for conv in params_full_conv:
    print('{} weights are {} dimensional and biases are {} dimensional'.format(conv, conv_params[conv][0].shape, conv_params[conv][1].shape))
The convolution weights are arranged in output × input × height × width dimensions. To map the inner product weights to convolution filters, we could roll the flat inner product vectors into channel × height × width filter matrices, but actually these are identical in memory (as row major arrays) so we can assign them directly.
The biases are identical to those of the inner product.
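The claim that the two layouts are identical in memory can be checked on small arrays. The NumPy sketch below uses toy sizes (not CaffeNet's) and random weights, copies inner product weights into a convolution-shaped array through flat, and confirms that both give the same result on one input patch. All names and sizes here are illustrative assumptions.

```python
import numpy as np

rng = np.random.RandomState(0)
num_output, channels, height, width = 4, 3, 2, 2  # toy sizes, not CaffeNet's

# Inner product weights: output x input, where input = channels * height * width
fc_weights = rng.randn(num_output, channels * height * width).astype(np.float32)
# Convolution weights: output x input channels x height x width
conv_weights = np.zeros((num_output, channels, height, width), dtype=np.float32)

# Copy element by element in row-major order
conv_weights.flat = fc_weights.flat

# The flat copy matches an explicit reshape of the inner product weights
same_as_reshape = np.array_equal(conv_weights, fc_weights.reshape(conv_weights.shape))

# One input patch: a C x H x W block for the convolution, a flat vector for the inner product
patch = rng.randn(channels, height, width).astype(np.float32)
fc_out = fc_weights.dot(patch.ravel())
conv_out = (conv_weights * patch).sum(axis=(1, 2, 3))

print("flat copy equals reshape: {}".format(same_as_reshape))
print("outputs agree: {}".format(np.allclose(fc_out, conv_out, atol=1e-4)))
```

Because the filter covers the whole patch, the convolution at a single location reduces to the same dot product the inner product layer computes.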
Let’s transplant!
for pr, pr_conv in zip(params, params_full_conv):
    # flat unrolls the arrays: the flattened assignment copies the fully connected
    # weights into the convolution weight blob without changing their order in memory
    conv_params[pr_conv][0].flat = fc_params[pr][0].flat
    conv_params[pr_conv][1][...] = fc_params[pr][1]
Next, save the new model weights.
net_full_conv.save('net_surgery/bvlc_caffenet_full_conv.caffemodel')
To conclude, let’s make a classification map from the example cat image and visualize the confidence of “tiger cat” as a probability heatmap. This gives an 8-by-8 prediction on overlapping regions of the 451 × 451 input.
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# load input and configure preprocessing
im = caffe.io.load_image('images/cat.jpg')
# The transformer converts im into the layout net_full_conv expects;
# its data layer is dim: 1 dim: 3 dim: 451 dim: 451.
transformer = caffe.io.Transformer({'data': net_full_conv.blobs['data'].data.shape})
transformer.set_mean('data', np.load('../python/caffe/imagenet/ilsvrc_2012_mean.npy').mean(1).mean(1))
transformer.set_transpose('data', (2,0,1))  # H x W x C to C x H x W
transformer.set_channel_swap('data', (2,1,0))  # RGB to BGR
transformer.set_raw_scale('data', 255.0)  # rescale from [0, 1] to [0, 255]
# make classification map by forward and print prediction indices at each location
out = net_full_conv.forward_all(data=np.asarray([transformer.preprocess('data', im)]))  # data is 1 x 3 x H x W
# The softmax output is batch x 1000 x 8 x 8. [0] selects image 0 of the batch, and
# argmax(axis=0) gives the index of the most probable class at each of the 8 x 8 locations.
print(out['prob'][0].argmax(axis=0))
# show net input and confidence map (probability of the top prediction at each location)
plt.subplot(1, 2, 1)
plt.imshow(transformer.deprocess('data', net_full_conv.blobs['data'].data[0]))  # image 0 of the data blob, mapped back to image space
plt.subplot(1, 2, 2)
plt.imshow(out['prob'][0,281])  # the 8 x 8 response map of class 281 for image 0
The classifications include various cats – 282 = tiger cat, 281 = tabby, 283 = persian – and foxes and other mammals.
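The two ways of reading the output, the printed index map and the plotted heatmap, can be tried on synthetic data. The NumPy sketch below builds a random stand-in for out['prob'][0] with the same 1000 × 8 × 8 layout; the values are fabricated purely to show the indexing and say nothing about what the real network predicts.

```python
import numpy as np

rng = np.random.RandomState(0)
num_classes, rows, cols = 1000, 8, 8

# Random stand-in for out['prob'][0]: scores pushed through a softmax over the class axis
scores = rng.randn(num_classes, rows, cols)
scores[281, 2:6, 2:6] += 20.0  # pretend class 281 dominates the centre of the image
exp = np.exp(scores - scores.max(axis=0, keepdims=True))
prob = exp / exp.sum(axis=0, keepdims=True)

class_map = prob.argmax(axis=0)  # 8 x 8 map: index of the most probable class per location
heatmap = prob[281]              # 8 x 8 map: probability of class 281 per location

print(class_map)
print("peak probability of class 281: {:.3f}".format(float(heatmap.max())))
```

Each entry of class_map is a class index between 0 and 999, while heatmap holds probabilities for one fixed class, which is what plt.imshow displays in the cell above.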
In this way the fully connected layers can be extracted as dense features across an image (see net_full_conv.blobs['fc6'].data for instance), which is perhaps more useful than the classification map itself.
Note that this model isn’t totally appropriate for sliding-window detection since it was trained for whole-image classification. Nevertheless it can work just fine. Sliding-window training and finetuning can be done by defining a sliding-window ground truth and loss such that a loss map is made for every location and solving as usual. (This is an exercise for the reader.)
A thank you to Rowland Depp for first suggesting this trick.