
TensorFlow (r1.4) API: tf.nn.conv2d explained

2017-12-11 20:45

(I) Function overview

conv2d(input, filter, strides, padding, use_cudnn_on_gpu=True, data_format='NHWC', name=None)

1. Parameters:

input: a 4-D tensor of type half or float32 with shape [batch, in_height, in_width, in_channels], that is, [number of images in one training batch, image height, image width, number of image channels] [1].

filter: the convolution kernel of the CNN. It must have the same type as input and is a 4-D tensor with shape [filter_height, filter_width, in_channels, out_channels], where in_channels must equal the in_channels of input.

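strides: a 1-D tensor of length 4 giving the stride of the sliding window along each dimension of input. With the default NHWC layout, strides[0] and strides[3] must be 1, so the usual form is [1, stride_height, stride_width, 1].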

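padding: a string, either SAME or VALID, selecting the padding scheme. With SAME the kernel may extend past the image border, which requires padding the image; this is explained below.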

use_cudnn_on_gpu: a boolean, True by default; whether to use cuDNN for the computation when running on a GPU.

data_format: a string, either "NHWC" or "NCHW", defaulting to "NHWC"; specifies the layout of the input and output data.

name: a name for the op.

(II) Usage:

1. Computation

Given a 4-D input tensor and a 4-D filter tensor, conv2d computes a 2-D convolution. The input tensor has shape [batch, in_height, in_width, in_channels] and the kernel has shape [filter_height, filter_width, in_channels, out_channels].

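The computation proceeds as follows:

Flatten the filter into a 2-D matrix of shape [filter_height * filter_width * in_channels, out_channels].

Extract image patches from the input tensor to form a virtual tensor of shape [batch, out_height, out_width, filter_height * filter_width * in_channels].

For each patch, multiply the patch vector by the 2-D filter matrix (virtual tensor * 2-D filter).

The return value is a 4-D tensor of shape [batch, out_height, out_width, out_channels].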
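The three steps above can be sketched in plain Python on nested lists. This is only an illustration of the flatten, extract-patches, multiply idea, not how TensorFlow implements the op: the function name conv2d_valid_im2col is hypothetical, it assumes NHWC layout, no padding, and a single stride value shared by height and width.

```python
def conv2d_valid_im2col(images, kernel, stride=1):
    """Unpadded 2-D convolution via patch extraction and a matrix product.

    images: nested list shaped [batch][in_h][in_w][in_c]
    kernel: nested list shaped [f_h][f_w][in_c][out_c]
    Returns a nested list shaped [batch][out_h][out_w][out_c].
    """
    f_h, f_w = len(kernel), len(kernel[0])
    in_c, out_c = len(kernel[0][0]), len(kernel[0][0][0])

    # Step 1: flatten the filter to [f_h * f_w * in_c, out_c].
    filter_2d = [kernel[i][j][c]
                 for i in range(f_h) for j in range(f_w) for c in range(in_c)]

    output = []
    for image in images:
        in_h, in_w = len(image), len(image[0])
        out_h = (in_h - f_h) // stride + 1
        out_w = (in_w - f_w) // stride + 1
        rows = []
        for y in range(out_h):
            row = []
            for x in range(out_w):
                # Step 2: flatten one window into a patch vector of length
                # f_h * f_w * in_c (one entry of the virtual tensor).
                patch = [image[y * stride + i][x * stride + j][c]
                         for i in range(f_h) for j in range(f_w)
                         for c in range(in_c)]
                # Step 3: patch vector times the 2-D filter matrix.
                row.append([sum(p * w[o] for p, w in zip(patch, filter_2d))
                            for o in range(out_c)])
            rows.append(row)
        output.append(rows)
    return output


# One 3x3 single-channel image and a 2x2 all-ones kernel with one output channel.
image = [[[1.0], [2.0], [3.0]],
         [[4.0], [5.0], [6.0]],
         [[7.0], [8.0], [9.0]]]
kernel = [[[[1.0]], [[1.0]]],
          [[[1.0]], [[1.0]]]]
result = conv2d_valid_im2col([image], kernel)
print(result)  # [[[[12.0], [16.0]], [[24.0], [28.0]]]]
```

Each output value is the sum of one 2x2 window, so the result has shape [1, 2, 2, 1]. Note that the patch and the flattened filter are enumerated in the same (height, width, channel) order, which is what makes the dot product line up.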

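2. Image convolution

(1) Overview

A convolution op sweeps a 2-D filter over a batch of images, applying the filter to each window of the appropriate size in each image [3]. The different convolution ops trade off between generic and specific filters: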

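- conv2d: arbitrary filters that can mix channels together.
- depthwise_conv2d: filters that operate on each channel independently.
- separable_conv2d: a depthwise spatial filter followed by a pointwise filter.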

(2) The padding parameter and the size of the output feature map

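Ignoring the channel dimensions for now, assume the 4-D input tensor has shape [batch, in_height, in_width, ...] and the 4-D filter has shape [filter_height, filter_width, ...]. The spatial behavior of the convolution depends on the padding scheme, that is, on whether padding is SAME or VALID. Padded values are always 0.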

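- padding=SAME

See [2] for the full details; only the mechanics of this padding scheme are summarized here.

When padding is SAME, the height and width of the output feature map are computed as: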

out_height = ceil(float(in_height) / float(strides[1]))
out_width = ceil(float(in_width) / float(strides[2]))

The total padding along the height and width of the input image is:

if (in_height % strides[1] == 0):
    pad_along_height = max(filter_height - strides[1], 0)
else:
    pad_along_height = max(filter_height - (in_height % strides[1]), 0)
if (in_width % strides[2] == 0):
    pad_along_width = max(filter_width - strides[2], 0)
else:
    pad_along_width = max(filter_width - (in_width % strides[2]), 0)

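Written as a formula:

$$
p_i = \begin{cases}
\max(k - s,\ 0), & \text{if } n_i \bmod s = 0 \\
\max(k - (n_i \bmod s),\ 0), & \text{if } n_i \bmod s \neq 0
\end{cases}
$$

p_i: total number of pixels to pad

k: size of the kernel

s: stride of the kernel

n_i: size of the input image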

Finally, the padding on the top, bottom, left, and right of the input image is:

pad_top = pad_along_height // 2
pad_bottom = pad_along_height - pad_top
pad_left = pad_along_width // 2
pad_right = pad_along_width - pad_left

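pad_along_height and pad_along_width may be odd. In that case the convention is to put the extra padding pixel on the bottom and the right of the image. For example, if pad_along_height=5, 2 pixels are padded on the top and 3 on the bottom.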
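To make the SAME rules concrete, here is a minimal sketch that applies them to a single spatial dimension. The helper name same_padding is hypothetical and not part of TensorFlow; it simply mirrors the expressions given above, with pad_before standing for pad_top or pad_left and pad_after for pad_bottom or pad_right.

```python
import math


def same_padding(in_size, filter_size, stride):
    """Return (out_size, pad_before, pad_after) for one spatial dimension."""
    out_size = math.ceil(in_size / stride)
    if in_size % stride == 0:
        pad_total = max(filter_size - stride, 0)
    else:
        pad_total = max(filter_size - (in_size % stride), 0)
    # When the total is odd, the extra pixel goes after (bottom / right).
    pad_before = pad_total // 2
    pad_after = pad_total - pad_before
    return out_size, pad_before, pad_after


print(same_padding(28, 5, 1))  # (28, 2, 2)
print(same_padding(5, 3, 2))   # (3, 1, 1)
print(same_padding(6, 6, 1))   # (6, 2, 3)
```

The last call reproduces the pad_along_height=5 case: 2 pixels before and 3 after. With stride 1 the output size equals the input size, which is where the name SAME comes from.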

- padding=VALID

The image is not padded, and the size of the output feature map is computed as:

out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
out_width  = ceil(float(in_width - filter_width + 1) / float(strides[2]))
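As a quick check of the VALID formula, the sketch below evaluates it for one spatial dimension. The helper name valid_output_size is hypothetical; it assumes the filter is no larger than the input.

```python
import math


def valid_output_size(in_size, filter_size, stride):
    """Output size along one spatial dimension when padding is VALID."""
    return math.ceil((in_size - filter_size + 1) / stride)


print(valid_output_size(28, 5, 1))  # 24
print(valid_output_size(7, 3, 2))   # 3
print(valid_output_size(5, 3, 2))   # 2
```

For these inputs the result matches the more familiar floor form (in_size - filter_size) // stride + 1, which counts how many full windows fit inside the unpadded image.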

References

[1] http://blog.csdn.net/farmwang/article/details/48102915

[2] https://www.tensorflow.org/api_guides/python/nn#Notes_on_SAME_Convolution_Padding

[3] https://www.tensorflow.org/api_guides/python/nn#Convolution
