Learning TensorFlow: Implementing Convolutional Neural Networks (Part 4)
2017-09-01 14:44
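This post implements VGGNet. Every convolution kernel in the network is 3x3 and every max-pooling layer is 2x2. VGGNet's experiments showed that small kernels perform about as well as large 5x5 or 7x7 kernels while making it much easier to increase network depth. Two stacked 3x3 convolutional layers are equivalent to one 5x5 layer in receptive field, i.e. each output pixel depends on a 5x5 neighborhood of the input. The stacked version nevertheless has fewer parameters (2 x 3 x 3 = 18 < 5 x 5 = 25 weights per input/output channel pair) and applies more nonlinear transformations, which strengthens the CNN's ability to learn features. The network structure is also very simple. For details, see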
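The parameter and receptive-field argument above can be checked with a few lines of plain Python. This is only a sketch: the helper names and the channel count of 64 are assumptions chosen for illustration, biases are ignored, and all layers are assumed to use stride 1.

```python
def conv_weights(kernel_size, channels):
    """Weights in one conv layer with `channels` input and output channels (no biases)."""
    return kernel_size * kernel_size * channels * channels


def stacked_receptive_field(kernel_size, num_layers):
    """Receptive field of `num_layers` stride-1 convolutions applied in sequence."""
    return 1 + num_layers * (kernel_size - 1)


C = 64  # hypothetical channel count, chosen only for illustration

two_3x3 = 2 * conv_weights(3, C)    # two stacked 3x3 layers
one_5x5 = conv_weights(5, C)        # one 5x5 layer with the same receptive field
three_3x3 = 3 * conv_weights(3, C)  # three stacked 3x3 layers
one_7x7 = conv_weights(7, C)        # one 7x7 layer with the same receptive field

print("receptive field of two 3x3 layers:", stacked_receptive_field(3, 2))
print("receptive field of three 3x3 layers:", stacked_receptive_field(3, 3))
print("weights, two 3x3 vs one 5x5:", two_3x3, one_5x5)
print("weights, three 3x3 vs one 7x7:", three_3x3, one_7x7)
```

The ratio between the stacked and the single-layer weight counts (18/25 and 27/49) does not depend on the channel count, so the saving holds for any layer width.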
Below we implement VGG-16: we build the VGGNet network structure and measure its forward (inference) and backward (training) time.
    from datetime import datetime
    import math
    import time

    import tensorflow as tf  # written against the TensorFlow 1.x API


    def conv_op(input_op, name, kh, kw, n_out, dh, dw, p):
        """Convolution + bias + ReLU. Appends the layer's parameters to p."""
        n_in = input_op.get_shape()[-1].value
        with tf.name_scope(name) as scope:
            kernel = tf.get_variable(scope + "w",
                                     shape=[kh, kw, n_in, n_out],
                                     dtype=tf.float32,
                                     initializer=tf.contrib.layers.xavier_initializer_conv2d())
            conv = tf.nn.conv2d(input_op, kernel, (1, dh, dw, 1), padding='SAME')
            bias_init_val = tf.constant(0.0, shape=[n_out], dtype=tf.float32)
            biases = tf.Variable(bias_init_val, trainable=True, name='b')
            z = tf.nn.bias_add(conv, biases)
            activation = tf.nn.relu(z, name=scope)
            p += [kernel, biases]
            return activation


    def fc_op(input_op, name, n_out, p):
        """Fully connected layer + ReLU. Appends the layer's parameters to p."""
        n_in = input_op.get_shape()[-1].value
        with tf.name_scope(name) as scope:
            kernel = tf.get_variable(scope + "w",
                                     shape=[n_in, n_out],
                                     dtype=tf.float32,
                                     initializer=tf.contrib.layers.xavier_initializer())
            biases = tf.Variable(tf.constant(0.1, shape=[n_out], dtype=tf.float32), name='b')
            activation = tf.nn.relu_layer(input_op, kernel, biases, name=scope)
            p += [kernel, biases]
            return activation


    def mpool_op(input_op, name, kh, kw, dh, dw):
        """Max pooling with a kh x kw window and stride (dh, dw)."""
        return tf.nn.max_pool(input_op,
                              ksize=[1, kh, kw, 1],
                              strides=[1, dh, dw, 1],
                              padding='SAME',
                              name=name)


    def inference_op(input_op, keep_prob):
        """Builds VGG-16 and returns (predictions, softmax, fc8, parameter list)."""
        p = []

        # Block 1: two 3x3 convolutions with 64 channels, then 2x2 max pooling
        conv1_1 = conv_op(input_op, name="conv1_1", kh=3, kw=3, n_out=64, dh=1, dw=1, p=p)
        conv1_2 = conv_op(conv1_1, name="conv1_2", kh=3, kw=3, n_out=64, dh=1, dw=1, p=p)
        pool1 = mpool_op(conv1_2, name="pool1", kh=2, kw=2, dw=2, dh=2)

        # Block 2: two 3x3 convolutions with 128 channels
        conv2_1 = conv_op(pool1, name="conv2_1", kh=3, kw=3, n_out=128, dh=1, dw=1, p=p)
        conv2_2 = conv_op(conv2_1, name="conv2_2", kh=3, kw=3, n_out=128, dh=1, dw=1, p=p)
        pool2 = mpool_op(conv2_2, name="pool2", kh=2, kw=2, dw=2, dh=2)

        # Block 3: three 3x3 convolutions with 256 channels
        conv3_1 = conv_op(pool2, name="conv3_1", kh=3, kw=3, n_out=256, dh=1, dw=1, p=p)
        conv3_2 = conv_op(conv3_1, name="conv3_2", kh=3, kw=3, n_out=256, dh=1, dw=1, p=p)
        conv3_3 = conv_op(conv3_2, name="conv3_3", kh=3, kw=3, n_out=256, dh=1, dw=1, p=p)
        pool3 = mpool_op(conv3_3, name="pool3", kh=2, kw=2, dw=2, dh=2)

        # Block 4: three 3x3 convolutions with 512 channels
        conv4_1 = conv_op(pool3, name="conv4_1", kh=3, kw=3, n_out=512, dh=1, dw=1, p=p)
        conv4_2 = conv_op(conv4_1, name="conv4_2", kh=3, kw=3, n_out=512, dh=1, dw=1, p=p)
        conv4_3 = conv_op(conv4_2, name="conv4_3", kh=3, kw=3, n_out=512, dh=1, dw=1, p=p)
        pool4 = mpool_op(conv4_3, name="pool4", kh=2, kw=2, dw=2, dh=2)

        # Block 5: three 3x3 convolutions with 512 channels
        conv5_1 = conv_op(pool4, name="conv5_1", kh=3, kw=3, n_out=512, dh=1, dw=1, p=p)
        conv5_2 = conv_op(conv5_1, name="conv5_2", kh=3, kw=3, n_out=512, dh=1, dw=1, p=p)
        conv5_3 = conv_op(conv5_2, name="conv5_3", kh=3, kw=3, n_out=512, dh=1, dw=1, p=p)
        pool5 = mpool_op(conv5_3, name="pool5", kh=2, kw=2, dw=2, dh=2)

        # Flatten the last feature map for the fully connected layers
        shp = pool5.get_shape()
        flattened_shape = shp[1].value * shp[2].value * shp[3].value
        resh1 = tf.reshape(pool5, [-1, flattened_shape], name="resh1")

        # Two 4096-unit fully connected layers with dropout, then the 1000-way output
        fc6 = fc_op(resh1, name="fc6", n_out=4096, p=p)
        fc6_drop = tf.nn.dropout(fc6, keep_prob, name="fc6_drop")
        fc7 = fc_op(fc6_drop, name="fc7", n_out=4096, p=p)
        fc7_drop = tf.nn.dropout(fc7, keep_prob, name="fc7_drop")
        fc8 = fc_op(fc7_drop, name="fc8", n_out=1000, p=p)
        softmax = tf.nn.softmax(fc8)
        predictions = tf.argmax(softmax, 1)
        return predictions, softmax, fc8, p


    def time_tensorflow_run(session, target, feed, info_string):
        """Runs target repeatedly and prints the mean and std of the time per batch."""
        num_steps_burn_in = 10  # warm-up steps excluded from the statistics
        total_duration = 0.0
        total_duration_squared = 0.0
        for i in range(num_batches + num_steps_burn_in):
            start_time = time.time()
            _ = session.run(target, feed_dict=feed)
            duration = time.time() - start_time
            if i >= num_steps_burn_in:
                if not i % 10:
                    print('%s:step %d,duration = %.3f'
                          % (datetime.now(), i - num_steps_burn_in, duration))
                total_duration += duration
                total_duration_squared += duration * duration
        mn = total_duration / num_batches
        vr = total_duration_squared / num_batches - mn * mn
        sd = math.sqrt(vr)
        print('%s:%s across %d steps,%.3f +/- %.3f sec/batch'
              % (datetime.now(), info_string, num_batches, mn, sd))


    def run_benchmark():
        with tf.Graph().as_default():
            image_size = 224
            # Random images stand in for real data; only the timing matters here
            images = tf.Variable(tf.random_normal([batch_size, image_size, image_size, 3],
                                                  dtype=tf.float32, stddev=1e-1))
            keep_prob = tf.placeholder(tf.float32)
            predictions, softmax, fc8, p = inference_op(images, keep_prob)

            init = tf.global_variables_initializer()
            sess = tf.Session()
            sess.run(init)

            # Forward pass only, with dropout disabled
            time_tensorflow_run(sess, predictions, {keep_prob: 1.0}, "Forward")

            # Forward and backward pass: gradients of an L2 loss w.r.t. all parameters
            objective = tf.nn.l2_loss(fc8)
            grad = tf.gradients(objective, p)
            time_tensorflow_run(sess, grad, {keep_prob: 0.5}, "Forward-backward")


    batch_size = 32
    num_batches = 100
    run_benchmark()
The output is as follows:
    /usr/local/Cellar/anaconda/bin/python /Users/new/Documents/JLIFE/Tensorflow/training/mnist_train.py
    2017-09-01 14:22:53.286299: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
    2017-09-01 14:22:53.286338: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
    2017-09-01 14:22:53.286345: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
    2017-09-01 14:22:53.286352: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
    2017-09-01 14:30:53.298980:step 0,duration = 29.278
    2017-09-01 14:36:11.613342:step 10,duration = 37.400

    Process finished with exit code 137 (interrupted by signal 9: SIGKILL)
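As a rough sanity check on the size of the network built by `inference_op`, the layer layout can be walked in plain Python to track the feature-map size and count trainable parameters. This is a sketch under stated assumptions: a 224x224x3 input, 3x3 convolutions that preserve the spatial size, and SAME-padded 2x2 pooling with stride 2 that halves each spatial dimension (rounding up). `LAYOUT`, `FC_UNITS` and `vgg16_summary` are names introduced here for illustration and are not part of the code above.

```python
import math

# Output channels of each 3x3 conv layer; "pool" marks a 2x2 max pool with stride 2
LAYOUT = [64, 64, "pool",
          128, 128, "pool",
          256, 256, 256, "pool",
          512, 512, 512, "pool",
          512, 512, 512, "pool"]
FC_UNITS = [4096, 4096, 1000]  # fc6, fc7, fc8


def vgg16_summary(image_size=224, in_channels=3):
    """Returns (final spatial size, final channels, total trainable parameters)."""
    size, channels, params = image_size, in_channels, 0
    for item in LAYOUT:
        if item == "pool":
            size = math.ceil(size / 2)  # SAME padding rounds up
        else:
            params += 3 * 3 * channels * item + item  # weights + biases
            channels = item
    features = size * size * channels  # flattened input to fc6
    for n_out in FC_UNITS:
        params += features * n_out + n_out
        features = n_out
    return size, channels, params


final_size, final_channels, total_params = vgg16_summary()
print("pool5 output: %dx%dx%d" % (final_size, final_size, final_channels))
print("trainable parameters: %d" % total_params)
```

Most of the parameters sit in `fc6`, which connects the flattened 7x7x512 feature map to 4096 units, while most of the computation is in the convolutional layers, which run on much larger feature maps.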
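The run above did not finish because it simply took too long; there is not much a Mac CPU can do here. In runs reported by others, the forward pass averages 0.152 s per batch, compared with 0.026 s for AlexNet at the same batch size (0.007 s without the LRN layers). The backward pass, which computes the gradients, averages 0.617 s per batch, again far higher than AlexNet's 0.078 s. I am speechless. Still, this shows that VGGNet-16 is considerably more expensive to compute than AlexNet, although the competition results show that it also brings a large improvement in accuracy. VGGNet has more parameters than AlexNet, yet it needs fewer iterations to converge, mainly because of the implicit regularization provided by the deeper network and the smaller convolution kernels.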
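The per-batch averages quoted above are the kind of summary that `time_tensorflow_run` prints: a mean and a population standard deviation computed from a running sum and a running sum of squares. A minimal sketch of that calculation without TensorFlow follows. The function name `summarize_durations` and the sample durations are made up for illustration, and the clamp to zero is an addition here to guard against a tiny negative variance from rounding.

```python
import math
import statistics


def summarize_durations(durations):
    """Mean and population std of durations, using running sums as in the benchmark."""
    total = 0.0
    total_squared = 0.0
    for d in durations:
        total += d
        total_squared += d * d
    n = len(durations)
    mean = total / n
    variance = total_squared / n - mean * mean  # E[x^2] - (E[x])^2
    return mean, math.sqrt(max(variance, 0.0))  # clamp guards against rounding error


# Hypothetical per-batch timings in seconds, not measured results
sample = [0.150, 0.155, 0.149, 0.153]
mean, sd = summarize_durations(sample)
print("%.3f +/- %.3f sec/batch" % (mean, sd))

# Cross-check against the standard library's population statistics
print(statistics.mean(sample), statistics.pstdev(sample))
```

Keeping only two running sums means the timings never have to be stored, which is why the benchmark can report the spread after any number of batches.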