Getting Started with TensorFlow: MNIST & CNN
2017-03-10 17:26
Following the official TensorFlow tutorial Deep MNIST for Experts, this post implements handwritten digit recognition with a CNN on the MNIST dataset.
# load MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("Mnist_data/", one_hot=True)

# start tensorflow interactiveSession
import tensorflow as tf
sess = tf.InteractiveSession()

# weight initialization
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

# convolution
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

# pooling
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

# Create the model
# placeholders
x = tf.placeholder("float", [None, 784])
y_ = tf.placeholder("float", [None, 10])

# variables (leftover from the plain softmax-regression tutorial; unused below)
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)

# first convolutional layer
w_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
x_image = tf.reshape(x, [-1, 28, 28, 1])
h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# second convolutional layer
w_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, w_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# densely connected layer
w_fc1 = weight_variable([7*7*64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1)

# dropout
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# readout layer
w_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, w_fc2) + b_fc2)

# train and evaluate the model
cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv))
train_step = tf.train.AdagradOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
sess.run(tf.initialize_all_variables())
for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        train_accuracy = accuracy.eval(feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
        print "step %d, train accuracy %g" % (i, train_accuracy)
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
print "test accuracy %g" % accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0})
Training took about 12 minutes; the results were as follows:
…
step 19300, train accuracy 0.94
step 19400, train accuracy 0.92
step 19500, train accuracy 0.86
step 19600, train accuracy 0.98
step 19700, train accuracy 0.94
step 19800, train accuracy 0.96
step 19900, train accuracy 0.94
Then it crashed with an error:
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[10000,28,28,32]
I took this to mean a memory leak when testing with all 10,000 images at once (the addendum below gives the actual explanation). Not knowing the cause at the time, I tried shrinking the test set to 2,000 images:
print "test accuracy %g" % accuracy.eval(feed_dict={x:mnist.test.images[:2000], y_:mnist.test.labels[:2000], keep_prob:1.0})
Output: test accuracy 0.9155.
This falls well short of the 99.2% accuracy reported in the official tutorial.
Following http://blog.csdn.net/yhl_leo/article/details/50624471, I changed the optimizer from Adagrad to Gradient Descent and ran the test again.
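In the script above this is a one-line change. A sketch of the swap (the 1e-4 learning rate simply mirrors the Adagrad setting; the referenced post does not state which value it uses, so treat it as an assumption):

# replace Adagrad with plain gradient descent
# (1e-4 is an assumed learning rate, copied from the Adagrad line above)
train_step = tf.train.GradientDescentOptimizer(1e-4).minimize(cross_entropy)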
step 19200, train accuracy 1
step 19300, train accuracy 1
step 19400, train accuracy 1
step 19500, train accuracy 1
step 19600, train accuracy 1
step 19700, train accuracy 1
step 19800, train accuracy 1
step 19900, train accuracy 1
test accuracy 0.9895
That is 98.95% accuracy on 2,000 test samples.
Addendum: the error
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[10000,28,28,32]
is in fact caused by running out of GPU memory at test time. The test set can be split into several batches, each evaluated separately, with the per-batch results combined into an overall accuracy, as follows:
accuracy_sum = tf.reduce_sum(tf.cast(correct_prediction, tf.float32))
good = 0
total = 0
for i in xrange(10):
    # note: 10 batches of 50 only covers 500 of the 10,000 test images
    testSet = mnist.test.next_batch(50)
    good += accuracy_sum.eval(feed_dict={x: testSet[0], y_: testSet[1], keep_prob: 1.0})
    total += testSet[0].shape[0]
print("test accuracy %g" % (good / total))
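As noted in the comment, the loop above only visits 10 × 50 = 500 of the 10,000 test images. A minimal sketch that chunks through the whole test set instead (assuming the same graph and session as above; the chunk size of 500 is an arbitrary value assumed small enough to fit in GPU memory):

num_test = mnist.test.images.shape[0]   # 10,000 images
chunk = 500                             # assumed to fit in GPU memory
good = 0.0
for start in xrange(0, num_test, chunk):
    end = min(start + chunk, num_test)
    good += accuracy_sum.eval(feed_dict={
        x: mnist.test.images[start:end],
        y_: mnist.test.labels[start:end],
        keep_prob: 1.0})
print("test accuracy %g" % (good / num_test))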
References:
1. http://stackoverflow.com/questions/39076388/tensorflow-deep-mnist-resource-exhausted-oom-when-allocating-tensor-with-shape
2. https://github.com/tensorflow/tensorflow/pull/157
Addendum 2:
The MNIST training set contains 55,000 images in total, yet training runs for 20,000 iterations (the code's training steps, loosely called "epochs" above) with a batch size of 50, so 20,000 × 50 = 1,000,000 images pass through the network, meaning each image is seen about 18 times on average. Does this training schedule improve accuracy, and could it cause overfitting?
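As a sanity check on that arithmetic, a minimal sketch (55,000 is the size of the mnist.train split that input_data.read_data_sets produces):

train_size = 55000               # images in mnist.train (TF's MNIST split)
steps, batch_size = 20000, 50    # the training-loop settings used above
images_seen = steps * batch_size           # = 1,000,000 images fed to the network
epochs = images_seen / float(train_size)   # ~18.2 full passes over the data
print "roughly %.1f passes over the training set" % epochs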
Changing the iteration count from 20,000 to 2,000: accuracy 97.8% on 5,000 test images.
Changing the iteration count from 20,000 to 10,000: accuracy 98.76% on 5,000 test images.
Changing the iteration count from 20,000 to 25,000: accuracy 98.78% on 5,000 test images.
All three experiments produced lower test accuracy than the run with 20,000 iterations.