Training and Testing a LeNet Model on MNIST with TensorFlow
2018-03-10 09:12
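1. Downloading and introducing the MNIST dataset
The MNIST dataset consists of 60,000 training examples and 10,000 test examples (mnist.test). With the default settings of input_data.read_data_sets from the TensorFlow tutorials, 5,000 of the training examples are held out as a validation set (mnist.validation), which leaves 55,000 examples in mnist.train. Each MNIST example has two parts: an image of a handwritten digit and the corresponding label. The training images are mnist.train.images and the training labels are mnist.train.labels. Each image contains 28x28 pixels; flattening this array gives a vector of length 28x28 = 784. mnist.train.images is therefore a tensor of shape [55000, 784], where the first dimension indexes the images and the second indexes the pixels of each image. Every element is the intensity of one pixel of one image, with a value between 0 and 1. The labels are digits from 0 to 9 that state which digit the image shows. The labels used here are one-hot vectors: a one-hot vector is 0 in every dimension except one, which is 1. The digit n is represented by a 10-dimensional vector whose only 1 is in dimension n (counting from 0). For example, label 0 is represented as [1,0,0,0,0,0,0,0,0,0]. mnist.train.labels is therefore a [55000, 10] matrix.
2. Implementation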
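Before setting up the environment, the data layout described in section 1 can be checked with a small NumPy sketch. It uses a made-up batch of three blank 28x28 images rather than the real dataset, and the helper name `one_hot` is chosen here for illustration; it is not part of input_data.

```python
import numpy as np


def one_hot(labels, num_classes=10):
    """Return a [len(labels), num_classes] matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


# Three fake 28x28 grayscale images with intensities in [0, 1].
images = np.zeros((3, 28, 28), dtype=np.float32)

# Flatten each image into a 784-element row, the layout of mnist.train.images.
flat = images.reshape(len(images), 28 * 28)

# Encode the digits 0, 3 and 9 as one-hot rows, the layout of mnist.train.labels.
labels = one_hot([0, 3, 9])

print(flat.shape)    # (3, 784)
print(labels.shape)  # (3, 10)
print(labels[0])     # the row for digit 0 has its single 1 at index 0
```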
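2.1 TensorFlow environment
If the cluster does not have the tensorflow module installed in advance, it can be supplied through the cacheArchive parameter as follows:
- Package the TensorFlow library. Its dependencies can either be installed in the environment beforehand or be bundled into the same archive. For example: tar -zcvf tensorflow.tgz ./*
- Upload the archive to HDFS, for example to /tmp/tensorflow.tgz
- In the XLearning submit script, add the cacheArchive parameter, for example: --cacheArchive /tmp/tensorflow.tgz#tensorflow
- In the script executed by launch-cmd, set the environment variable: export PYTHONPATH=./:$PYTHONPATH
Installing the TensorFlow dependencies:
yum install numpy python-devel python-wheel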
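The effect of the PYTHONPATH step can be illustrated without a cluster. The sketch below assumes that the shipped archive ends up as an importable package directory inside the container's working directory. It builds a throwaway package named `fake_tf_bundle` (a hypothetical name) in a temporary directory and shows that putting that directory on the import path makes the package importable, which is what `export PYTHONPATH=./:$PYTHONPATH` is meant to achieve for the bundled tensorflow directory.

```python
import importlib
import os
import sys
import tempfile

# Stand-in for the container working directory.
workdir = tempfile.mkdtemp()

# Stand-in for the unpacked archive: a package directory with an __init__.py.
pkg_dir = os.path.join(workdir, "fake_tf_bundle")
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("VERSION = 'bundled'\n")

# Equivalent of prepending ./ to PYTHONPATH before the interpreter starts.
sys.path.insert(0, workdir)

# The package is now found on the import path.
module = importlib.import_module("fake_tf_bundle")
print(module.VERSION)
```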
2.2 Training the model
Change to the example directory and set XLEARNING_HOME:
cd /var/lib/ambari-server/resources/stacks/CRH/5.1/services/XLEARNING/xlearning-1.2/examples/tfmnist
export XLEARNING_HOME=/var/lib/ambari-server/resources/stacks/CRH/5.1/services/XLEARNING/xlearning-1.2
Then run the script run.sh:
#!/bin/sh
$XLEARNING_HOME/bin/xl-submit \
   --app-type "tensorflow" \
   --app-name "tf-mnist" \
   --input /tmp/data/tfmnist/MNIST_data#data \
   --output /tmp/tfmnist_model#model \
   --files demo.py,input_data.py,demo.sh \
   --cacheArchive /tmp/tensorflow.tgz#tensorflow \
   --launch-cmd "sh demo.sh" \
   --worker-memory 2G \
   --worker-num 2 \
   --worker-cores 3 \
   --ps-memory 2G \
   --ps-num 1 \
   --ps-cores 2 \
   --queue default
The demo.sh script:
export PYTHONPATH=./:$PYTHONPATH
python demo.py --data_path=./data --save_path=./model --log_dir=./eventLog
The demo.py code:
import argparse
import sys
import os
import json
import numpy as np
import time

sys.path.append(os.getcwd())
import input_data
import tensorflow as tf

# Load MNIST with one-hot labels. The directory name is hard-coded here;
# it is not taken from the --data_path argument.
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

FLAGS = None


def main(_):
    # Cluster specification: role, task index and cluster definition are
    # read from environment variables.
    FLAGS.task_index = int(os.environ["TF_INDEX"])
    FLAGS.job_name = os.environ["TF_ROLE"]
    cluster_def = json.loads(os.environ["TF_CLUSTER_DEF"])
    cluster = tf.train.ClusterSpec(cluster_def)

    print("ClusterSpec:", cluster_def)
    print("current task id:", FLAGS.task_index, " role:", FLAGS.job_name)

    gpu_options = tf.GPUOptions(allow_growth=True)
    server = tf.train.Server(
        cluster, job_name=FLAGS.job_name, task_index=FLAGS.task_index,
        config=tf.ConfigProto(gpu_options=gpu_options, allow_soft_placement=True))

    if FLAGS.job_name == "ps":
        # Parameter servers only host the variables and block here.
        server.join()
    elif FLAGS.job_name == "worker":
        # Set the training parameters.
        with tf.device(tf.train.replica_device_setter(
                worker_device="/job:worker/task:%d" % FLAGS.task_index,
                cluster=cluster)):
            global_step = tf.get_variable(
                'global_step', [], initializer=tf.constant_initializer(0),
                trainable=False)
            x = tf.placeholder(tf.float32, shape=[None, 784])
            y_ = tf.placeholder(tf.float32, shape=[None, 10])

            # Softmax-regression baseline. Its loss and train op are replaced
            # by the CNN below, but W and b stay in the graph (the recognition
            # program declares them in the same order).
            W = tf.Variable(tf.zeros([784, 10]))
            b = tf.Variable(tf.zeros([10]))
            y = tf.matmul(x, W) + b
            cross_entropy = tf.reduce_mean(
                tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
            train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
            correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

            def weight_variable(shape):
                initial = tf.truncated_normal(shape, stddev=0.1)
                return tf.Variable(initial)

            def bias_variable(shape):
                initial = tf.constant(0.1, shape=shape)
                return tf.Variable(initial)

            def conv2d(x, W):
                return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

            def max_pool_2x2(x):
                return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                                      strides=[1, 2, 2, 1], padding='SAME')

            # First convolution + pooling layer: 28x28x1 -> 14x14x32.
            W_conv1 = weight_variable([5, 5, 1, 32])
            b_conv1 = bias_variable([32])
            x_image = tf.reshape(x, [-1, 28, 28, 1])
            h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
            h_pool1 = max_pool_2x2(h_conv1)

            # Second convolution + pooling layer: 14x14x32 -> 7x7x64.
            W_conv2 = weight_variable([5, 5, 32, 64])
            b_conv2 = bias_variable([64])
            h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
            h_pool2 = max_pool_2x2(h_conv2)

            # Fully connected layer with dropout.
            W_fc1 = weight_variable([7 * 7 * 64, 1024])
            b_fc1 = bias_variable([1024])
            h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
            h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
            keep_prob = tf.placeholder(tf.float32)
            h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

            # Output layer (logits for the 10 digit classes).
            W_fc2 = weight_variable([1024, 10])
            b_fc2 = bias_variable([10])
            y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2

            cross_entropy = tf.reduce_mean(
                tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
            train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
            correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

            init_op = tf.global_variables_initializer()
            saver = tf.train.Saver()  # defaults to saving all variables

        sv = tf.train.Supervisor(is_chief=(FLAGS.task_index == 0),
                                 global_step=global_step, init_op=init_op)
        with sv.prepare_or_wait_for_session(
                server.target,
                config=tf.ConfigProto(gpu_options=gpu_options,
                                      allow_soft_placement=True,
                                      log_device_placement=True)) as sess:
            # Perform the training cycles.
            start_time = time.time()
            if FLAGS.task_index == 0:
                train_writer = tf.summary.FileWriter(FLAGS.log_dir, sess.graph)
            sess.run(init_op)
            for i in range(20000):
                batch = mnist.train.next_batch(50)
                elapsed_time = time.time() - start_time
                start_time = time.time()
                if i % 100 == 0:
                    train_accuracy = accuracy.eval(feed_dict={
                        x: batch[0], y_: batch[1], keep_prob: 1.0})
                    print("step %d, training accuracy %g, Time: %3.2fms"
                          % (i, train_accuracy, float(elapsed_time * 1000)))
                train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
                # Use a float divisor so the progress is not truncated to 0 on Python 2.
                sys.stderr.write("reporter progress:%0.4f\n" % (i / 20000.0))

            print("test accuracy %g" % accuracy.eval(feed_dict={
                x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
            print("Train Completed.")
            if FLAGS.task_index == 0:
                train_writer.close()
                print("saving model...")
                saver.save(sess, FLAGS.save_path + "/model.ckpt")
            print("done")


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.register("type", "bool", lambda v: v.lower() == "true")
    # Flags for defining the tf.train.ClusterSpec
    parser.add_argument(
        "--job_name", type=str, default="", help="One of 'ps', 'worker'")
    # Flags for defining the tf.train.Server
    parser.add_argument(
        "--task_index", type=int, default=0, help="Index of task within the job")
    # Flags for defining the data, model and log paths
    parser.add_argument(
        "--data_path", type=str, default="", help="The path for train file")
    parser.add_argument(
        "--save_path", type=str, default="", help="The save path for model")
    parser.add_argument(
        "--log_dir", type=str, default="", help="The log path for model")
    FLAGS, unparsed = parser.parse_known_args()
    tf.app.run(main=main)
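The fully connected layer in demo.py expects 7 * 7 * 64 inputs. That number follows from the layer settings: with padding='SAME' the spatial size after a layer is the input size divided by the stride, rounded up. The sketch below redoes this arithmetic in plain Python; the helper name `same_padding_output` is chosen here for illustration.

```python
import math


def same_padding_output(size, stride):
    """Spatial output size of a conv or pool layer with padding='SAME'."""
    return int(math.ceil(size / float(stride)))


size = 28                             # input images are 28x28
size = same_padding_output(size, 1)   # conv1, stride 1: stays 28
size = same_padding_output(size, 2)   # 2x2 max pool, stride 2: 14
size = same_padding_output(size, 1)   # conv2, stride 1: stays 14
size = same_padding_output(size, 2)   # 2x2 max pool, stride 2: 7

channels = 64                         # output channels of conv2
flat_features = size * size * channels

print(size, flat_features)            # 7 3136
```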
Note: the saver part of demo.py stores the trained weights and biases so that the evaluation program can load them again.
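demo.py takes its role from three environment variables: TF_ROLE, TF_INDEX and TF_CLUSTER_DEF. The following sketch shows the parsing step without TensorFlow. The JSON value and the host names are invented for illustration (one ps and two workers, matching --ps-num 1 and --worker-num 2 in run.sh); the exact string that XLearning sets is not shown in this article.

```python
import json
import os

# Hypothetical values, set here only so the sketch runs outside the cluster.
os.environ["TF_ROLE"] = "worker"
os.environ["TF_INDEX"] = "1"
os.environ["TF_CLUSTER_DEF"] = (
    '{"ps": ["host-a:2222"], "worker": ["host-b:2222", "host-c:2222"]}')

# Same parsing as at the top of main() in demo.py.
job_name = os.environ["TF_ROLE"]
task_index = int(os.environ["TF_INDEX"])
cluster_def = json.loads(os.environ["TF_CLUSTER_DEF"])

# Address of this task, and whether it is the chief (task 0 in demo.py).
my_address = cluster_def[job_name][task_index]
is_chief = job_name == "worker" and task_index == 0

print(job_name, task_index, my_address, is_chief)
```

In demo.py the parsed dictionary is passed to tf.train.ClusterSpec, and only the task with index 0 writes summaries and saves the model.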
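2.3 Preparing a test image and preprocessing it with OpenCV
With the network trained, the next step is to test it. Prepare an image, preprocess it with OpenCV and feed it to the evaluation program to see whether the digit is recognized correctly. The OpenCV preprocessing shrinks the image to 28*28 pixels, converts it to grayscale and binarizes it.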
(1) The stdafx.h file
Add the OpenCV headers:
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/opencv.hpp>
#include <opencv2/gpu/gpu.hpp>
#include <opencv2/core/core.hpp>
#include <opencv/cv.h>
#include <opencv/cxcore.h>
#include <opencv/highgui.h>
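(2) The TF_ImgPreProcess.cpp file
#include "stdafx.h"
#include <opencv2/core/core.hpp>
#include <opencv2/core/opengl_interop.hpp>
#include <opencv2/gpu/gpu.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/contrib/contrib.hpp>

using namespace std;
using namespace cv;

int _tmain(int argc, _TCHAR* argv[])
{
    // Load the source image as a 3-channel color image (OpenCV stores it as BGR).
    IplImage* img = cvLoadImage("E:\\png\\5.png", 1);
    if (img == NULL)
        return -1;  // the image could not be read

    IplImage* copyImg = cvCreateImage(cvGetSize(img), IPL_DEPTH_8U, 3);
    cvCopy(img, copyImg, NULL);

    // 28x28 single-channel result and a 28x28 color intermediate.
    IplImage* ResImg = cvCreateImage(cvSize(28, 28), IPL_DEPTH_8U, 1);
    IplImage* TmpImg = cvCreateImage(cvGetSize(ResImg), IPL_DEPTH_8U, 3);

    cvResize(copyImg, TmpImg, CV_INTER_LINEAR);    // shrink to 28x28
    cvCvtColor(TmpImg, ResImg, CV_BGR2GRAY);       // convert to grayscale
    // Inverse binary threshold: pixels above 100 become 0, all others 255.
    cvThreshold(ResImg, ResImg, 100, 255, CV_THRESH_BINARY_INV);

    cvSaveImage("E:\\png\\result\\1.png", ResImg);
    cvWaitKey(0);

    // Release the image buffers.
    cvReleaseImage(&img);
    cvReleaseImage(&copyImg);
    cvReleaseImage(&TmpImg);
    cvReleaseImage(&ResImg);
    return 0;
}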
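The same three steps (shrink to 28x28, convert to grayscale, inverse binary threshold at 100) can be sketched in Python with NumPy alone. This is not the OpenCV routine: it converts to grayscale first and then shrinks by averaging equal blocks instead of bilinear interpolation, it assumes the input height and width are multiples of 28, and it works on a synthetic image rather than a file. The function name `preprocess` is chosen here for illustration.

```python
import numpy as np


def preprocess(rgb, size=28, threshold=100):
    """Grayscale, shrink to size x size by block averaging, inverse-threshold."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Luminance weights commonly used for RGB to gray conversion.
    gray = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    h, w = gray.shape
    if h % size or w % size:
        raise ValueError("height and width must be multiples of %d" % size)
    # Average each (h/size) x (w/size) block into one output pixel.
    blocks = gray.reshape(size, h // size, size, w // size)
    small = blocks.mean(axis=(1, 3))
    # Inverse binary threshold: values above the threshold become 0, others 255.
    return np.where(small > threshold, 0, 255).astype(np.uint8)


# Synthetic 56x56 white image with a dark vertical stroke.
image = np.full((56, 56, 3), 255, dtype=np.uint8)
image[8:48, 26:30] = 0

result = preprocess(image)
print(result.shape, result.dtype)
print(int(result[14, 13]), int(result[0, 0]))  # stroke pixel, background pixel
```

After the inverse threshold the stroke is bright on a dark background, which is the polarity of the MNIST images themselves.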
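2.4 Feeding the image into the network for recognition
Install the OpenCV package in the environment:
yum install opencv-python -y
A forward-pass program is written here; the class selected from the output of the final softmax layer is the recognition result.
The program is as follows: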
```python
from PIL import Image, ImageFilter
import tensorflow as tf
import cv2


def imageprepare():
    """
    This function returns the pixel values.
    The input is a png file location.
    """
    file_name = '/data/sxl/MNIST_recognize/p_num2.png'  # path to your own image
    # In a terminal, 'mogrify -format png *.jpg' converts jpg to png.
    im = Image.open(file_name).convert('L')
    im.save("/data/sxl/MNIST_recognize/sample.png")
    tv = list(im.getdata())  # get pixel values
    # Normalize pixels to the range 0..1. 0 is pure white, 1 is pure black.
    tva = [(255 - x) * 1.0 / 255.0 for x in tv]
    return tva


# Pixel values of the image to classify.
result = imageprepare()

# Define the model (same as when creating the model file).
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))


def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)


def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)


def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')


def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')


W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
x_image = tf.reshape(x, [-1, 28, 28, 1])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

init_op = tf.global_variables_initializer()

# Load the model2.ckpt file and use the model to predict the integer.
# The integer is returned as a list.
# Based on the documentation at
# https://www.tensorflow.org/versions/master/how_tos/variables/index.html
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(init_op)
    # Restore the previously saved model parameters.
    saver.restore(sess, "/data/sxl/MNIST_recognize/form/model2.ckpt")
    # print("Model restored.")
    prediction = tf.argmax(y_conv, 1)
    # The original line is cut off after "prediction."; the eval call below is
    # the reconstructed completion (dropout disabled with keep_prob 1.0).
    predint = prediction.eval(feed_dict={x: [result], keep_prob: 1.0}, session=sess)
    print(h_conv2)
    print('recognize result:')
    print(predint[0])
```
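The last step of the recognition program, softmax followed by argmax, can be looked at in isolation with NumPy. The logits below are made-up numbers for a single image, not output of the trained network, and `softmax` is a small helper written here rather than the TensorFlow op.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


# Made-up logits for a batch of one image (10 classes, digits 0 to 9).
logits = np.array([[0.1, -1.2, 3.5, 0.0, 0.3, -0.7, 0.2, 1.1, -2.0, 0.4]])

probs = softmax(logits)                # each row sums to 1
predicted = np.argmax(probs, axis=1)   # index of the most probable digit

print(predicted[0])
# Softmax keeps the ordering of the logits, so argmax gives the same class.
print(np.argmax(logits, axis=1)[0] == predicted[0])
```

Because softmax preserves ordering, the predicted digit does not depend on whether argmax is taken before or after it; the softmax only matters when the class probabilities themselves are needed.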
The input image for the recognition program is:
![](/upload/images/20180309//f8c775df-a50b-4278-a2aa-ef51653938a1.png)
Its output is:
![](/upload/images/20180309//be8605a5-d009-4156-b1d7-d423d35797de.png)
Notes:
A TensorFlow model is saved as follows:
```python
saver = tf.train.Saver()
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)  # variables must be initialized before they can be saved
    saver.save(sess, "checkpoint/model.ckpt", global_step=1)
```
Running this saves the model as three files with the extensions .data, .meta and .index. Because global_step=1 is passed, the suffix -1 is appended to the checkpoint name:
model.ckpt-1.data-00000-of-00001
model.ckpt-1.index
model.ckpt-1.meta
The meta file stores the graph structure, including the GraphDef, the SaverDef and so on.
The index file is a string-string table whose keys are tensor names and whose values are serialized BundleEntryProto messages.
The data file stores the values of all variables of the model.
The model is loaded as follows (the path is the checkpoint prefix, without the .data, .index or .meta extension):
```python
with tf.Session() as sess:
    saver.restore(sess, "checkpoint/model.ckpt-1")
```
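The file names produced by a save follow a fixed pattern, which can be written down as a small helper. `checkpoint_files` is a hypothetical function defined here, not a TensorFlow API; it only mirrors the naming pattern described above, under the assumption that the saver's default checkpoint format is in use.

```python
def checkpoint_files(prefix, global_step=None, num_shards=1):
    """Return the checkpoint prefix and the file names expected for it."""
    if global_step is not None:
        # A global step is appended to the prefix as "-<step>".
        prefix = "%s-%d" % (prefix, global_step)
    names = [prefix + ".index", prefix + ".meta"]
    for shard in range(num_shards):
        names.append("%s.data-%05d-of-%05d" % (prefix, shard, num_shards))
    return prefix, names


prefix, names = checkpoint_files("checkpoint/model.ckpt", global_step=1)
print(prefix)
for name in names:
    print(name)
```

The returned prefix, not one of the individual file names, is the string to pass to saver.restore.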