TensorFlow in Practice 5: Image Classification with a Convolutional Neural Network (Beginner: MNIST Handwritten Digits), Code Implementation

2017-11-26 14:30

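I previously implemented handwritten digit recognition with a simple neural network. This time I use a convolutional neural network (CNN) for the same task.

An ordinary fully connected neural network (ANN) has the following main drawbacks when used for image recognition:

1. Too many parameters

2. It does not exploit the spatial relationship between pixels; in image recognition, each pixel is closely related to its neighbors

3. The number of layers is limited in practice

Using a convolutional neural network for image classification avoids these problems.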
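To make the "too many parameters" point concrete, here is a small sketch that counts parameters for one fully connected layer and one convolutional layer. The helper names `dense_params` and `conv_params` are made up for this illustration; the layer sizes are the ones used later in this post (784 inputs, 1024 hidden units, 32 filters of size 5x5).

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv_params(k_h, k_w, c_in, c_out):
    """Weights plus biases of a convolutional layer (shared across positions)."""
    return k_h * k_w * c_in * c_out + c_out


# A dense layer mapping a flattened 28x28 image to 1024 hidden units.
dense_count = dense_params(28 * 28, 1024)
# A convolutional layer with 32 filters of size 5x5 on a 1-channel image.
conv_count = conv_params(5, 5, 1, 32)

print(dense_count)  # 803840
print(conv_count)   # 832
```

The convolutional layer reuses the same small filters at every image position, which is why its count does not depend on the image size.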

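This article implements image classification with a CNN in the following steps:

1. Prepare the data

2. Convolution, activation, pooling (two layers)

3. Two fully connected layers (the first computes a weighted sum of the features and then applies an activation; the second only computes a weighted sum)

4. Compute the loss with softmax and cross-entropy

5. Reduce the loss with gradient descent and compute the accuracy

6. When running the session, train for 1000 iterations and print the result every 100 iterations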
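Steps 4 and 5 can be illustrated without TensorFlow. The following is a minimal pure-Python sketch of softmax, cross-entropy against a one-hot label, and accuracy via argmax. The function names and the toy numbers are assumptions for illustration only, not part of the original program.

```python
import math


def softmax(logits):
    """Turn one row of logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, one_hot):
    """Cross-entropy of predicted probabilities against a one-hot label."""
    return -sum(t * math.log(p) for p, t in zip(probs, one_hot))


def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])


# Toy batch of 3 samples and 3 classes; the numbers are made up.
batch_logits = [[2.0, 0.5, 0.1], [0.2, 0.3, 3.0], [1.5, 1.0, 0.2]]
batch_labels = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]

# Loss: mean cross-entropy over the batch.
losses = [cross_entropy(softmax(z), t) for z, t in zip(batch_logits, batch_labels)]
mean_loss = sum(losses) / len(losses)

# Accuracy: fraction of samples whose largest logit matches the label.
hits = [argmax(z) == argmax(t) for z, t in zip(batch_logits, batch_labels)]
accuracy = sum(hits) / len(hits)

print(mean_loss)
print(accuracy)  # 2 of the 3 predictions match their labels
```

Comparing argmax positions of the logits and the one-hot labels is the same idea the accuracy computation in the program relies on.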

The code is as follows:

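# Imports added for completeness; the code targets the TensorFlow 1.x API
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data


# Create a weight variable
def weight_variable(shape):
    w = tf.Variable(tf.random_normal(shape=shape, mean=0.0, stddev=1.0))
    return w


# Create a bias variable
def bias_variable(shape):
    b = tf.Variable(tf.constant(0.0, shape=shape))
    return b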

def model():
    """
    Build the model.
    :return: model output (logits), true-label placeholder, feature placeholder
    """
    # 1. Input placeholders for the data: x and y_true
    with tf.variable_scope("data"):
        # Features
        x = tf.placeholder(tf.float32, [None, 784])

        # Labels (one-hot)
        y_true = tf.placeholder(tf.int32, [None, 10])

    # 2. Convolutional layer 1
    with tf.variable_scope("conv_1"):
        # Parameters: weights and biases
        w_conv1 = weight_variable([5, 5, 1, 32])
        b_conv1 = bias_variable([32])

        # Reshape the input to the 4-D layout that the convolution requires
        x_reshape = tf.reshape(x, [-1, 28, 28, 1])

        # Convolution, activation, pooling
        x_relu1 = tf.nn.relu(tf.nn.conv2d(x_reshape, w_conv1, strides=[1, 1, 1, 1], padding="SAME") + b_conv1)
        x_pool1 = tf.nn.max_pool(x_relu1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")

    # 3. Convolutional layer 2
    with tf.variable_scope("conv_2"):
        # Parameters: the input channels equal the 32 filters of the previous layer; the output has 64
        w_conv2 = weight_variable([5, 5, 32, 64])
        b_conv2 = bias_variable([64])

        # Convolution, activation, pooling
        x_relu2 = tf.nn.relu(tf.nn.conv2d(x_pool1, w_conv2, strides=[1, 1, 1, 1], padding="SAME") + b_conv2)
        x_pool2 = tf.nn.max_pool(x_relu2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")

        # print(x_pool2.get_shape().as_list())  # shape after the second pooling layer: [None, 7, 7, 64]

    # 4. Fully connected layer 1
    with tf.variable_scope("FC1"):
        # Parameters: weights and biases
        w_fc1 = weight_variable([7 * 7 * 64, 1024])
        b_fc1 = bias_variable([1024])

        # Flatten the input: [None, 7, 7, 64] --> [None, 7*7*64]
        x_fc1 = tf.reshape(x_pool2, [-1, 7 * 7 * 64])

        # First fully connected computation, followed by ReLU
        x_fc1_relu = tf.nn.relu(tf.matmul(x_fc1, w_fc1) + b_fc1)

    # 5. Fully connected layer 2
    with tf.variable_scope("FC2"):
        # Parameters: weights and biases
        w_fc2 = weight_variable([1024, 10])
        b_fc2 = bias_variable([10])

        # Weighted sum only; softmax is applied inside the loss
        y_logit = tf.matmul(x_fc1_relu, w_fc2) + b_fc2

    return y_logit, y_true, x

def compute_loss(y_logit, y_true):
    """
    Compute the loss.
    :param y_logit: model output (logits)
    :param y_true: true labels
    :return: loss
    """
    with tf.variable_scope("compute_loss"):
        loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y_logit, labels=y_true))

    return loss

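def train(loss, y_true, y_logit):
    """
    Minimize the loss and compute the accuracy.
    :param loss: loss value
    :param y_true: true labels
    :param y_logit: model output (logits)
    :return: train_op, accuracy
    """
    with tf.variable_scope("train"):
        # Reduce the loss with gradient descent
        train_op = tf.train.GradientDescentOptimizer(0.0001).minimize(loss)

        # Compute the accuracy
        # 1-D tensor marking whether each sample was predicted correctly, e.g. [1,0,1,1,1,0,1]
        equal_list = tf.equal(tf.argmax(y_true, 1), tf.argmax(y_logit, 1))

        # Average the per-sample results
        accuracy = tf.reduce_mean(tf.cast(equal_list, tf.float32))

    return train_op, accuracy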

def main(argv):
    """
    Main function; controls the whole workflow.
    :param argv:
    :return: None
    """
    # Load the data
    mnist = input_data.read_data_sets("./data/input_data/", one_hot=True)

    # model() returns the output of the convolutional network
    y_logit, y_true, x = model()

    # Softmax regression and cross-entropy loss
    loss = compute_loss(y_logit, y_true)

    # Reduce the loss with gradient descent and compute the accuracy
    train_op, accuracy = train(loss, y_true, y_logit)

    init_op = tf.global_variables_initializer()

    with tf.Session() as sess:
        # Initialize the variables
        sess.run(init_op)

        # Training loop
        for i in range(1000):
            # MNIST data: mnist_x holds the features, mnist_y the labels
            mnist_x, mnist_y = mnist.train.next_batch(50)  # 50 samples per batch

            sess.run(train_op, feed_dict={x: mnist_x, y_true: mnist_y})

            if i % 100 == 0:
                print("Accuracy:", sess.run(accuracy, feed_dict={x: mnist_x, y_true: mnist_y}))

        # Accuracy on the test set
        print("Test accuracy:", sess.run(accuracy, feed_dict={x: mnist.test.images, y_true: mnist.test.labels}))

    return None


if __name__ == '__main__':
    tf.app.run()  # runs main()

The final output of the code above was shown in a figure (the image is not available in this copy).

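Note:

While the model is being built, the shape of the data changes from layer to layer. The changes are summarized below.

The input starts as [None,784] and becomes [None,28,28,1] after the reshape.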
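The reshape can be tried on its own with NumPy, whose `reshape` treats `-1` the same way `tf.reshape` does. This is only a sketch: the batch of zeros stands in for real MNIST pixels, and the batch size of 50 matches the one used in the training loop.

```python
import numpy as np

# A stand-in batch of 50 flattened images (zeros instead of real MNIST pixels).
batch = np.zeros((50, 784), dtype=np.float32)

# -1 lets the batch dimension be inferred, mirroring None in the placeholder.
images = batch.reshape(-1, 28, 28, 1)

print(images.shape)  # (50, 28, 28, 1)
```

The trailing 1 is the channel dimension: MNIST images are grayscale, so each pixel has a single value.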

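First convolutional block:

1. Convolution: 32 filters of size 5*5, stride 1. padding="SAME" corresponds here to a padding of 2 on each side, so the data becomes [None,28,28,32];

2. Activation: the shape does not change;

3. Pooling: ksize=[1,2,2,1], stride 2, so the data becomes [None,14,14,32]

Second convolutional block:

1. Convolution: 64 filters of size 5*5, stride 1. padding="SAME" again corresponds to a padding of 2 on each side, so the data becomes [None,14,14,64];

2. Activation: the shape does not change;

3. Pooling: ksize=[1,2,2,1], stride 2, so the data becomes [None,7,7,64]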
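The spatial sizes in the two blocks can be checked with a few lines of arithmetic. The sketch below assumes the usual output-size formula for an explicitly padded convolution, (size + 2*pad - kernel) // stride + 1, and TensorFlow's rule for padding="SAME", ceil(size / stride). The helper names `conv_out` and `same_out` are made up for this example.

```python
import math


def conv_out(size, kernel, stride, pad):
    """Output length along one spatial axis for an explicitly padded convolution."""
    return (size + 2 * pad - kernel) // stride + 1


def same_out(size, stride):
    """Output length along one spatial axis under padding="SAME"."""
    return math.ceil(size / stride)


size = 28
trace = [size]
for _ in range(2):  # two convolution + pooling blocks
    size = conv_out(size, kernel=5, stride=1, pad=2)  # 5x5 convolution, stride 1
    trace.append(size)
    size = same_out(size, stride=2)                   # 2x2 max pooling, stride 2
    trace.append(size)

print(trace)  # [28, 28, 14, 14, 7]
```

A 5x5 convolution with stride 1 and a padding of 2 leaves the size unchanged, so only the pooling layers shrink the image: 28 to 14, then 14 to 7.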

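First fully connected layer:

The input is first flattened to [None,7*7*64]. The weight w has shape [7*7*64,1024], so the output of the layer is [None,1024];

Second fully connected layer:

The weight has shape [1024,10], so the final output is [None,10]

In the shapes above, None stands for the number of samples in each batch (here, the number of images per batch). The 1024 dimensions used in the fully connected layers can be chosen freely, as long as the same value is used in both layers.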
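The shape bookkeeping of the two fully connected layers can be reproduced with NumPy. This is a sketch under simplifying assumptions: biases are left out, all values are constants, and the function name `dense_shapes` and its `hidden` parameter are made up for the example. Only the shapes matter here.

```python
import numpy as np


def dense_shapes(batch_size, hidden=1024):
    """Push dummy data through the two dense layers and return the shapes."""
    pooled = np.ones((batch_size, 7, 7, 64))   # output of the second pooling layer
    flat = pooled.reshape(-1, 7 * 7 * 64)      # [batch, 3136]
    w1 = np.ones((7 * 7 * 64, hidden))         # first layer weights
    w2 = np.ones((hidden, 10))                 # second layer weights
    h = np.maximum(flat @ w1, 0.0)             # weighted sum + ReLU: [batch, hidden]
    out = h @ w2                               # weighted sum only: [batch, 10]
    return flat.shape, h.shape, out.shape


print(dense_shapes(50))             # ((50, 3136), (50, 1024), (50, 10))
print(dense_shapes(8, hidden=256))  # ((8, 3136), (8, 256), (8, 10))
```

Changing `hidden` changes only the intermediate shape. The output stays at 10 columns because the second weight matrix uses the same `hidden` value for its row count.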
