
TensorFlow function tf.nn.softmax_cross_entropy_with_logits, explained

2017-04-13 17:22
First, here is the English API documentation from TensorFlow:

tf.nn.softmax_cross_entropy_with_logits(_sentinel=None, labels=None, logits=None, dim=-1, name=None)

Computes softmax cross entropy between logits and labels.

Measures the probability error in discrete classification tasks in which the classes are mutually exclusive (each entry is in exactly one class). For example, each CIFAR-10 image is labeled with one and only one label: an image can be a dog or a truck, but not both.

NOTE: While the classes are mutually exclusive, their probabilities need not be. All that is required is that each row of labels is a valid probability distribution. If they are not, the computation of the gradient will be incorrect.

If using exclusive labels (wherein one and only one class is true at a time), see sparse_softmax_cross_entropy_with_logits (a short sketch of this variant appears after the argument list below).

WARNING: This op expects unscaled logits, since it performs a softmax on logits internally for efficiency. Do not call this op with the output of softmax, as it will produce incorrect results.

logits and labels must have the same shape [batch_size, num_classes] and the same dtype (either float16, float32, or float64).

Note that to avoid confusion, it is required to pass only named arguments to this function.
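
To make the shapes and the named-argument call style concrete, here is a minimal sketch (TensorFlow 1.x; the two-example toy batch is made up for illustration):

import tensorflow as tf

# two examples, three classes; each row of labels is a valid probability distribution
labels = tf.constant([[1.0, 0.0, 0.0],
                      [0.0, 0.5, 0.5]])
logits = tf.constant([[2.0, 0.5, 0.5],   # unscaled scores, NOT softmax outputs
                      [0.3, 1.2, 1.1]])

xent = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

with tf.Session() as sess:
    print(sess.run(xent))  # one cross-entropy value per row, shape [2]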


Args:

_sentinel: Used to prevent positional parameters. Internal, do not use.
labels: Each row labels[i] must be a valid probability distribution.
logits: Unscaled log probabilities.
dim: The class dimension. Defaulted to -1 which is the last dimension.
name: A name for the operation (optional).
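
For the exclusive-label case mentioned above, the sparse variant takes integer class IDs instead of one-hot (or soft) label rows. A minimal sketch, again with made-up toy values:

import tensorflow as tf

logits = tf.constant([[2.0, 0.5, 0.3],
                      [0.1, 0.2, 3.0]])
labels = tf.constant([0, 2])  # integer class IDs, shape [batch_size]

xent = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits)

with tf.Session() as sess:
    print(sess.run(xent))  # one loss value per example, shape [2]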

This function takes at least two arguments: labels and logits.

labels: the expected (target) output of the network.

logits: the output of the network's last layer.

Warning: the function applies softmax internally and then computes the cross-entropy cost. That means logits must NOT have already been passed through tf.nn.softmax; doing so will corrupt the training results. It is recommended to use this function rather than hand-coding the cross-entropy cost yourself.
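
To see the warning in action, here is a small sanity-check sketch (TensorFlow 1.x, toy values made up for illustration): the built-in op matches a hand-written softmax-plus-cross-entropy computed on raw logits, while feeding it already-softmaxed values yields a different, wrong number.

import tensorflow as tf

labels = tf.constant([[0.0, 1.0, 0.0]])
logits = tf.constant([[1.0, 3.0, 0.2]])  # raw, unscaled scores

builtin = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)
manual = -tf.reduce_sum(labels * tf.log(tf.nn.softmax(logits)), axis=1)
wrong = tf.nn.softmax_cross_entropy_with_logits(labels=labels,
                                                logits=tf.nn.softmax(logits))  # double softmax: incorrect

with tf.Session() as sess:
    print(sess.run([builtin, manual, wrong]))  # the first two agree; the third does not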

Below is a softmax regression experiment on MNIST using a CNN with two convolutional layers:

#coding=utf-8
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

def compute_accuracy(v_xs, v_ys):
    global prediction
    # keep_prob is the keep probability, i.e. the fraction of ReLU outputs to retain
    y_pre = sess.run(prediction, feed_dict={xs: v_xs, keep_prob: 1})
    correct_prediction = tf.equal(tf.argmax(y_pre, 1), tf.argmax(v_ys, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    result = sess.run(accuracy, feed_dict={xs: v_xs, ys: v_ys, keep_prob: 1})
    return result

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)  # stddev is the standard deviation
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):  # x holds pixel values, W the filter weights
    # strides = [1, x_movement, y_movement, 1]
    # must have strides[0] = strides[3] = 1
    # padding='SAME' pads so the output keeps the input's spatial size
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    # strides = [1, x_movement, y_movement, 1]
    # the middle two entries of ksize are the pooling window size
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

# define placeholders for the inputs to the network
xs = tf.placeholder(tf.float32, [None, 784]) / 255
ys = tf.placeholder(tf.float32, [None, 10])
keep_prob = tf.placeholder(tf.float32)
# -1 lets that dimension be inferred, giving a 4-D tensor; the trailing 1 is the
# input channel count (1 because the images are grayscale)
x_image = tf.reshape(xs, [-1, 28, 28, 1])
#print x_image.shape

## conv1 layer ##
W_conv1 = weight_variable([5, 5, 1, 32])  # patch 5x5, in size 1 (image depth), out size 32 feature maps
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)  # output size 28x28x32 because padding='SAME'
h_pool1 = max_pool_2x2(h_conv1)  # output size 14x14x32

## conv2 layer ##
W_conv2 = weight_variable([5, 5, 32, 64])  # patch 5x5, in size 32 (depth of conv1), out size 64 feature maps
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)  # output size 14x14x64 because padding='SAME'
h_pool2 = max_pool_2x2(h_conv2)  # output size 7x7x64

## func1 layer ##
W_fc1 = weight_variable([7*7*64, 1024])
b_fc1 = bias_variable([1024])
# [n_samples, 7, 7, 64] ->> [n_samples, 7*7*64]
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)  # dropout to prevent overfitting

## func2 layer ##
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
#prediction = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
prediction = tf.matmul(h_fc1_drop, W_fc2) + b_fc2  # raw logits: deliberately no softmax here

# the error between prediction and real data

#cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys*tf.log(prediction), reduction_indices=[1]))
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=ys, logits=prediction))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
sess = tf.Session()
sess.run(tf.global_variables_initializer())

for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={xs: batch_xs, ys: batch_ys, keep_prob: 0.5})
    if i % 50 == 0:
        accuracy = 0
        for j in range(10):
            test_batch = mnist.test.next_batch(1000)
            acc_forone = compute_accuracy(test_batch[0], test_batch[1])
            #print 'once=%f' % (acc_forone)
            accuracy = acc_forone + accuracy
        print 'Test result: batch: %g, accuracy: %f' % (i, accuracy/10)
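
One detail worth noting in compute_accuracy above: it applies tf.argmax to the raw logits. Because softmax is monotonic, the argmax of the logits equals the argmax of the softmax probabilities, so the commented-out softmax layer is not needed for prediction. A quick check, with a made-up logits row:

import tensorflow as tf

z = tf.constant([[0.3, 2.1, -0.5]])  # hypothetical logits for one example
same = tf.equal(tf.argmax(z, 1), tf.argmax(tf.nn.softmax(z), 1))
with tf.Session() as sess:
    print(sess.run(same))  # [ True]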



The experimental results:

Test result: batch: 0, accuracy: 0.090000
Test result: batch: 50, accuracy: 0.788600
Test result: batch: 100, accuracy: 0.880200
Test result: batch: 150, accuracy: 0.904600
Test result: batch: 200, accuracy: 0.927500
Test result: batch: 250, accuracy: 0.929800
Test result: batch: 300, accuracy: 0.939600
Test result: batch: 350, accuracy: 0.942100
Test result: batch: 400, accuracy: 0.950600
Test result: batch: 450, accuracy: 0.950700
Test result: batch: 500, accuracy: 0.956700
Test result: batch: 550, accuracy: 0.956000
Test result: batch: 600, accuracy: 0.957100
Test result: batch: 650, accuracy: 0.958400
Test result: batch: 700, accuracy: 0.961500
Test result: batch: 750, accuracy: 0.963800
Test result: batch: 800, accuracy: 0.965000
Test result: batch: 850, accuracy: 0.966300
Test result: batch: 900, accuracy: 0.967800
Test result: batch: 950, accuracy: 0.967700

The number of training iterations here is modest; with more iterations the accuracy would climb further.