TensorFlow function: tf.nn.softmax_cross_entropy_with_logits explained
2017-04-13 17:22
First, here is the English API documentation from TensorFlow:
tf.nn.softmax_cross_entropy_with_logits(_sentinel=None, labels=None, logits=None, dim=-1, name=None)

Computes softmax cross entropy between logits and labels.

Measures the probability error in discrete classification tasks in which the classes are mutually exclusive (each entry is in exactly one class). For example, each CIFAR-10 image is labeled with one and only one label: an image can be a dog or a truck, but not both.

NOTE: While the classes are mutually exclusive, their probabilities need not be. All that is required is that each row of labels is a valid probability distribution. If they are not, the computation of the gradient will be incorrect.

If using exclusive labels (wherein one and only one class is true at a time), see sparse_softmax_cross_entropy_with_logits.

WARNING: This op expects unscaled logits, since it performs a softmax on logits internally for efficiency. Do not call this op with the output of softmax, as it will produce incorrect results.

logits and labels must have the same shape [batch_size, num_classes] and the same dtype (either float16, float32, or float64).

Note that to avoid confusion, it is required to pass only named arguments to this function.

Args:
- _sentinel: Used to prevent positional parameters. Internal, do not use.
- labels: Each row labels[i] must be a valid probability distribution.
- logits: Unscaled log probabilities.
- dim: The class dimension. Defaulted to -1 which is the last dimension.
- name: A name for the operation (optional).
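The docstring's pointer to sparse_softmax_cross_entropy_with_logits is easy to demonstrate. Below is a minimal sketch (TF 1.x, toy values of my own, not from the original post): the dense op takes one-hot label rows, the sparse op takes integer class indices, and the two produce the same per-example losses when the labels encode the same classes.

```python
import tensorflow as tf

# Two examples, three classes; logits are unscaled scores.
logits = tf.constant([[2.0, 0.5, -1.0],
                      [0.1, 1.2,  3.0]])

# Dense one-hot labels for softmax_cross_entropy_with_logits.
dense_labels = tf.constant([[1.0, 0.0, 0.0],
                            [0.0, 0.0, 1.0]])
dense_loss = tf.nn.softmax_cross_entropy_with_logits(
    labels=dense_labels, logits=logits)

# Integer class indices for sparse_softmax_cross_entropy_with_logits.
sparse_labels = tf.constant([0, 2])
sparse_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=sparse_labels, logits=logits)

with tf.Session() as sess:
    print(sess.run(dense_loss))   # per-example cross entropy
    print(sess.run(sparse_loss))  # same values: the labels agree
```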
This function requires at least two arguments: labels and logits.
labels: the target (expected) output of the network
logits: the raw output of the network's last layer
Warning: the function applies softmax internally and then computes the cross-entropy cost. In other words, logits must NOT already have been passed through tf.nn.softmax, otherwise the training results will be wrong. It is recommended to use this function rather than hand-coding your own cross-entropy cost; a small demonstration follows.
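To make the warning concrete, the sketch below (TF 1.x, toy values of my own) compares the correct call on raw logits, the manual formula -sum(labels * log(softmax(logits))), and the incorrect call on already-softmaxed values. The first two agree; the third silently computes a different loss, because the op applies softmax a second time.

```python
import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1]])  # raw, unscaled scores
labels = tf.constant([[1.0, 0.0, 0.0]])  # one-hot target

# Correct: pass raw logits; the op applies softmax internally.
loss_ok = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

# Manual equivalent: -sum(labels * log(softmax(logits))).
probs = tf.nn.softmax(logits)
loss_manual = -tf.reduce_sum(labels * tf.log(probs), axis=1)

# WRONG: probabilities fed in, so softmax gets applied twice.
loss_bad = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=probs)

with tf.Session() as sess:
    print(sess.run([loss_ok, loss_manual, loss_bad]))
    # loss_ok == loss_manual; loss_bad differs.
```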
Below is a softmax regression experiment that classifies MNIST with a two-layer CNN:
#coding=utf-8
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

def compute_accuracy(v_xs, v_ys):
    global prediction
    # keep_prob is the keep probability, i.e. the fraction of ReLU activations kept by dropout
    y_pre = sess.run(prediction, feed_dict={xs: v_xs, keep_prob: 1})
    correct_prediction = tf.equal(tf.argmax(y_pre, 1), tf.argmax(v_ys, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    result = sess.run(accuracy, feed_dict={xs: v_xs, ys: v_ys, keep_prob: 1})
    return result

def weight_variable(shape):
    inital = tf.truncated_normal(shape, stddev=0.1)  # stddev is the standard deviation
    return tf.Variable(inital)

def bias_variable(shape):
    inital = tf.constant(0.1, shape=shape)
    return tf.Variable(inital)

def conv2d(x, W):
    # x is the input tensor, W the convolution kernel
    # strides = [1, x_movement, y_movement, 1]; must have strides[0] = strides[3] = 1
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    # strides = [1, x_movement, y_movement, 1]; dims 2 and 3 of ksize are the pooling window
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

# define placeholders for the inputs to the network
# (note: input_data already scales pixels to [0,1]; the extra /255 shrinks them further)
xs = tf.placeholder(tf.float32, [None, 784]) / 255
ys = tf.placeholder(tf.float32, [None, 10])
keep_prob = tf.placeholder(tf.float32)
# -1 lets reshape infer that dimension; the trailing 1 is the input channel count (grayscale)
x_image = tf.reshape(xs, [-1, 28, 28, 1])
# print(x_image.shape)

## conv1 layer ##
W_conv1 = weight_variable([5, 5, 1, 32])  # 5x5 patch, in_channels 1 (image depth), 32 feature maps out
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)  # output size 28x28x32 because padding='SAME'
h_pool1 = max_pool_2x2(h_conv1)  # output size 14x14x32

## conv2 layer ##
W_conv2 = weight_variable([5, 5, 32, 64])  # 5x5 patch, in_channels 32 (depth of conv1), 64 feature maps out
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)  # output size 14x14x64 because padding='SAME'
h_pool2 = max_pool_2x2(h_conv2)  # output size 7x7x64

## func1 layer ##
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
# [n_samples, 7, 7, 64] ->> [n_samples, 7*7*64]
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)  # dropout to reduce overfitting

## func2 layer ##
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
# prediction = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)  # do NOT softmax here
prediction = tf.matmul(h_fc1_drop, W_fc2) + b_fc2  # raw logits

# the error between prediction and real data
# cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction), reduction_indices=[1]))
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=ys, logits=prediction))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

sess = tf.Session()
sess.run(tf.global_variables_initializer())

for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={xs: batch_xs, ys: batch_ys, keep_prob: 0.5})
    if i % 50 == 0:
        accuracy = 0
        for j in range(10):
            test_batch = mnist.test.next_batch(1000)
            acc_forone = compute_accuracy(test_batch[0], test_batch[1])
            # print('once=%f' % acc_forone)
            accuracy = acc_forone + accuracy
        print('test result: batch: %g, accuracy: %f' % (i, accuracy / 10))
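The commented-out cross_entropy line above is the hand-written cost that the fused op replaces. One reason to prefer the fused op is numerical stability: a separate tf.nn.softmax followed by tf.log blows up when a predicted probability underflows to 0, while the fused op works from the log-softmax directly. A minimal sketch of the failure mode (TF 1.x, toy values of my own):

```python
import tensorflow as tf

# Extreme logits drive one softmax output to exactly 0.0 in float32.
logits = tf.constant([[100.0, -100.0]])
labels = tf.constant([[0.0, 1.0]])

probs = tf.nn.softmax(logits)
manual = -tf.reduce_sum(labels * tf.log(probs), axis=1)  # log(0) -> -inf
fused = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

with tf.Session() as sess:
    print(sess.run(manual))  # [inf] -- unusable loss and gradients
    print(sess.run(fused))   # [200.] -- finite, computed via log-softmax
```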
The experimental results are:
test result: batch: 0, accuracy: 0.090000
test result: batch: 50, accuracy: 0.788600
test result: batch: 100, accuracy: 0.880200
test result: batch: 150, accuracy: 0.904600
test result: batch: 200, accuracy: 0.927500
test result: batch: 250, accuracy: 0.929800
test result: batch: 300, accuracy: 0.939600
test result: batch: 350, accuracy: 0.942100
test result: batch: 400, accuracy: 0.950600
test result: batch: 450, accuracy: 0.950700
test result: batch: 500, accuracy: 0.956700
test result: batch: 550, accuracy: 0.956000
test result: batch: 600, accuracy: 0.957100
test result: batch: 650, accuracy: 0.958400
test result: batch: 700, accuracy: 0.961500
test result: batch: 750, accuracy: 0.963800
test result: batch: 800, accuracy: 0.965000
test result: batch: 850, accuracy: 0.966300
test result: batch: 900, accuracy: 0.967800
test result: batch: 950, accuracy: 0.967700
Training ran for only 1000 iterations; with more iterations the accuracy would improve further.