Deep Learning: Implementing an Improved Handwritten Digit Recognition Algorithm
2018-01-26 16:17
Notes from Peng Liang's course "Advanced Deep Learning: Algorithms and Applications" (《深度学习进阶:算法与应用》).
Comparing different weight initialization methods
1. Comparison with 30 neurons in the hidden layer:
(1) The old method: N(0,1)
N(0,1) is the standard normal distribution, with mean 0 and variance 1.

import mnist_loader
training_data, validation_data, test_data = mnist_loader.load_data_wrapper()
import network2
net = network2.Network([784, 30, 10], cost=network2.CrossEntropyCost)
net.large_weight_initializer()
net.SGD(training_data, 30, 10, 0.1, lmbda=5.0,
        evaluation_data=validation_data,
        monitor_evaluation_accuracy=True)
(2) The new method: a Gaussian with mean 0 and standard deviation 1/sqrt(n_in)

import mnist_loader
training_data, validation_data, test_data = mnist_loader.load_data_wrapper()
import network2
net = network2.Network([784, 30, 10], cost=network2.CrossEntropyCost)
# net.large_weight_initializer()  # this call is left out, so the default initializer is used
net.SGD(training_data, 30, 10, 0.1, lmbda=5.0,
        evaluation_data=validation_data,
        monitor_evaluation_accuracy=True)
Conclusion:
Both methods reach an accuracy above 96%, but the new method raises the accuracy faster (87% vs 93%).
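2. Comparison with 100 neurons in the hidden layer:
Conclusion:
In this example the new initialization method only speeds up learning; the final performance is the same. In some neural networks, however, the new weight initialization also improves the final accuracy.
Implementing an improved neural network algorithm to recognize handwritten digits:
Recap of the original version: Network.py
We made improvements in the following areas: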
Cost function: cross-entropy
Regularization: L1, L2
Softmax layer
Initialization: 1/sqrt(n_in)
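Of the items listed, L2 regularization is the easiest to see in isolation. In network2.py's update_mini_batch, each weight is first rescaled by (1 - eta*lmbda/n) and then moved along the averaged gradient. The following is a minimal sketch of that rule on a single scalar weight, written for Python 3; the helper name `l2_update` and the numeric values are illustrative assumptions, not part of network2.py.

```python
def l2_update(w, grad_sum, eta, lmbda, n, batch_size):
    """One mini-batch step with L2 weight decay, on a single scalar weight.

    grad_sum is the sum of the per-example gradients over the mini-batch;
    n is the size of the whole training set.
    """
    decay = 1 - eta * (lmbda / n)
    return decay * w - (eta / batch_size) * grad_sum


# With a zero gradient only the decay factor acts, so the weight shrinks.
w = 2.0
for _ in range(3):
    w = l2_update(w, grad_sum=0.0, eta=0.5, lmbda=5.0, n=50000, batch_size=10)
    print(w)

# With lmbda = 0 the rule reduces to plain gradient descent.
print(l2_update(2.0, grad_sum=1.0, eta=0.5, lmbda=0.0, n=50000, batch_size=10))
```

The biases are not decayed in network2.py; only the weights carry the extra factor.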
The old Network.py
# coding=utf-8
# @Author: yangenneng
# @Time: 2018-01-23 16:39
# @Abstract: handwritten digit recognition with gradient descent
import numpy as np
import random


# A neural network class
class Network(object):
    # Constructor.
    # sizes: number of neurons per layer, e.g. net = Network([2, 3, 1]) has
    # 2 neurons in the first layer, 3 in the second and 1 in the third.
    def __init__(self, sizes):
        # Number of layers
        self.num_layers = len(sizes)
        # Number of neurons per layer
        self.sizes = sizes
        # np.random.randn(y, 1) draws from a normal distribution with mean 0
        # and variance 1. sizes[1:] skips the first entry: biases exist only
        # for the hidden and output layers, not for the input layer.
        self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
        # net.weights[1] holds the weights connecting the second and third
        # layers (Python indices start at 0). zip pairs the two sequences
        # element by element.
        self.weights = [np.random.randn(y, x)
                        for x, y in zip(sizes[:-1], sizes[1:])]

    # Feed the input forward through the network and return the output.
    def feedforward(self, a):
        for b, w in zip(self.biases, self.weights):
            # np.dot(w, a) is the product of w and a, e.g. for
            # w = (w1, w2, w3), a = (a1, a2, a3): w1*a1 + w2*a2 + w3*a3
            a = sigmoid(np.dot(w, a) + b)
        return a

    # Stochastic gradient descent.
    # training_data: training set, a list of tuples (x, y); x is the
    #     784x1 input and y is the true result
    # epochs: number of training epochs
    # mini_batch_size: number of examples per mini-batch
    # eta: learning rate
    # test_data: test set
    def SGD(self, training_data, epochs, mini_batch_size, eta, test_data=None):
        # If test_data was passed, record how many test examples there are
        if test_data:
            n_test = len(test_data)
        # Number of training tuples; each one corresponds to one image
        n = len(training_data)
        # Train for `epochs` rounds; j is the current epoch
        for j in xrange(epochs):
            # Randomly shuffle the training set
            random.shuffle(training_data)
            # Split the training set into mini-batches
            mini_batches = [
                # k runs from 0 to n in steps of mini_batch_size
                training_data[k:k + mini_batch_size]
                for k in xrange(0, n, mini_batch_size)]
            # Update on every mini-batch
            for mini_batch in mini_batches:
                # Update weights and biases; eta is the learning rate
                self.update_mini_batch(mini_batch, eta)
            # If a test set was passed, report the current accuracy
            if test_data:
                # Epoch j: number of correctly classified test examples
                print "Epoch {0}: {1} / {2}".format(j, self.evaluate(test_data), n_test)
            else:
                # Epoch j finished
                print "Epoch {0} complete".format(j)

    # Update the weights and biases from one mini-batch
    def update_mini_batch(self, mini_batch, eta):
        # Accumulators for the bias gradients
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        # Accumulators for the weight gradients
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        # x: 784x1, y: 10x1
        for x, y in mini_batch:
            # Backpropagation gives the partial derivatives of the cost
            # with respect to the biases and weights
            delta_nabla_b, delta_nabla_w = self.backprop(x, y)
            # Accumulate the gradients over the mini-batch
            nabla_b = [nb + dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
            nabla_w = [nw + dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
        # Update the weights and biases
        self.weights = [w - (eta / len(mini_batch)) * nw
                        for w, nw in zip(self.weights, nabla_w)]
        self.biases = [b - (eta / len(mini_batch)) * nb
                       for b, nb in zip(self.biases, nabla_b)]

    # Backpropagation: partial derivatives of the cost function with
    # respect to the biases and weights
    # x: 784x1
    # y: 10x1
    def backprop(self, x, y):
        # Initialize the two gradient lists
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]

        # Forward pass: the input x is the activation of the input layer
        activation = x
        activations = [x]  # list of activations, layer by layer
        zs = []  # list of the intermediate z vectors, layer by layer
        for b, w in zip(self.biases, self.weights):
            # z is the weighted sum plus the bias
            z = np.dot(w, activation) + b
            zs.append(z)
            # Compute the activation
            activation = sigmoid(z)
            activations.append(activation)

        # Backward pass
        # Output-layer error: activations[-1] is the activation of the
        # last layer, i.e. the output layer
        delta = self.cost_derivative(activations[-1], y) * \
            sigmoid_prime(zs[-1])
        # Gradients for the output layer
        nabla_b[-1] = delta
        nabla_w[-1] = np.dot(delta, activations[-2].transpose())

        # Walk back from the output layer; layers are indexed as -l
        for l in xrange(2, self.num_layers):
            # z of the l-th layer counted from the end
            z = zs[-l]
            # Derivative of the sigmoid
            sp = sigmoid_prime(z)
            # Propagate delta one layer back
            delta = np.dot(self.weights[-l + 1].transpose(), delta) * sp
            # Partial derivatives for this layer
            nabla_b[-l] = delta
            nabla_w[-l] = np.dot(delta, activations[-l - 1].transpose())
        return (nabla_b, nabla_w)

    # Accuracy after each training epoch
    def evaluate(self, test_data):
        test_results = [(np.argmax(self.feedforward(x)), y)
                        for (x, y) in test_data]
        # Count how many predictions equal the true label, i.e. how many
        # digit images were recognized correctly
        return sum(int(x == y) for (x, y) in test_results)

    # Derivative of the quadratic cost with respect to the output activations
    def cost_derivative(self, output_activations, y):
        return (output_activations - y)


# Sigmoid function, f(x) = 1/(1+e^(-x)): the neuron's nonlinear activation
def sigmoid(z):
    return 1.0/(1.0+np.exp(-z))


# First derivative of the sigmoid function
def sigmoid_prime(z):
    return sigmoid(z)*(1-sigmoid(z))


net = Network([2, 3, 1])
# print "net.num_layers:",net.num_layers
# print "\nnet.sizes:",net.sizes
# print "\nnet.biases:",net.biases
# print "\nnet.weights:",net.weights
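The identity behind sigmoid_prime, sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z)), can be checked numerically. The sketch below is written for Python 3 and uses scalars from the standard library instead of NumPy arrays; `numeric_derivative` is a helper introduced here for illustration only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_prime(z):
    return sigmoid(z) * (1 - sigmoid(z))


def numeric_derivative(f, z, h=1e-6):
    """Central finite-difference approximation of f'(z)."""
    return (f(z + h) - f(z - h)) / (2 * h)


# The analytic and the numeric derivative should agree closely.
for z in (-2.0, 0.0, 3.0):
    print(z, sigmoid_prime(z), numeric_derivative(sigmoid, z))
```

The derivative peaks at z = 0 with the value 0.25 and falls toward 0 as |z| grows, which is why saturated neurons learn slowly under the quadratic cost.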
The new Network2.py
"""
network2.py
~~~~~~~~~~~~~~

An improved version of network.py, implementing the stochastic
gradient descent learning algorithm for a feedforward neural network.
Improvements include the addition of the cross-entropy cost function,
regularization, and better initialization of network weights.  Note
that I have focused on making the code simple, easily readable, and
easily modifiable.  It is not optimized, and omits many desirable
features.

"""

#### Libraries
# Standard library
import json
import random
import sys

# Third-party libraries
import numpy as np


#### Define the quadratic and cross-entropy cost functions

class QuadraticCost(object):

    @staticmethod
    def fn(a, y):
        """Return the cost associated with an output ``a`` and desired output
        ``y``.

        """
        return 0.5*np.linalg.norm(a-y)**2

    @staticmethod
    def delta(z, a, y):
        """Return the error delta from the output layer."""
        return (a-y) * sigmoid_prime(z)


class CrossEntropyCost(object):

    @staticmethod
    def fn(a, y):
        """Return the cost associated with an output ``a`` and desired output
        ``y``.  Note that np.nan_to_num is used to ensure numerical
        stability.  In particular, if both ``a`` and ``y`` have a 1.0
        in the same slot, then the expression (1-y)*np.log(1-a)
        returns nan.  The np.nan_to_num ensures that that is converted
        to the correct value (0.0).

        """
        return np.sum(np.nan_to_num(-y*np.log(a)-(1-y)*np.log(1-a)))

    @staticmethod
    def delta(z, a, y):
        """Return the error delta from the output layer.  Note that the
        parameter ``z`` is not used by the method.  It is included in
        the method's parameters in order to make the interface
        consistent with the delta method for other cost classes.

        """
        return (a-y)


#### Main Network class
class Network(object):

    def __init__(self, sizes, cost=CrossEntropyCost):
        """The list ``sizes`` contains the number of neurons in the respective
        layers of the network.  For example, if the list was [2, 3, 1]
        then it would be a three-layer network, with the first layer
        containing 2 neurons, the second layer 3 neurons, and the
        third layer 1 neuron.  The biases and weights for the network
        are initialized randomly, using
        ``self.default_weight_initializer`` (see docstring for that
        method).

        """
        self.num_layers = len(sizes)
        self.sizes = sizes
        self.default_weight_initializer()
        self.cost=cost

    def default_weight_initializer(self):
        """Initialize each weight using a Gaussian distribution with mean 0
        and standard deviation 1 over the square root of the number of
        weights connecting to the same neuron.  Initialize the biases
        using a Gaussian distribution with mean 0 and standard
        deviation 1.

        Note that the first layer is assumed to be an input layer, and
        by convention we won't set any biases for those neurons, since
        biases are only ever used in computing the outputs from later
        layers.

        """
        self.biases = [np.random.randn(y, 1) for y in self.sizes[1:]]
        self.weights = [np.random.randn(y, x)/np.sqrt(x)
                        for x, y in zip(self.sizes[:-1], self.sizes[1:])]

    def large_weight_initializer(self):
        """Initialize the weights using a Gaussian distribution with mean 0
        and standard deviation 1.  Initialize the biases using a
        Gaussian distribution with mean 0 and standard deviation 1.

        Note that the first layer is assumed to be an input layer, and
        by convention we won't set any biases for those neurons, since
        biases are only ever used in computing the outputs from later
        layers.

        This weight and bias initializer uses the same approach as in
        Chapter 1, and is included for purposes of comparison.  It
        will usually be better to use the default weight initializer
        instead.

        """
        self.biases = [np.random.randn(y, 1) for y in self.sizes[1:]]
        self.weights = [np.random.randn(y, x)
                        for x, y in zip(self.sizes[:-1], self.sizes[1:])]

    def feedforward(self, a):
        """Return the output of the network if ``a`` is input."""
        for b, w in zip(self.biases, self.weights):
            a = sigmoid(np.dot(w, a)+b)
        return a

    def SGD(self, training_data, epochs, mini_batch_size, eta,
            lmbda = 0.0,
            evaluation_data=None,
            monitor_evaluation_cost=False,
            monitor_evaluation_accuracy=False,
            monitor_training_cost=False,
            monitor_training_accuracy=False):
        """Train the neural network using mini-batch stochastic gradient
        descent.  The ``training_data`` is a list of tuples ``(x, y)``
        representing the training inputs and the desired outputs.  The
        other non-optional parameters are self-explanatory, as is the
        regularization parameter ``lmbda``.  The method also accepts
        ``evaluation_data``, usually either the validation or test
        data.  We can monitor the cost and accuracy on either the
        evaluation data or the training data, by setting the
        appropriate flags.  The method returns a tuple containing four
        lists: the (per-epoch) costs on the evaluation data, the
        accuracies on the evaluation data, the costs on the training
        data, and the accuracies on the training data.  All values are
        evaluated at the end of each training epoch.  So, for example,
        if we train for 30 epochs, then the first element of the tuple
        will be a 30-element list containing the cost on the
        evaluation data at the end of each epoch. Note that the lists
        are empty if the corresponding flag is not set.

        """
        if evaluation_data: n_data = len(evaluation_data)
        n = len(training_data)
        evaluation_cost, evaluation_accuracy = [], []
        training_cost, training_accuracy = [], []
        for j in xrange(epochs):
            random.shuffle(training_data)
            mini_batches = [
                training_data[k:k+mini_batch_size]
                for k in xrange(0, n, mini_batch_size)]
            for mini_batch in mini_batches:
                self.update_mini_batch(
                    mini_batch, eta, lmbda, len(training_data))
            print "Epoch %s training complete" % j
            if monitor_training_cost:
                cost = self.total_cost(training_data, lmbda)
                training_cost.append(cost)
                print "Cost on training data: {}".format(cost)
            if monitor_training_accuracy:
                accuracy = self.accuracy(training_data, convert=True)
                training_accuracy.append(accuracy)
                print "Accuracy on training data: {} / {}".format(
                    accuracy, n)
            if monitor_evaluation_cost:
                cost = self.total_cost(evaluation_data, lmbda, convert=True)
                evaluation_cost.append(cost)
                print "Cost on evaluation data: {}".format(cost)
            if monitor_evaluation_accuracy:
                accuracy = self.accuracy(evaluation_data)
                evaluation_accuracy.append(accuracy)
                print "Accuracy on evaluation data: {} / {}".format(
                    self.accuracy(evaluation_data), n_data)
            print
        return evaluation_cost, evaluation_accuracy, \
            training_cost, training_accuracy

    def update_mini_batch(self, mini_batch, eta, lmbda, n):
        """Update the network's weights and biases by applying gradient
        descent using backpropagation to a single mini batch.  The
        ``mini_batch`` is a list of tuples ``(x, y)``, ``eta`` is the
        learning rate, ``lmbda`` is the regularization parameter, and
        ``n`` is the total size of the training data set.

        """
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        for x, y in mini_batch:
            delta_nabla_b, delta_nabla_w = self.backprop(x, y)
            nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
            nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
        self.weights = [(1-eta*(lmbda/n))*w-(eta/len(mini_batch))*nw
                        for w, nw in zip(self.weights, nabla_w)]
        self.biases = [b-(eta/len(mini_batch))*nb
                       for b, nb in zip(self.biases, nabla_b)]

    def backprop(self, x, y):
        """Return a tuple ``(nabla_b, nabla_w)`` representing the
        gradient for the cost function C_x.  ``nabla_b`` and
        ``nabla_w`` are layer-by-layer lists of numpy arrays, similar
        to ``self.biases`` and ``self.weights``."""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        # feedforward
        activation = x
        activations = [x] # list to store all the activations, layer by layer
        zs = [] # list to store all the z vectors, layer by layer
        for b, w in zip(self.biases, self.weights):
            z = np.dot(w, activation)+b
            zs.append(z)
            activation = sigmoid(z)
            activations.append(activation)
        # backward pass
        delta = (self.cost).delta(zs[-1], activations[-1], y)
        nabla_b[-1] = delta
        nabla_w[-1] = np.dot(delta, activations[-2].transpose())
        # Note that the variable l in the loop below is used a little
        # differently to the notation in Chapter 2 of the book.  Here,
        # l = 1 means the last layer of neurons, l = 2 is the
        # second-last layer, and so on.  It's a renumbering of the
        # scheme in the book, used here to take advantage of the fact
        # that Python can use negative indices in lists.
        for l in xrange(2, self.num_layers):
            z = zs[-l]
            sp = sigmoid_prime(z)
            delta = np.dot(self.weights[-l+1].transpose(), delta) * sp
            nabla_b[-l] = delta
            nabla_w[-l] = np.dot(delta, activations[-l-1].transpose())
        return (nabla_b, nabla_w)

    def accuracy(self, data, convert=False):
        """Return the number of inputs in ``data`` for which the neural
        network outputs the correct result. The neural network's
        output is assumed to be the index of whichever neuron in the
        final layer has the highest activation.

        The flag ``convert`` should be set to False if the data set is
        validation or test data (the usual case), and to True if the
        data set is the training data. The need for this flag arises
        due to differences in the way the results ``y`` are
        represented in the different data sets.  In particular, it
        flags whether we need to convert between the different
        representations.  It may seem strange to use different
        representations for the different data sets.  Why not use the
        same representation for all three data sets?  It's done for
        efficiency reasons -- the program usually evaluates the cost
        on the training data and the accuracy on other data sets.
        These are different types of computations, and using different
        representations speeds things up.  More details on the
        representations can be found in
        mnist_loader.load_data_wrapper.

        """
        if convert:
            results = [(np.argmax(self.feedforward(x)), np.argmax(y))
                       for (x, y) in data]
        else:
            results = [(np.argmax(self.feedforward(x)), y)
                       for (x, y) in data]
        return sum(int(x == y) for (x, y) in results)

    def total_cost(self, data, lmbda, convert=False):
        """Return the total cost for the data set ``data``.  The flag
        ``convert`` should be set to False if the data set is the
        training data (the usual case), and to True if the data set is
        the validation or test data.  See comments on the similar (but
        reversed) convention for the ``accuracy`` method, above.

        """
        cost = 0.0
        for x, y in data:
            a = self.feedforward(x)
            if convert: y = vectorized_result(y)
            cost += self.cost.fn(a, y)/len(data)
        cost += 0.5*(lmbda/len(data))*sum(
            np.linalg.norm(w)**2 for w in self.weights)
        return cost

    def save(self, filename):
        """Save the neural network to the file ``filename``."""
        data = {"sizes": self.sizes,
                "weights": [w.tolist() for w in self.weights],
                "biases": [b.tolist() for b in self.biases],
                "cost": str(self.cost.__name__)}
        f = open(filename, "w")
        json.dump(data, f)
        f.close()


#### Loading a Network
def load(filename):
    """Load a neural network from the file ``filename``.  Returns an
    instance of Network.

    """
    f = open(filename, "r")
    data = json.load(f)
    f.close()
    cost = getattr(sys.modules[__name__], data["cost"])
    net = Network(data["sizes"], cost=cost)
    net.weights = [np.array(w) for w in data["weights"]]
    net.biases = [np.array(b) for b in data["biases"]]
    return net


#### Miscellaneous functions
def vectorized_result(j):
    """Return a 10-dimensional unit vector with a 1.0 in the j'th position
    and zeroes elsewhere.  This is used to convert a digit (0...9)
    into a corresponding desired output from the neural network.

    """
    e = np.zeros((10, 1))
    e[j] = 1.0
    return e


def sigmoid(z):
    """The sigmoid function."""
    return 1.0/(1.0+np.exp(-z))


def sigmoid_prime(z):
    """Derivative of the sigmoid function."""
    return sigmoid(z)*(1-sigmoid(z))
Comparing the two initialization methods:

# Old method
def large_weight_initializer(self):
    self.biases = [np.random.randn(y, 1) for y in self.sizes[1:]]
    self.weights = [np.random.randn(y, x)
                    for x, y in zip(self.sizes[:-1], self.sizes[1:])]

# Default method, i.e. the new method
def default_weight_initializer(self):
    self.biases = [np.random.randn(y, 1) for y in self.sizes[1:]]
    self.weights = [np.random.randn(y, x)/np.sqrt(x)
                    for x, y in zip(self.sizes[:-1], self.sizes[1:])]
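The only difference between the two initializers is the division by sqrt(x), where x is the number of inputs to the neuron. One way to see why this matters is to estimate the spread of a neuron's weighted input z = sum_j w_j * x_j + b under each scheme. The sketch below is written for Python 3 with the standard library only and makes two assumptions that are not in the original code: a toy input vector in which half the entries are 1 and half are 0, and a helper named `weighted_input_std`.

```python
import math
import random


def weighted_input_std(n_in, scale, trials=500, seed=0):
    """Estimate the standard deviation of z = sum_j w_j * x_j + b.

    Weights are Gaussian with mean 0 and standard deviation `scale`;
    the bias is Gaussian with mean 0 and standard deviation 1.
    The input is an assumed toy vector: half ones, half zeros.
    """
    rng = random.Random(seed)
    active = n_in // 2  # only the inputs equal to 1 contribute to z
    zs = []
    for _ in range(trials):
        z = sum(rng.gauss(0.0, scale) for _ in range(active))
        z += rng.gauss(0.0, 1.0)
        zs.append(z)
    mean = sum(zs) / trials
    return math.sqrt(sum((z - mean) ** 2 for z in zs) / trials)


n_in = 784
old = weighted_input_std(n_in, 1.0)                    # like large_weight_initializer
new = weighted_input_std(n_in, 1.0 / math.sqrt(n_in))  # like default_weight_initializer
print("old std of z:", old)
print("new std of z:", new)
```

Under these assumptions the old scheme gives z a standard deviation of about sqrt(392 + 1), so |z| is often large and the sigmoid saturates, while the scaled scheme keeps it near sqrt(0.5 + 1). This is consistent with the faster early learning reported above, though it is only an illustration and not a reproduction of the training runs.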
About the cost function:
The old cost function (the quadratic cost)

class QuadraticCost(object):

    @staticmethod
    def fn(a, y):
        return 0.5*np.linalg.norm(a-y)**2

    @staticmethod
    def delta(z, a, y):
        return (a-y) * sigmoid_prime(z)

The new cost function

class CrossEntropyCost(object):

    @staticmethod
    def fn(a, y):
        return np.sum(np.nan_to_num(-y*np.log(a)-(1-y)*np.log(1-a)))

    @staticmethod
    def delta(z, a, y):
        return (a-y)
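The practical difference between the two delta methods shows up when the output neuron is saturated and wrong. The following Python 3 sketch mirrors both methods for a single scalar output neuron, using the standard library instead of NumPy; the function names and the chosen values z = 5.0, y = 0.0 are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_prime(z):
    return sigmoid(z) * (1 - sigmoid(z))


def quadratic_delta(z, a, y):
    # Scalar counterpart of QuadraticCost.delta
    return (a - y) * sigmoid_prime(z)


def cross_entropy_delta(z, a, y):
    # Scalar counterpart of CrossEntropyCost.delta; z is unused
    return a - y


# A saturated neuron that is badly wrong: output near 1, target 0.
z = 5.0
a = sigmoid(z)
y = 0.0
print("quadratic delta:    ", quadratic_delta(z, a, y))
print("cross-entropy delta:", cross_entropy_delta(z, a, y))
```

With the quadratic cost the error signal is damped by the small sigmoid_prime(z), so the neuron corrects itself slowly; with cross-entropy the error signal stays proportional to how wrong the output is.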
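Why implement the cost as a class rather than a function?
Computing the cost serves two purposes:
1. It measures how well the network's output matches the desired output.
2. When computing partial derivatives with backpropagation, we need the output-layer error, whose form depends on the cost function. The class bundles it as the delta method next to fn.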
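For a single sigmoid output neuron, the output-layer error can be worked out for both costs. This is a sketch in the notation of the code, with $a = \sigma(z)$ the output and $y$ the desired output; it is not a full vectorized derivation.

```latex
\begin{aligned}
\text{Quadratic: } C &= \tfrac{1}{2}(a-y)^2, &
\delta &= \frac{\partial C}{\partial a}\,\sigma'(z) = (a-y)\,\sigma'(z) \\
\text{Cross-entropy: } C &= -\bigl[y\ln a + (1-y)\ln(1-a)\bigr], &
\frac{\partial C}{\partial a} &= -\frac{y}{a} + \frac{1-y}{1-a} = \frac{a-y}{a(1-a)} \\
& & \delta &= \frac{a-y}{a(1-a)}\,\sigma'(z) = a-y
\quad\text{since } \sigma'(z) = a(1-a)
\end{aligned}
```

These two results match the return values of QuadraticCost.delta and CrossEntropyCost.delta, which is why the latter can ignore its z parameter.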
Running network2.py as a test

# coding=utf-8
# @Author: yangenneng
# @Time: 2018-01-26 14:10
# @Abstract:
import mnist_loader
training_data, validation_data, test_data = mnist_loader.load_data_wrapper()
import network2
net = network2.Network([784, 30, 10], cost=network2.CrossEntropyCost)
# net.large_weight_initializer()
net.SGD(training_data, 30, 10, 0.5,
        evaluation_data=validation_data,
        lmbda = 5.0,
        monitor_evaluation_cost=True,
        monitor_evaluation_accuracy=True,
        monitor_training_cost=True,
        monitor_training_accuracy=True)
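The best accuracy on the evaluation data is 96.32% (9632 / 10000, at epoch 25), higher than the earlier network.py with the same parameters.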
F:\python-2.7.13.amd64\Anaconda2-4.2.0-Windows-x86_64\python.exe D:/Python/PyCharm-WorkSpace/DeepLearning_Advanced/GradientDescent/regularization.py
Epoch 0 training complete
Cost on training data: 0.472008148311
Accuracy on training data: 47147 / 50000
Cost on evaluation data: 0.789319047524
Accuracy on evaluation data: 9441 / 10000
Epoch 1 training complete
Cost on training data: 0.453797259886
Accuracy on training data: 47422 / 50000
Cost on evaluation data: 0.868820923816
Accuracy on evaluation data: 9488 / 10000
Epoch 2 training complete
Cost on training data: 0.426263459796
Accuracy on training data: 47697 / 50000
Cost on evaluation data: 0.892428620085
Accuracy on evaluation data: 9518 / 10000
Epoch 3 training complete
Cost on training data: 0.397153218406
Accuracy on training data: 48015 / 50000
Cost on evaluation data: 0.902996376529
Accuracy on evaluation data: 9541 / 10000
Epoch 4 training complete
Cost on training data: 0.404934334043
Accuracy on training data: 48036 / 50000
Cost on evaluation data: 0.925197703694
Accuracy on evaluation data: 9556 / 10000
Epoch 5 training complete
Cost on training data: 0.40452328887
Accuracy on training data: 48058 / 50000
Cost on evaluation data: 0.931829691071
Accuracy on evaluation data: 9552 / 10000
Epoch 6 training complete
Cost on training data: 0.380791629252
Accuracy on training data: 48299 / 50000
Cost on evaluation data: 0.930382092998
Accuracy on evaluation data: 9584 / 10000
Epoch 7 training complete
Cost on training data: 0.42682006464
Accuracy on training data: 47853 / 50000
Cost on evaluation data: 0.973889340064
Accuracy on evaluation data: 9534 / 10000
Epoch 8 training complete
Cost on training data: 0.381931709493
Accuracy on training data: 48297 / 50000
Cost on evaluation data: 0.943710722991
Accuracy on evaluation data: 9604 / 10000
Epoch 9 training complete
Cost on training data: 0.394124099343
Accuracy on training data: 48116 / 50000
Cost on evaluation data: 0.960483778054
Accuracy on evaluation data: 9558 / 10000
Epoch 10 training complete
Cost on training data: 0.377963935828
Accuracy on training data: 48358 / 50000
Cost on evaluation data: 0.947445697355
Accuracy on evaluation data: 9570 / 10000
Epoch 11 training complete
Cost on training data: 0.395500020434
Accuracy on training data: 48210 / 50000
Cost on evaluation data: 0.966927012594
Accuracy on evaluation data: 9567 / 10000
Epoch 12 training complete
Cost on training data: 0.39226235435
Accuracy on training data: 48261 / 50000
Cost on evaluation data: 0.968052489284
Accuracy on evaluation data: 9567 / 10000
Epoch 13 training complete
Cost on training data: 0.368915410634
Accuracy on training data: 48463 / 50000
Cost on evaluation data: 0.950452371475
Accuracy on evaluation data: 9619 / 10000
Epoch 14 training complete
Cost on training data: 0.424900909029
Accuracy on training data: 47966 / 50000
Cost on evaluation data: 0.99555005664
Accuracy on evaluation data: 9521 / 10000
Epoch 15 training complete
Cost on training data: 0.363736970943
Accuracy on training data: 48457 / 50000
Cost on evaluation data: 0.937241548632
Accuracy on evaluation data: 9623 / 10000
Epoch 16 training complete
Cost on training data: 0.430628036017
Accuracy on training data: 47885 / 50000
Cost on evaluation data: 1.00785325697
Accuracy on evaluation data: 9503 / 10000
Epoch 17 training complete
Cost on training data: 0.371276835138
Accuracy on training data: 48419 / 50000
Cost on evaluation data: 0.947896956145
Accuracy on evaluation data: 9610 / 10000
Epoch 18 training complete
Cost on training data: 0.389298019413
Accuracy on training data: 48308 / 50000
Cost on evaluation data: 0.962151955008
Accuracy on evaluation data: 9592 / 10000
Epoch 19 training complete
Cost on training data: 0.369310821264
Accuracy on training data: 48415 / 50000
Cost on evaluation data: 0.948146748962
Accuracy on evaluation data: 9614 / 10000
Epoch 20 training complete
Cost on training data: 0.396054214492
Accuracy on training data: 48253 / 50000
Cost on evaluation data: 0.981207740108
Accuracy on evaluation data: 9588 / 10000
Epoch 21 training complete
Cost on training data: 0.370809138665
Accuracy on training data: 48523 / 50000
Cost on evaluation data: 0.955209722303
Accuracy on evaluation data: 9585 / 10000
Epoch 22 training complete
Cost on training data: 0.393853695598
Accuracy on training data: 48281 / 50000
Cost on evaluation data: 0.975714066174
Accuracy on evaluation data: 9575 / 10000
Epoch 23 training complete
Cost on training data: 0.36334991212
Accuracy on training data: 48452 / 50000
Cost on evaluation data: 0.945567032029
Accuracy on evaluation data: 9612 / 10000
Epoch 24 training complete
Cost on training data: 0.367186324806
Accuracy on training data: 48475 / 50000
Cost on evaluation data: 0.940873719939
Accuracy on evaluation data: 9618 / 10000
Epoch 25 training complete
Cost on training data: 0.362830867428
Accuracy on training data: 48494 / 50000
Cost on evaluation data: 0.936882103762
Accuracy on evaluation data: 9632 / 10000
Epoch 26 training complete
Cost on training data: 0.364483067411
Accuracy on training data: 48488 / 50000
Cost on evaluation data: 0.945528985147
Accuracy on evaluation data: 9607 / 10000
Epoch 27 training complete
Cost on training data: 0.385661385381
Accuracy on training data: 48277 / 50000
Cost on evaluation data: 0.966490614403
Accuracy on evaluation data: 9562 / 10000
Epoch 28 training complete
Cost on training data: 0.361879049238
Accuracy on training data: 48503 / 50000
Cost on evaluation data: 0.946364445193
Accuracy on evaluation data: 9611 / 10000
Epoch 29 training complete
Cost on training data: 0.359157709887
Accuracy on training data: 48529 / 50000
Cost on evaluation data: 0.943491269513
Accuracy on evaluation data: 9625 / 10000
Process finished with exit code 0
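The best epoch can be read off a log like this one programmatically. The sketch below is written for Python 3 and assumes the log is available as text with one message per line; the function name `best_evaluation_accuracy` is introduced here for illustration, and the sample is a three-epoch excerpt of the run above.

```python
import re


def best_evaluation_accuracy(log_text):
    """Return (epoch, correct, total) for the best evaluation accuracy."""
    best = None
    epoch = None
    for line in log_text.splitlines():
        m = re.match(r"Epoch (\d+) training complete", line)
        if m:
            epoch = int(m.group(1))
            continue
        m = re.match(r"Accuracy on evaluation data: (\d+) / (\d+)", line)
        if m:
            correct, total = int(m.group(1)), int(m.group(2))
            # Keep the first epoch that reaches the highest count
            if best is None or correct > best[1]:
                best = (epoch, correct, total)
    return best


sample = """Epoch 24 training complete
Accuracy on evaluation data: 9618 / 10000
Epoch 25 training complete
Accuracy on evaluation data: 9632 / 10000
Epoch 26 training complete
Accuracy on evaluation data: 9607 / 10000"""

epoch, correct, total = best_evaluation_accuracy(sample)
print("best epoch %d: %.2f%%" % (epoch, 100.0 * correct / total))
# prints "best epoch 25: 96.32%"
```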