UFLDL run_train.m
2016-08-08 21:55
The original function called minFunc to solve for the optimum; I later made a small modification so that it trains by hand-written stochastic gradient descent instead, which is simple and works well.
I also found that the learning rate can be scheduled as a/(b+t), where t is the iteration number and a and b are parameters set at initialization.
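A minimal sketch of that decaying schedule on a toy 1-D quadratic (the values of a and b below are illustrative placeholders, not tuned):

```matlab
% Decaying learning rate lr(t) = a / (b + t), shown on f(x) = x^2.
a = 1.0; b = 10;             % illustrative schedule parameters
x = 5;                       % initial parameter value
for t = 1:200
    grad = 2 * x;            % gradient of f(x) = x^2
    lr = a / (b + t);        % step size shrinks as iterations accumulate
    x = x - lr * grad;
end
fprintf('final x: %f\n', x); % x approaches the minimum at 0
```

Early iterations take large steps, while later iterations take progressively smaller ones, which helps SGD settle near a minimum instead of oscillating.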
Training with the original procedure gives:
10000 samples, minFunc: 92.55%
60000 samples, minFunc: 96.41% (terminated with "Step Size below progTol")
test accuracy: 0.964100
train accuracy: 1.000000
```matlab
% runs training procedure for supervised multilayer network
% softmax output layer with cross entropy loss function

%% setup environment
% experiment information
% a struct containing network layer sizes etc
clear; clc; close all;
ei = [];

% add common directory to your path for
% minfunc and mnist data helpers
addpath ../common;
addpath(genpath('../common/minFunc_2012/minFunc'));

%% load mnist data
[data_train, labels_train, data_test, labels_test] = load_preprocess_mnist();

%% mini-batch of whole dataset
num_select = 100;
data_train = data_train(:, 1:num_select);
labels_train = labels_train(1:num_select, :);

%% populate ei with the network architecture to train
% ei is a structure you can use to store hyperparameters of the network
% the architecture specified below should produce 100% training accuracy
% You should be able to try different network architectures by changing ei
% only (no changes to the objective function code)

% dimension of input features
ei.input_dim = 784;
% number of output classes
ei.output_dim = 10;
% sizes of all hidden layers and the output layer
ei.layer_sizes = [256, 128, 256, 128, ei.output_dim];
% scaling parameter for l2 weight regularization penalty
ei.lambda = 1e-4;
% which type of activation function to use in hidden layers
% feel free to implement support for only the logistic sigmoid function
ei.activation_fun = 'logistic';

%% setup random initial weights
stack = initialize_weights(ei);
params = stack2params(stack);

%% setup minfunc options
options = [];
options.display = 'iter';
options.maxFunEvals = 1e6;
options.Method = 'lbfgs';
options.MaxIter = 500;

%% run training
% [opt_params, opt_value, exitflag, output] = minFunc(@supervised_dnn_cost, ...
%     params, options, ei, data_train, labels_train);

lr = 0.1;
sum_iter = 10000;
cost_rec = [];
figure;
for iter = 1:sum_iter
    [cost, grad, pred_prob] = supervised_dnn_cost(params, ei, data_train, labels_train);
    % rescale the gradient so its range does not exceed the parameter range
    scale = (max(grad) - min(grad)) / (max(params) - min(params));
    if scale > 1
        grad = grad / scale;
    end
    params = params - lr * grad;
    cost_rec = [cost_rec, cost];
    plot(1:iter, cost_rec, 'r');
    drawnow();
    xlabel('iter');
    ylabel('loss');
    % report train/test accuracy every 10 iterations
    if mod(iter, 10) == 0
        [~, ~, pred] = supervised_dnn_cost(params, ei, data_test, [], true);
        [~, pred] = max(pred);
        acc_test = mean(pred' == labels_test);
        fprintf('test accuracy: %f\n', acc_test);
        [~, ~, pred] = supervised_dnn_cost(params, ei, data_train, [], true);
        [~, pred] = max(pred);
        acc_train = mean(pred' == labels_train);
        fprintf('train accuracy: %f\n', acc_train);
    end
end

%% compute accuracy on the test and train set
% note: evaluate params here; opt_params only exists when minFunc is used
[~, ~, pred] = supervised_dnn_cost(params, ei, data_test, [], true);
[~, pred] = max(pred);
acc_test = mean(pred' == labels_test);
fprintf('test accuracy: %f\n', acc_test);
[~, ~, pred] = supervised_dnn_cost(params, ei, data_train, [], true);
[~, pred] = max(pred);
acc_train = mean(pred' == labels_train);
fprintf('train accuracy: %f\n', acc_train);
```