
Convolutional Neural Networks (CNN) (7) Softmax Regression Exercise

2016-07-31 16:27
{As part of getting started with CNNs, the author presents here, step by step, personal code implementations of the exercises from each UFLDL chapter, for readers' reference and correction.}

The theory is covered in the Softmax Regression chapter of the UFLDL tutorial, which can be read online (a Chinese-language option is available at the bottom of that page).
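
For reference, the cost and per-class gradient implemented in softmaxCost.m below are the standard ones from that chapter (restated here in LaTeX notation; m is the number of training examples and k = numClasses):

J(\theta) = -\frac{1}{m}\left[\sum_{i=1}^{m}\sum_{j=1}^{k} 1\{y^{(i)}=j\}\,\log\frac{e^{\theta_j^{\top}x^{(i)}}}{\sum_{l=1}^{k}e^{\theta_l^{\top}x^{(i)}}}\right] + \frac{\lambda}{2}\sum_{i,j}\theta_{ij}^{2}

\nabla_{\theta_j}J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\Big[x^{(i)}\big(1\{y^{(i)}=j\} - p(y^{(i)}=j\mid x^{(i)};\theta)\big)\Big] + \lambda\,\theta_j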

Notes:

1. The MNIST data files the author obtained are not named as softmaxExercise.m expects; readers running the experiment should note that they may also need to change the filenames.

2. Step 2: Implement softmaxCost contains parts that can be sped up through vectorization; please refer to the example code snippet.

3. The author did not find an intuitive way to fully vectorize the computation of the gradient of J(theta), so there is still one loop over the classes (a possible fully vectorized form is sketched after the softmaxCost.m listing below).



softmaxExercise.m :

%% CS294A/CS294W Softmax Exercise

% Instructions
% ------------
%
% This file contains code that helps you get started on the
% softmax exercise. You will need to write the softmax cost function
% in softmaxCost.m and the softmax prediction function in softmaxPred.m.
% For this exercise, you will not need to change any code in this file,
% or any other files other than those mentioned above.
% (However, you may be required to do so in later exercises)

%%======================================================================
%% STEP 0: Initialise constants and parameters
%
% Here we define and initialise some constants which allow your code
% to be used more generally on any arbitrary input.
% We also initialise some parameters used for tuning the model.

inputSize = 28 * 28; % Size of input vector (MNIST images are 28x28)
numClasses = 10; % Number of classes (MNIST images fall into 10 classes)

lambda = 1e-4; % Weight decay parameter

%%======================================================================
%% STEP 1: Load data
%
% In this section, we load the input and output data.
% For softmax regression on MNIST pixels,
% the input data is the images, and
% the output data is the labels.
%

% Change the filenames if you've saved the files under different names
% On some platforms, the files might be saved as
% train-images.idx3-ubyte / train-labels.idx1-ubyte

images = loadMNISTImages('mnist/train-images.idx3-ubyte');
labels = loadMNISTLabels('mnist/train-labels.idx1-ubyte');
labels(labels==0) = 10; % Remap 0 to 10

inputData = images;

% For debugging purposes, you may wish to reduce the size of the input data
% in order to speed up gradient checking.
% Here, we create synthetic dataset using random data for testing

DEBUG = false; % Set DEBUG to true when debugging.
if DEBUG
    inputSize = 8;
    inputData = randn(8, 100);
    labels = randi(10, 100, 1);
end

% Randomly initialise theta
theta = 0.005 * randn(numClasses * inputSize, 1);

%%======================================================================
%% STEP 2: Implement softmaxCost
%
% Implement softmaxCost in softmaxCost.m.

[cost, grad] = softmaxCost(theta, numClasses, inputSize, lambda, inputData, labels);

%%======================================================================
%% STEP 3: Gradient checking
%
% As with any learning algorithm, you should always check that your
% gradients are correct before learning the parameters.
%

if DEBUG
    numGrad = computeNumericalGradient( @(x) softmaxCost(x, numClasses, ...
                                        inputSize, lambda, inputData, labels), theta);

    % Use this to visually compare the gradients side by side
    disp([numGrad grad]);

    % Compare numerically computed gradients with those computed analytically
    diff = norm(numGrad-grad)/norm(numGrad+grad);
    disp(diff);
    % The difference should be small.
    % In our implementation, these values are usually less than 1e-7.

    % When your gradients are correct, congratulations!

    % (The author's run gave diff = 6.3522e-10.)
end

%%======================================================================
%% STEP 4: Learning parameters
%
% Once you have verified that your gradients are correct,
% you can start training your softmax regression code using softmaxTrain
% (which uses minFunc).

options.maxIter = 100;
softmaxModel = softmaxTrain(inputSize, numClasses, lambda, ...
                            inputData, labels, options);

% Although we only use 100 iterations here to train a classifier for the
% MNIST data set, in practice, training for more iterations is usually
% beneficial.

%%======================================================================
%% STEP 5: Testing
%
% You should now test your model against the test images.
% To do this, you will first need to write softmaxPredict
% (in softmaxPredict.m), which should return predictions
% given a softmax model and the input data.

images = loadMNISTImages('mnist/t10k-images.idx3-ubyte');
labels = loadMNISTLabels('mnist/t10k-labels.idx1-ubyte');
labels(labels==0) = 10; % Remap 0 to 10

inputData = images;

% You will have to implement softmaxPredict in softmaxPredict.m
[pred] = softmaxPredict(softmaxModel, inputData);

acc = mean(labels(:) == pred(:));
fprintf('Accuracy: %0.3f%%\n', acc * 100);

% Accuracy is the proportion of correctly classified images
% After 100 iterations, the results for our implementation were:
%
% Accuracy: 92.200%
% My Accuracy: 92.640%
%
% If your values are too low (accuracy less than 0.91), you should check
% your code for errors, and make sure you are training on the
% entire data set of 60000 28x28 training images
% (unless you modified the loading code, this should be the case)
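
STEP 3 above calls computeNumericalGradient.m, which comes from the earlier UFLDL gradient-checking exercise and is not reproduced in this post. As a reminder of the idea, a minimal central-difference sketch (my own; when actually running the exercise, use the file from the earlier exercise) looks like:

function numgrad = computeNumericalGradient(J, theta)
% Central-difference estimate of the gradient of J at theta.
% J     - function handle returning the cost as its first output
% theta - parameter vector around which to estimate the gradient
EPSILON = 1e-4;                 % perturbation size suggested by the tutorial
numgrad = zeros(size(theta));
for i = 1:numel(theta)
    e = zeros(size(theta));
    e(i) = EPSILON;             % perturb only the i-th parameter
    numgrad(i) = (J(theta + e) - J(theta - e)) / (2 * EPSILON);
end
end
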
softmaxCost.m : 

function [cost, grad] = softmaxCost(theta, numClasses, inputSize, lambda, data, labels)

% numClasses - the number of classes
% inputSize - the size N of the input vector
% lambda - weight decay parameter
% data - the N x M input matrix, where each column data(:, i) corresponds to
% a single test set
% labels - an M x 1 matrix containing the labels corresponding for the input data
%

% Unroll the parameters from theta
theta = reshape(theta, numClasses, inputSize);

numCases = size(data, 2);

groundTruth = full(sparse(labels, 1:numCases, 1));
% cost = 0;

thetagrad = zeros(numClasses, inputSize);

%% ---------- YOUR CODE HERE --------------------------------------
% Instructions: Compute the cost and gradient for softmax regression.
% You need to compute thetagrad and cost.
% The groundTruth matrix might come in handy.

M = theta * data; % M(r,c) is theta.T.r*x(:,c)
M = bsxfun(@minus, M, max(M, [], 1)); % Preventing overflows.
M = exp(M);
M = bsxfun(@rdivide, M, sum(M)); % Dividing all elements in each column by their column sum.
J_theta = sum(sum(log(M).*groundTruth));
J_theta = -J_theta / numCases;

WeightDecay = lambda * sum(sum(theta.^2)) / 2;

cost = J_theta + WeightDecay;

M = groundTruth - M;

for i = 1:numClasses
    thetagrad(i,:) = sum(bsxfun(@times, data, M(i,:)), 2); % weight each column of data by M(i,:), then sum over examples
end

thetagrad = -thetagrad/numCases + lambda * theta;

% ------------------------------------------------------------------
% Unroll the gradient matrices into a vector for minFunc
grad = [thetagrad(:)];
end

While writing this code, the author made three mistakes:

1. M = exp(M);
was omitted.

2. M = log(M);
an incorrect assignment: the log should appear only inside the cost expression, since M is reused afterwards for the gradient.

3. thetagrad(i,:) = sum(bsxfun(@times, data, M(i,:)), 2);
the @times had initially been left as @plus, copied directly from earlier code.

Gradient checking is therefore indispensable.
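
Regarding Note 3 at the top: the loop over classes can in fact be removed, because the whole gradient is a single matrix product. A minimal sketch of a fully vectorized replacement (my own, reusing the variable names from softmaxCost.m above and assuming M still holds the class probabilities, i.e. it replaces everything from "M = groundTruth - M;" through the loop):

% Fully vectorized gradient: (groundTruth - M) is numClasses x numCases and
% data' is numCases x inputSize, so the product is numClasses x inputSize.
thetagrad = -(groundTruth - M) * data' / numCases + lambda * theta;

This computes exactly the same thetagrad as the loop, just in one matrix multiplication.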

softmaxPredict.m : 
function [pred] = softmaxPredict(softmaxModel, data)

% softmaxModel - model trained using softmaxTrain
% data - the N x M input matrix, where each column data(:, i) corresponds to
% a single test set
%
% Your code should produce the prediction matrix
% pred, where pred(i) is argmax_c P(y(c) | x(i)).

% Unroll the parameters from theta
theta = softmaxModel.optTheta; % this provides a numClasses x inputSize matrix
pred = zeros(1, size(data, 2));

%% ---------- YOUR CODE HERE --------------------------------------
% Instructions: Compute pred using theta assuming that the labels start
% from 1.

M = theta * data; % numClasses x numCases matrix of class scores theta_c' * x(:,i)
% The exp and per-column normalization of softmax are monotonic and shared within
% each column, so the argmax of the raw scores already gives the predicted class.
[argmax_c_value_Vec, argmax_c_index_Vec] = max(M, [], 1);
pred = argmax_c_index_Vec;

% ---------------------------------------------------------------------

end
Experimental results:

Training time: 198.781 s / 60 = 3.313 min

Accuracy = 92.640 %
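
The training time above was measured around the softmaxTrain call of STEP 4; a simple way to reproduce such a measurement (my addition, not part of the original exercise files) is:

tic;
softmaxModel = softmaxTrain(inputSize, numClasses, lambda, inputData, labels, options);
trainingTime = toc;
fprintf('Training took %.3f s (%.3f min)\n', trainingTime, trainingTime / 60);
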
Tags: CNN UFLDL