Linear Decoders with Autoencoders: Code Walkthrough
2014-06-11 14:46
Programming exercise: linearDecoderExercise.m
%% CS294A/CS294W Linear Decoder Exercise
% Instructions
% ------------
%
% This file contains code that helps you get started on the
% linear decoder exercise. For this exercise, you will only need to modify
% the code in sparseAutoencoderLinearCost.m. You will not need to modify
% any code in this file.
%%======================================================================
%% STEP 0: Initialization
% Here we initialize some parameters used for the exercise.
imageChannels = 3;     % number of color channels (RGB)
patchDim = 8;          % patch dimension
numPatches = 100000;   % number of patches
visibleSize = patchDim * patchDim * imageChannels; % number of input units
outputSize = visibleSize; % number of output units
hiddenSize = 400;      % number of hidden units
sparsityParam = 0.035; % desired average activation of the hidden units
lambda = 3e-3;         % weight decay parameter
beta = 5;              % weight of the sparsity penalty term
epsilon = 0.1;         % epsilon for ZCA whitening
%%======================================================================
%% STEP 1: Create and modify sparseAutoencoderLinearCost.m to use a linear decoder,
% and check gradients
% You should copy sparseAutoencoderCost.m from your earlier exercise
% and rename it to sparseAutoencoderLinearCost.m.
% Then you need to rename the function from sparseAutoencoderCost to
% sparseAutoencoderLinearCost, and modify it so that the sparse autoencoder
% uses a linear decoder instead. Once that is done, you should check
% your gradients to verify that they are correct.
% NOTE: Modify sparseAutoencoderCost first!
% To speed up gradient checking, we will use a reduced network and some
% dummy patches
debugHiddenSize = 5;
debugvisibleSize = 8;
patches = rand([8 10]); % generate a random 8x10 matrix of dummy patches
theta = initializeParameters(debugHiddenSize, debugvisibleSize);
[cost, grad] = sparseAutoencoderLinearCost(theta, debugvisibleSize, debugHiddenSize, ...
lambda, sparsityParam, beta, ...
patches);
% Check gradients
numGrad = computeNumericalGradient( @(x) sparseAutoencoderLinearCost(x, debugvisibleSize, debugHiddenSize, ...
lambda, sparsityParam, beta, ...
patches), theta);
% Use this to visually compare the gradients side by side
disp([numGrad grad]);
diff = norm(numGrad-grad)/norm(numGrad+grad);
% Should be small. In our implementation, these values are usually less than 1e-9.
disp(diff);
assert(diff < 1e-9, 'Difference too large. Check your gradient computation again');
% NOTE: Once your gradients check out, you should run step 0 again to
% reinitialize the parameters
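The centered-difference check that `computeNumericalGradient` performs can be sketched in NumPy. This is only an illustration of the technique; `numerical_gradient` and the quadratic test function are made up here, not part of the exercise code:

```python
import numpy as np

def numerical_gradient(J, theta, eps=1e-4):
    """Centered-difference approximation of the gradient of J at theta."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        grad[i] = (J(theta + e) - J(theta - e)) / (2 * eps)
    return grad

# Sanity check on J(x) = sum(x.^2), whose exact gradient is 2x
theta = np.array([1.0, -2.0, 3.0])
num = numerical_gradient(lambda x: np.sum(x ** 2), theta)
ana = 2 * theta
diff = np.linalg.norm(num - ana) / np.linalg.norm(num + ana)
```

The relative difference `diff` is the same diagnostic the script prints: for a correct analytic gradient it should be vanishingly small.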
%%======================================================================
%% STEP 2: Learn features on small patches
% In this step, you will use your sparse autoencoder (which now uses a
% linear decoder) to learn features on small patches sampled from related
% images.
%% STEP 2a: Load patches
% In this step, we load 100k patches sampled from the STL10 dataset and
% visualize them. Note that these patches have been scaled to [0,1]
load stlSampledPatches.mat
displayColorNetwork(patches(:, 1:100));
%% STEP 2b: Apply preprocessing
% In this sub-step, we preprocess the sampled patches, in particular,
% ZCA whitening them.
%
% In a later exercise on convolution and pooling, you will need to replicate
% exactly the preprocessing steps you apply to these patches before
% using the autoencoder to learn features on them. Hence, we will save the
% ZCA whitening and mean image matrices together with the learned features
% later on.
% Subtract mean patch (hence zeroing the mean of the patches)
meanPatch = mean(patches, 2);                 % mean over patches, per pixel dimension
patches = bsxfun(@minus, patches, meanPatch); % subtract the mean from every patch
% Apply ZCA whitening
sigma = patches * patches' / numPatches;
[u, s, v] = svd(sigma);
ZCAWhite = u * diag(1 ./ sqrt(diag(s) + epsilon)) * u';
patches = ZCAWhite * patches;
displayColorNetwork(patches(:, 1:100));
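The same whitening pipeline can be sketched in NumPy (the shapes here are illustrative; `epsilon` plays the same regularizing role as in the MATLAB code above):

```python
import numpy as np

np.random.seed(0)
patches = np.random.rand(12, 500)  # columns are patches, as in the MATLAB script
epsilon = 0.1

# Zero the mean of each pixel dimension across patches
mean_patch = patches.mean(axis=1, keepdims=True)
patches = patches - mean_patch

# ZCA whitening: rotate into the PCA basis, rescale, rotate back
sigma = patches @ patches.T / patches.shape[1]
u, s, _ = np.linalg.svd(sigma)
zca_white = u @ np.diag(1.0 / np.sqrt(s + epsilon)) @ u.T
white = zca_white @ patches
```

Because the rotation is undone, `zca_white` is symmetric, and the whitened covariance is `u * diag(s/(s+epsilon)) * u'`, which approaches the identity as `epsilon` goes to zero.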
%% STEP 2c: Learn features
% You will now use your sparse autoencoder (with linear decoder) to learn
% features on the preprocessed patches. This should take around 45 minutes.
theta = initializeParameters(hiddenSize, visibleSize);
% Use minFunc to minimize the function
addpath minFunc/
options = struct; % options struct for minFunc
options.Method = 'lbfgs';
options.maxIter = 400;
options.display = 'on';
[optTheta, cost] = minFunc( @(p) sparseAutoencoderLinearCost(p, ...
visibleSize, hiddenSize, ...
lambda, sparsityParam, ...
beta, patches), ...
theta, options);
% Save the learned features and the preprocessing matrices for use in
% the later exercise on convolution and pooling
fprintf('Saving learned features and preprocessing matrices...\n');
save('STL10Features.mat', 'optTheta', 'ZCAWhite', 'meanPatch');
fprintf('Saved\n');
%% STEP 2d: Visualize learned features
W = reshape(optTheta(1:visibleSize * hiddenSize), hiddenSize, visibleSize);
b = optTheta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);
displayColorNetwork( (W*ZCAWhite)');
Programming exercise: sparseAutoencoderLinearCost.m
function [cost,grad,features] = sparseAutoencoderLinearCost(theta, visibleSize, hiddenSize, ...
lambda, sparsityParam, beta, data)
% -------------------- YOUR CODE HERE --------------------
% Instructions:
% Copy sparseAutoencoderCost in sparseAutoencoderCost.m from your
% earlier exercise onto this file, renaming the function to
% sparseAutoencoderLinearCost, and changing the autoencoder to use a
% linear decoder.
% -------------------- YOUR CODE HERE --------------------
% The input theta is a vector because minFunc only deals with vectors. In
% this step, we convert theta to matrix form so that the parameters
% follow the notation in the lecture notes.
W1 = reshape(theta(1:hiddenSize*visibleSize), hiddenSize, visibleSize);
W2 = reshape(theta(hiddenSize*visibleSize+1:2*hiddenSize*visibleSize), visibleSize, hiddenSize);
b1 = theta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);
b2 = theta(2*hiddenSize*visibleSize+hiddenSize+1:end);
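The unpacking above (and the re-flattening at the end of the function) can be mirrored in NumPy; note that MATLAB's `reshape` is column-major, which corresponds to `order='F'`:

```python
import numpy as np

hidden, visible = 5, 8
n = 2 * hidden * visible + hidden + visible
theta = np.arange(n, dtype=float)  # stand-in for the flat parameter vector

# Unpack [W1(:); W2(:); b1; b2] using MATLAB's column-major layout
W1 = theta[:hidden * visible].reshape(hidden, visible, order='F')
W2 = theta[hidden * visible:2 * hidden * visible].reshape(visible, hidden, order='F')
b1 = theta[2 * hidden * visible:2 * hidden * visible + hidden]
b2 = theta[2 * hidden * visible + hidden:]

# Re-flattening in the same order recovers the original vector
theta_roundtrip = np.concatenate(
    [W1.flatten(order='F'), W2.flatten(order='F'), b1, b2])
```

Any consistent pack/unpack convention works for optimization; what matters is that the gradient vector is assembled in exactly the same order as the parameters.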
% Loss and gradient variables (your code needs to compute these values)
m = size(data, 2); % number of training examples
%% ---------- YOUR CODE HERE --------------------------------------
% Instructions: Compute the loss for the Sparse Autoencoder and gradients
% W1grad, W2grad, b1grad, b2grad
%
% Hint: 1) data(:,i) is the i-th example
% 2) your computation of loss and gradients should match the size
% above for loss, W1grad, W2grad, b1grad, b2grad
% z2 = W1 * x + b1
% a2 = f(z2)
% z3 = W2 * a2 + b2
% h_Wb = a3 = f(z3)
z2 = W1 * data + repmat(b1, [1, m]);
a2 = sigmoid(z2);
z3 = W2 * a2 + repmat(b2, [1, m]);
a3 = z3;
rhohats = mean(a2,2);
rho = sparsityParam;
KLsum = sum(rho * log(rho ./ rhohats) + (1-rho) * log((1-rho) ./ (1-rhohats)));
squares = (a3 - data).^2;
squared_err_J = (1/2) * (1/m) * sum(squares(:));
weight_decay_J = (lambda/2) * (sum(W1(:).^2) + sum(W2(:).^2));
sparsity_J = beta * KLsum;
cost = squared_err_J + weight_decay_J + sparsity_J; % total cost
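The KL sparsity penalty computed above can be checked in isolation (the `rhohat` values below are made up for illustration):

```python
import numpy as np

rho = 0.035  # target average activation
kl = lambda rhohat: np.sum(rho * np.log(rho / rhohat)
                           + (1 - rho) * np.log((1 - rho) / (1 - rhohat)))

# Zero penalty when every hidden unit hits the target exactly...
at_target = kl(np.full(3, rho))
# ...and a strictly positive penalty otherwise
off_target = kl(np.array([0.035, 0.05, 0.02]))
```

This is why the penalty drives the average activations toward `sparsityParam`: it is minimized (at zero) exactly when every `rhohats(i)` equals `rho`.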
% delta3 = -(data - a3) .* fprime(z3);
% With a linear decoder, f(z3) = z3, so fprime(z3) = 1.
delta3 = -(data - a3);
beta_term = beta * (- rho ./ rhohats + (1-rho) ./ (1-rhohats));
delta2 = ((W2' * delta3) + repmat(beta_term, [1,m]) ) .* a2 .* (1-a2);
W2grad = (1/m) * delta3 * a2' + lambda * W2;
b2grad = (1/m) * sum(delta3, 2);
W1grad = (1/m) * delta2 * data' + lambda * W1;
b1grad = (1/m) * sum(delta2, 2);
%-------------------------------------------------------------------
% Convert weights and bias gradients to a compressed form
% This step will concatenate and flatten all your gradients to a vector
% which can be used in the optimization method.
grad = [W1grad(:) ; W2grad(:) ; b1grad(:) ; b2grad(:)];
end
%-------------------------------------------------------------------
% We are giving you the sigmoid function, you may find this function
% useful in your computation of the loss and the gradients.
function sigm = sigmoid(x)
sigm = 1 ./ (1 + exp(-x));
end
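Putting the pieces together, the whole cost/gradient computation with a linear decoder can be sketched in NumPy and verified against centered differences. This is a hand translation of the MATLAB above for illustration, not a drop-in replacement:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sparse_ae_linear_cost(theta, visible, hidden, lam, rho, beta, data):
    """Cost and gradient of a sparse autoencoder with a linear decoder."""
    hv = hidden * visible
    W1 = theta[:hv].reshape(hidden, visible)
    W2 = theta[hv:2 * hv].reshape(visible, hidden)
    b1 = theta[2 * hv:2 * hv + hidden]
    b2 = theta[2 * hv + hidden:]
    m = data.shape[1]

    # Forward pass; the output layer is linear: a3 = z3
    a2 = sigmoid(W1 @ data + b1[:, None])
    a3 = W2 @ a2 + b2[:, None]

    rhohat = a2.mean(axis=1)
    kl = np.sum(rho * np.log(rho / rhohat)
                + (1 - rho) * np.log((1 - rho) / (1 - rhohat)))
    cost = (0.5 / m) * np.sum((a3 - data) ** 2) \
        + (lam / 2) * (np.sum(W1 ** 2) + np.sum(W2 ** 2)) + beta * kl

    # Backward pass; no f'(z3) factor because the decoder is linear
    d3 = a3 - data
    sparsity = beta * (-rho / rhohat + (1 - rho) / (1 - rhohat))
    d2 = (W2.T @ d3 + sparsity[:, None]) * a2 * (1 - a2)

    grad = np.concatenate([
        (d2 @ data.T / m + lam * W1).ravel(),
        (d3 @ a2.T / m + lam * W2).ravel(),
        d2.mean(axis=1),
        d3.mean(axis=1),
    ])
    return cost, grad
```

Running the same relative-difference gradient check as in step 1 against this function should give a value well below 1e-6 at a random small initialization.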