
Machine Learning Week 3

2015-11-01 22:12
Linear Regression can't be used for Classification Problems
Linear Regression isn't Working Well for the Classification Problem

Logistic Regression Model

Decision Boundary

Cost Function and Gradient Descent for Logistic Regression

Multiclass Classification
One Vs All

Problem of Overfitting
How to solve overfitting - Regularisation

Cost Function and Gradient Descent With Regularization

Normal Equation

Regularized Logistic Regression

Weekly Matlab Exercise

Linear Regression can’t be used for Classification Problems

Linear Regression isn’t Working Well for the Classification Problem

A single unusual point (an outlier) can shift the whole fitted linear function and therefore cause errors in the classification.



Logistic Regression Model

Sigmoid Function or Logistic Function

$$h_\theta(x) = g(\theta^T x)$$

$$g(z) = \frac{1}{1 + e^{-z}}$$



Decision Boundary

The prediction rule is: if $h_\theta(x) \ge 0.5$ then $y = 1$, and if $h_\theta(x) < 0.5$ then $y = 0$. This is equivalent to: if $\theta^T x \ge 0$ then $y = 1$, and if $\theta^T x < 0$ then $y = 0$.
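
As an illustration (not part of the original notes), here is a minimal Matlab/Octave sketch of this rule; the values of theta and x below are made up:

% hypothetical parameters and one example with an intercept term
theta = [-3; 1; 1];              % decision boundary: x1 + x2 = 3
x = [1; 2; 2];                   % x0 = 1, x1 = 2, x2 = 2
h = 1 / (1 + exp(-theta' * x));  % h = g(theta' * x), roughly 0.73 here
y_pred = (theta' * x >= 0);      % same as (h >= 0.5), so y_pred = 1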



Example:



Cost Function and Gradient Descent for Logistic Regression

Cost Function for Logistic Regression





Putting the two cases together, the overall cost is:

$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\mathrm{Cost}\left(h_\theta(x^{(i)}), y^{(i)}\right)$$

for linear regression:

$$\mathrm{Cost}\left(h_\theta(x^{(i)}), y^{(i)}\right) = \frac{1}{2}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$$

for logistic regression:

$$\mathrm{Cost}\left(h_\theta(x), y\right) = -y\log\left(h_\theta(x)\right) - (1 - y)\log\left(1 - h_\theta(x)\right)$$
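
For intuition (a small worked example, not in the original notes): if $y = 1$ and the model is fairly confident with $h_\theta(x) = 0.9$, the cost is $-\log(0.9) \approx 0.105$; if it is confidently wrong with $h_\theta(x) = 0.1$, the cost jumps to $-\log(0.1) \approx 2.303$, and it grows without bound as $h_\theta(x) \to 0$.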

The gradient descent update for logistic regression has the same form as for linear regression:

$$\theta_j := \theta_j - \alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$
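
A minimal Matlab/Octave sketch of this update loop, assuming a design matrix X (m by n+1, with a column of ones), a label vector y, an initial theta and m already exist, and that sigmoid is the helper defined in the exercise below; the learning rate and iteration count are arbitrary illustrative choices:

alpha = 0.01;                       % learning rate (illustrative value)
for iter = 1:400
    h = sigmoid(X * theta);         % m x 1 vector of predictions
    grad = (1/m) * X' * (h - y);    % vectorized gradient over all examples
    theta = theta - alpha * grad;   % simultaneous update of every theta_j
end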



Several more advanced optimization algorithms (for example conjugate gradient, BFGS and L-BFGS) can be used instead of gradient descent.



Multiclass Classification

One Vs All
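
The notes only name the idea, so here is a hedged Matlab/Octave sketch of one-vs-all prediction. It assumes a hypothetical matrix all_theta of size K by (n+1), whose k-th row holds the parameters of the classifier trained for class k, and that class labels are 1..K:

% probability of each of the K classes for every example (m x K matrix)
probs = sigmoid(X * all_theta');
% pick, for each example, the class whose classifier is most confident
[~, p] = max(probs, [], 2);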



Problem of Overfitting

Underfitting - Just right - Overfitting



How to solve overfitting - Regularisation



Intuition of Regularisation - to make θ small



Shrink all the parameters (the regularization sum starts from θ1, not θ0):

$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]$$



Regularisation Parameters



But the regularization parameter λ can't be too large; otherwise all the θj will be driven close to 0 and the model will underfit.



Cost Function and Gradient Descent With Regularization

Cost Function:

$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]$$



Gradient Descent:

$$\theta_0 := \theta_0 - \alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)}$$

$$\theta_j := \theta_j - \alpha\left[\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\theta_j\right] \qquad (j = 1, 2, 3, \dots, n)$$
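
Rearranging the θj update makes the effect of regularization explicit: each step first shrinks θj by a factor slightly less than 1 and then applies the usual gradient step:

$$\theta_j := \theta_j\left(1 - \alpha\frac{\lambda}{m}\right) - \alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$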



Normal Equation



The θ corresponding to the global minimum:

$$\theta = \left(X^T X + \lambda\begin{bmatrix}0 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1\end{bmatrix}\right)^{-1} X^T y$$

Adding the λ term also makes the matrix invertible even when $X^T X$ itself is non-invertible.
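
A minimal Matlab/Octave sketch of this formula, assuming X, y and lambda already exist and n is the number of features (so X has n + 1 columns including the intercept):

L = eye(n + 1);                              % (n+1) x (n+1) identity
L(1, 1) = 0;                                 % do not regularize the intercept theta_0
theta = (X' * X + lambda * L) \ (X' * y);    % backslash solve instead of an explicit inverse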



Regularized Logistic Regression

Cost Function:

$$J(\theta) = -\left[\frac{1}{m}\sum_{i=1}^{m} y^{(i)}\log\left(h_\theta(x^{(i)})\right) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$



Repeat the gradient descent updates until convergence:

$$\theta_0 := \theta_0 - \alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)}$$

$$\theta_j := \theta_j - \alpha\left[\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\theta_j\right] \qquad (j = 1, 2, 3, \dots, n)$$

where $h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$.



Matlab syntax:

function [jVal, gradient] = costFunction(theta)   % returns the cost and its gradient at theta

[optTheta, jVal, exitFlag] = fminunc(@costFunction, initialTheta, options);



Weekly Matlab Exercise

%sigmoid:

function g = sigmoid(z)
g = zeros(size(z));

g = 1 ./ (1 + exp(-z));   % element-wise, works for scalars, vectors and matrices

end

%plotData:

function plotData(X, y)
figure; hold on;

pos = find(y == 1); neg = find(y == 0);   % indices of positive and negative examples
plot(X(pos, 1), X(pos, 2), 'k+', 'LineWidth', 2, 'MarkerSize', 7);            % positives as +
plot(X(neg, 1), X(neg, 2), 'ko', 'MarkerFaceColor', 'y', 'MarkerSize', 7);    % negatives as o

hold off;
end

%costFunction(without regularisation):

function [J, grad] = costFunction(theta, X, y)
m = length(y); % number of training examples
J = 0;
grad = zeros(size(theta));

J = 1/m * (-log(sigmoid(X*theta))' * y - log(1 - sigmoid(X*theta))' * (1 - y));   % vectorized cross-entropy cost
grad = 1/m * ((sigmoid(X*theta) - y)' * X)';                                      % vectorized gradient

end

%predict:

function p = predict(theta, X)
m = size(X, 1); % Number of training examples
p = zeros(m, 1);

p = round(sigmoid(X * theta));   % h_theta(x) >= 0.5 rounds to 1, otherwise 0

end

%costFunction(with regularisation):

function [J, grad] = costFunctionReg(theta, X, y, lambda)
m = length(y); % number of training examples
J = 0;
grad = zeros(size(theta));

theta2 = theta(2:end);                          % parameters excluding theta_0 (not regularized)
J = 1/m * (-y' * log(sigmoid(X*theta)) - (1 - y)' * log(1 - sigmoid(X*theta))) + lambda/(2*m) * (theta2' * theta2);
grad1 = (1/m * (sigmoid(X*theta) - y)' * X)';   % unregularized gradient
theta1 = lambda/m * theta;                      % regularization term for the gradient...
theta1(1) = 0;                                  % ...except for theta_0, which is not regularized
grad = grad1 + theta1;

end

%use the fminunc function:

initial_theta = zeros(size(X, 2), 1);
% Set regularization parameter lambda to 1 (you should vary this)
lambda = 1;
% Set Options
options = optimset('GradObj', 'on', 'MaxIter', 400);
% Optimize
[theta, J, exit_flag] = ...
fminunc(@(t)(costFunctionReg(t, X, y, lambda)), initial_theta, options);
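
Once fminunc returns, the learned theta can be checked on the training set; a short sketch using the predict function defined above:

p = predict(theta, X);                                     % 0/1 predictions on the training set
fprintf('Train accuracy: %f\n', mean(double(p == y)) * 100);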