
Python Learning (Machine Learning: Linear Regression with Multiple Variables)

2017-01-07 16:27
Following up on the previous post, this is the second exercise of Problem Set 1 (linear regression with multiple variables), using two features: house size and number of bedrooms. The code below solves for theta in two ways (the underlying formulas are sketched right after this list):

Scale the features and solve for theta with gradient descent

Plug directly into the derived closed-form (normal equation) formula for theta

Evaluating both on the same test input confirms that the two methods produce the same prediction.
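For reference, both methods minimize the same squared-error cost, and the normal equation is simply its closed-form minimizer. In LaTeX notation, with m training examples and x_0 = 1 as the intercept feature:

h_\theta(x) = \theta^T x

J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2

\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}

\theta = (X^T X)^{-1} X^T y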

#Part 1: Load Data
print "Loading data...\n"
import numpy as np

#Use a raw string so the backslashes in the Windows path are not treated as escapes
f = open(r'C:\Python27\machinelearning\ex1data2.txt')
data = []
for line in f:
    data.append(map(float, line.split(",")))  #parse each comma-separated row into floats
f.close()
m = len(data)
#Convert the data from a list of rows into a matrix
data = np.asmatrix(data)
#Extract the columns: size, number of bedrooms, price
X = data[:, 0]
N = data[:, 1]
y = data[:, 2]
X = np.hstack((X, N))
#After the horizontal stack, X holds both features: house size and bedroom count
print "First 10 examples from the dataset:\n"
print "X=\n", X[0:10]
print "y=\n", y[0:10]

#Part 2: Gradient Descent
#Normalize the features before running gradient descent
print "Normalizing Features...\n"

#Feature normalization function (reference: http://sobuhu.com/ml/2012/12/29/normalization-regularization.html)
def featureNormalize(X):
    #Work on a copy so the caller's matrix is not modified in place
    X_norm = X.copy()
    m = len(X)
    #Per-column mean and standard deviation
    mu = np.matrix((np.mean(X[:, 0]), np.mean(X[:, 1])))
    sigma = np.matrix((np.std(X[:, 0]), np.std(X[:, 1])))
    #Normalizing a value means computing (x - mu) / sigma
    for i in range(0, m):
        X_norm[i, :] = (X[i, :] - mu) / sigma
    return mu, sigma, X_norm
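As an aside, the per-row loop can be collapsed into a single broadcast expression. A minimal vectorized sketch (featureNormalizeVec is a name introduced here for illustration, not part of the original exercise):

#Vectorized variant: numpy broadcasts the 1x2 mu and sigma across all m rows
def featureNormalizeVec(X):
    mu = np.mean(X, axis=0)     #1x2 matrix of column means
    sigma = np.std(X, axis=0)   #1x2 matrix of column standard deviations
    X_norm = (X - mu) / sigma   #elementwise, broadcast over the rows
    return mu, sigma, X_norm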

[mu, sigma, X] = featureNormalize(X)
#Add intercept term to X
one = np.ones((m, 1))
X = np.hstack((one, X))
alpha = 0.3      #learning rate for the theta updates (must be chosen carefully)
num_iters = 100  #number of iterations
theta = np.zeros((3, 1))  #initialize theta as a 3x1 array

#Define the cost function
def computeCostMulti(X, y, theta):
    m = len(y)
    J = np.sum(np.multiply((X * theta - y), (X * theta - y))) / (2 * m)
    return J
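A subtlety worth noting, since the cost above mixes both operations: for np.matrix objects, * is a matrix product and np.multiply is elementwise, whereas for plain ndarrays * is elementwise and matrix products need np.dot. A small illustrative sketch (the 2x2 values are arbitrary):

A = np.matrix([[1, 2], [3, 4]])
print A * A              #matrix product:  [[ 7 10] [15 22]]
print np.multiply(A, A)  #elementwise:     [[ 1  4] [ 9 16]]
B = np.array([[1, 2], [3, 4]])
print B * B              #elementwise for ndarrays
print np.dot(B, B)       #matrix product for ndarrays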

#Define the gradient descent routine
def gradientDescentMulti(X, y, theta, alpha, num_iters, m):
    n = X.shape[1]
    J_history = []
    for i in range(0, num_iters):
        H = X * theta  #X is a matrix, so * is fine here; an ndarray would need np.dot(X, theta)
        T = np.zeros((n, 1))  #accumulator for the partial derivatives
        #Sum each training example's contribution to the gradient
        for j in range(0, m):
            T = T + ((H[j] - y[j]) * X[j, :]).T
        theta = theta - (alpha * T) / m
        J_history.append(computeCostMulti(X, y, theta))
    return theta, J_history
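The inner loop over the training examples can also be written as one matrix expression. A minimal vectorized sketch computing the same update (gradientDescentMultiVec is a name introduced here for illustration):

#Vectorized update: theta := theta - (alpha/m) * X'(X*theta - y)
def gradientDescentMultiVec(X, y, theta, alpha, num_iters, m):
    J_history = []
    for i in range(0, num_iters):
        theta = theta - (alpha / m) * (X.T * (X * theta - y))
        J_history.append(computeCostMulti(X, y, theta))
    return theta, J_history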

[theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters, m)
#Plot the cost against the iteration number to check convergence
import matplotlib.pyplot as plt
x = range(1, num_iters + 1)
plt.plot(x, J_history, 'r')
plt.xlabel("Number of iterations")
plt.ylabel("Cost J")
plt.show()

print "Theta computed from gradient descent:\n "
print "theta=\n",theta
print "To estimate the price of a 1650 sq-ft,3 br house\n"
#x1参数标准规则化
x1=np.matrix([1650,3])
x1=(x1-mu)/sigma
X1=np.matrix([1,x1[0,0],x1[0,1]])
price=X1*theta
print "Price=",price

#Part 3: Normal Equations
print "Solving with normal equations\n"
#Rebuild X and y from the raw (unnormalized) data
X = data[:, 0]
N = data[:, 1]
y = data[:, 2]
X = np.hstack((X, N))
one = np.ones((m, 1))
X = np.hstack((one, X))

def normEquation(X, y):
    #Apply the closed-form solution: theta = (X'X)^(-1) X'y
    theta = np.linalg.inv(X.T * X) * X.T * y
    return theta
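If X'X were singular or nearly so (for example with redundant features), np.linalg.inv would fail or be numerically unstable. A common variant, shown here as a sketch rather than part of the original exercise, substitutes the pseudo-inverse:

#Variant using the pseudo-inverse, which tolerates a singular X'X
def normEquationPinv(X, y):
    theta = np.linalg.pinv(X.T * X) * X.T * y
    return theta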

theta = normEquation(X, y)
print "theta=\n", theta
print "To estimate the price of a 1650 sq-ft, 3 br house\n"
#No normalization of the query is needed: the normal equation used the raw features
price = np.matrix([1, 1650, 3]) * theta
print "price=\n", price


Result:

Loading data...

First 10 examples from the dataset:

X=
[[  2.10400000e+03   3.00000000e+00]
 [  1.60000000e+03   3.00000000e+00]
 [  2.40000000e+03   3.00000000e+00]
 [  1.41600000e+03   2.00000000e+00]
 [  3.00000000e+03   4.00000000e+00]
 [  1.98500000e+03   4.00000000e+00]
 [  1.53400000e+03   3.00000000e+00]
 [  1.42700000e+03   3.00000000e+00]
 [  1.38000000e+03   3.00000000e+00]
 [  1.49400000e+03   3.00000000e+00]]
y=
[[ 399900.]
 [ 329900.]
 [ 369000.]
 [ 232000.]
 [ 539900.]
 [ 299900.]
 [ 314900.]
 [ 198999.]
 [ 212000.]
 [ 242500.]]

Normalizing Features...

Theta computed from gradient descent:

theta=
[[ 340412.65957447]
 [ 109447.75525931]
 [  -6578.31364383]]

To estimate the price of a 1650 sq-ft, 3 br house

Price= [[ 293081.47339913]]

Solving with normal equations

theta=
[[ 89597.9095428 ]
 [   139.21067402]
 [ -8738.01911233]]

To estimate the price of a 1650 sq-ft, 3 br house

price=
[[ 293081.46433489]]
