Python Learning (Machine Learning: Linear Regression with Multiple Variables)
Continuing from the previous post, this is the second exercise of assignment 1 (linear regression with multiple variables). There are two input variables (house area and number of bedrooms), and the code below solves the problem in two ways:
1. Scale the features, then find theta by gradient descent.
2. Plug directly into the derived closed-form formula for theta (the normal equation).
Checking both against a test input shows that the two methods produce the same prediction.
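For reference, these are the standard formulas from the course that the code below implements: feature scaling, the batch gradient descent update, and the closed-form normal equation. Nothing here goes beyond what the code does; it is just the textbook notation.

x_j := \frac{x_j - \mu_j}{\sigma_j}, \qquad
\theta := \theta - \frac{\alpha}{m} X^T (X\theta - y), \qquad
\theta = (X^T X)^{-1} X^T y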
# Part 1: Load Data
import numpy as np

print "Loading data...\n"
f = open(r'C:\Python27\machinelearning\ex1data2.txt')  # raw string so backslashes are not treated as escapes
data = []
for line in f:
    data.append(map(float, line.split(",")))
f.close()
m = len(data)

# Convert the data from a list into a matrix
data = np.asmatrix(data)
# Extract the individual columns: house area, number of bedrooms, price
X = data[:, 0]
N = data[:, 1]
y = data[:, 2]
X = np.hstack((X, N))  # after horizontal stacking, X holds both variables: area and number of bedrooms

print "First 10 examples from the dataset:\n"
print "X=\n", X[0:10]
print "y=\n", y[0:10]

# Part 2: Gradient Descent
print "Normalizing Features...\n"

# Feature normalization function -- reference:
# http://sobuhu.com/ml/2012/12/29/normalization-regularization.html
def featureNormalize(X):
    X_norm = X.copy()
    m = len(X)
    mu = np.matrix((np.mean(X[:, 0]), np.mean(X[:, 1])))     # column means
    sigma = np.matrix((np.std(X[:, 0]), np.std(X[:, 1])))    # column standard deviations
    for i in range(0, m):
        X_norm[i, :] = (X[i, :] - mu) / sigma                # (x - mu) / sigma normalizes each variable
    return mu, sigma, X_norm

[mu, sigma, X] = featureNormalize(X)

# Add intercept term to X
one = np.ones((m, 1))
X = np.hstack((one, X))

alpha = 0.3               # learning rate (must be chosen appropriately)
num_iters = 100           # number of gradient descent iterations
theta = np.zeros((3, 1))  # initialize theta as a 3x1 column vector

# Cost function
def computeCostMulti(X, y, theta):
    m = len(y)
    J = np.sum(np.multiply((X * theta - y), (X * theta - y))) / (2 * m)
    return J

# Gradient descent
def gradientDescentMulti(X, y, theta, alpha, num_iters, m):
    n = X.shape[1]
    J_history = []
    for i in range(0, num_iters):
        H = X * theta  # X is an np.matrix, so * is matrix multiplication; with plain arrays use np.dot(X, theta)
        T = np.zeros((n, 1))  # accumulator for the partial derivatives
        # Accumulate the gradient contribution of every training example
        for j in range(0, m):
            T = T + ((H[j] - y[j]) * X[j, :]).T
        theta = theta - (alpha * T) / m
        J_history.append(computeCostMulti(X, y, theta))
    return theta, J_history

[theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters, m)

# Plot the convergence of the cost function
import matplotlib.pyplot as plt
x = range(1, num_iters + 1)
plt.plot(x, J_history, 'r')
plt.xlabel("Number of iterations")
plt.ylabel("Cost J")
plt.show()

print "Theta computed from gradient descent:\n"
print "theta=\n", theta
print "To estimate the price of a 1650 sq-ft,3 br house\n"
# Normalize the query point with the same mu and sigma as the training data
x1 = np.matrix([1650, 3])
x1 = (x1 - mu) / sigma
X1 = np.matrix([1, x1[0, 0], x1[0, 1]])
price = X1 * theta
print "Price=", price

# Part 3: Normal Equations
print "Solving with normal equations\n"
# Re-extract X and y, since they were normalized above
X = data[:, 0]
N = data[:, 1]
y = data[:, 2]
X = np.hstack((X, N))
one = np.ones((m, 1))
X = np.hstack((one, X))

def normEquation(X, y):
    theta = np.linalg.inv(X.T * X) * X.T * y  # plug into the derived closed-form formula for theta
    return theta

theta = normEquation(X, y)
print "theta=\n", theta
print "To estimate the price of a 1650 sq-ft,3 br house\n"
price = np.matrix([1, 1650, 3]) * theta
print "price=\n", price
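As an aside, the inner example-by-example loop in gradientDescentMulti can be collapsed into one matrix expression. The following is a minimal sketch (not part of the original script) of a drop-in vectorized replacement, assuming the same np.matrix types and the computeCostMulti function defined above:

# Vectorized gradient descent: X.T * (X*theta - y) sums the per-example
# contributions that the inner for-loop above accumulates one at a time.
def gradientDescentVectorized(X, y, theta, alpha, num_iters):
    m = len(y)
    J_history = []
    for i in range(0, num_iters):
        theta = theta - (alpha / m) * X.T * (X * theta - y)
        J_history.append(computeCostMulti(X, y, theta))
    return theta, J_history

With the same alpha and num_iters this yields the same theta as the looped version, only without the per-example Python loop.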
result:
Loading data...
First 10 examples from the dataset:
X=
[[ 2.10400000e+03 3.00000000e+00]
[ 1.60000000e+03 3.00000000e+00]
[ 2.40000000e+03 3.00000000e+00]
[ 1.41600000e+03 2.00000000e+00]
[ 3.00000000e+03 4.00000000e+00]
[ 1.98500000e+03 4.00000000e+00]
[ 1.53400000e+03 3.00000000e+00]
[ 1.42700000e+03 3.00000000e+00]
[ 1.38000000e+03 3.00000000e+00]
[ 1.49400000e+03 3.00000000e+00]]
y=
[[ 399900.]
[ 329900.]
[ 369000.]
[ 232000.]
[ 539900.]
[ 299900.]
[ 314900.]
[ 198999.]
[ 212000.]
[ 242500.]]
Normalizing Features...
Theta computed from gradient descent:
theta=
[[ 340412.65957447]
[ 109447.75525931]
[ -6578.31364383]]
To estimate the price of a 1650 sq-ft,3 br house
Price= [[ 293081.47339913]]
Solving with normal equations
theta=
[[ 89597.9095428 ]
[ 139.21067402]
[ -8738.01911233]]
To estimate the price of a 1650 sq-ft,3 br house
price=
[[ 293081.46433489]]
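The two theta vectors look completely different only because gradient descent ran on normalized features while the normal equation ran on the raw ones. To check that they agree, you can map the scaled theta back to the raw feature scale. The helper below is hypothetical (not in the original script); it assumes you saved the gradient-descent result as theta_gd before Part 3 overwrote theta, and it reuses the mu and sigma computed by featureNormalize:

# unscaleTheta is a hypothetical helper: it converts theta learned on
# normalized features back to the raw feature scale, using
# h(x) = t0 + t1*(x1 - mu1)/sigma1 + t2*(x2 - mu2)/sigma2.
def unscaleTheta(theta_gd, mu, sigma):
    t0 = theta_gd[0, 0] \
         - theta_gd[1, 0] * mu[0, 0] / sigma[0, 0] \
         - theta_gd[2, 0] * mu[0, 1] / sigma[0, 1]
    t1 = theta_gd[1, 0] / sigma[0, 0]
    t2 = theta_gd[2, 0] / sigma[0, 1]
    return np.matrix([[t0], [t1], [t2]])

Up to the small error left after 100 iterations, the unscaled vector matches the normal-equation theta, which is why both methods predict essentially the same price for the 1650 sq-ft, 3-bedroom house.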