
Andrew Ng Machine Learning EX1 Assignment, Part 2: Linear Regression with Multiple Variables

2019-03-19 15:23

2 Linear regression with multiple variables

2.1 Assignment overview

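In this part, you will implement linear regression with multiple variables to predict house prices. Suppose you are selling a house and want to know what a good market price would be. One way to find out is to first collect information on recently sold houses and build a model of house prices.
The file ex1data2.txt (download the dataset yourself online) contains a training set of house prices in Portland, Oregon. The first column is the size of the house (in square feet), the second column is the number of bedrooms, and the third column is the price of the house.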

2.2 Importing modules

import matplotlib.pyplot as plt
import numpy as np
from featureNormalize import *  # feature normalization module
from gradientDescent import *  # batch gradient descent module
from normalEqn import *  # normal equation module

2.3 Loading the data

plt.ion()

# ===================== Part 1: Feature Normalization =====================
data = np.loadtxt('ex1data2.txt', delimiter=',', dtype=np.int64)
X = data[:, 0:2]
y = data[:, 2]
m = y.size

2.4 Viewing the first 10 training examples and target values

# Print out some data points
print('First 10 examples from the dataset: ')
for i in range(0, 10):
    print('x = {}, y = {}'.format(X[i], y[i]))

First 10 examples from the dataset:
x = [2104    3], y = 399900
x = [1600    3], y = 329900
x = [2400    3], y = 369000
x = [1416    2], y = 232000
x = [3000    4], y = 539900
x = [1985    4], y = 299900
x = [1534    3], y = 314900
x = [1427    3], y = 198999
x = [1380    3], y = 212000
x = [1494    3], y = 242500

2.5 Feature normalization function (featureNormalize.py)

import numpy as np


def feature_normalize(X):
    # Scale every feature (column) to zero mean and unit standard deviation
    mu = np.mean(X, axis=0)  # per-feature mean, taken down the rows (axis=0)
    sigma = np.std(X, axis=0)  # per-feature standard deviation
    X_norm = (X - mu) / sigma  # standardize every example

    return X_norm, mu, sigma
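As a quick sanity check on the function above, the sketch below applies it to the first five examples printed in section 2.4 and verifies that each normalized column has mean 0 and standard deviation 1. The variable names (X_sample, col_means, col_stds) are made up for this illustration. Note that np.std defaults to the population standard deviation (ddof=0), while Octave's std divides by n - 1, so results can differ slightly from the Octave version of the exercise.

```python
import numpy as np


def feature_normalize(X):
    # Same logic as featureNormalize.py above
    mu = np.mean(X, axis=0)
    sigma = np.std(X, axis=0)
    return (X - mu) / sigma, mu, sigma


# First five examples from section 2.4: [size in sq ft, bedrooms]
X_sample = np.array([[2104, 3],
                     [1600, 3],
                     [2400, 3],
                     [1416, 2],
                     [3000, 4]], dtype=float)

X_norm, mu, sigma = feature_normalize(X_sample)

# After normalization every column should have mean ~0 and std ~1
col_means = X_norm.mean(axis=0)
col_stds = X_norm.std(axis=0)
print('mu =', mu, 'sigma =', sigma)
print('column means =', col_means, 'column stds =', col_stds)
```

The returned mu and sigma must be kept, because any later query point has to be scaled with the same values before prediction.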

2.6 Normalizing the features

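a. Normalization does not include the bias unit; add the bias unit only after normalizing.
b. Only the input features are normalized; the target values (y) are left unscaled.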

# Scale features and set them to zero mean
X, mu, sigma = feature_normalize(X)
X = np.c_[np.ones(m), X]  # Add a column of ones to X
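The ordering in point a matters because a column of ones has zero standard deviation, so normalizing it would divide by zero. The following sketch, using five examples from section 2.4 and illustrative variable names, shows both the problem and the correct order.

```python
import numpy as np

# Five examples from section 2.4: [size in sq ft, bedrooms]
X_raw = np.array([[2104, 3],
                  [1600, 3],
                  [2400, 3],
                  [1416, 2],
                  [3000, 4]], dtype=float)
m = X_raw.shape[0]

# Wrong order: add the bias column first, then look at the column spread.
X_bias_first = np.c_[np.ones(m), X_raw]
sigma_bias_first = np.std(X_bias_first, axis=0)
print('sigma with bias column included:', sigma_bias_first)  # first entry is 0

# Right order: normalize the real features, then prepend the ones.
mu = X_raw.mean(axis=0)
sigma = X_raw.std(axis=0)
X_ready = np.c_[np.ones(m), (X_raw - mu) / sigma]
print('first column after correct ordering:', X_ready[:, 0])
```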

2.7 Computing the cost and updating theta with batch gradient descent

Batch gradient descent is the same for a single variable and for multiple variables, and so is the cost function; see Part 1 of ex1 for details.

# Choose some alpha value
alpha = 0.03
num_iters = 400

# Init theta and Run Gradient Descent
theta = np.zeros(3)
theta, J_history = gradient_descent_multi(X, y, theta, alpha, num_iters)
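The source of gradientDescent.py is not shown in this article. The sketch below is one plausible vectorized version, assuming only what the call above reveals: the signature gradient_descent_multi(X, y, theta, alpha, num_iters) and a return value of (theta, J_history). The helper compute_cost and the sample data (the ten examples from section 2.4) are assumptions made for illustration.

```python
import numpy as np


def compute_cost(X, y, theta):
    # Squared-error cost J = 1/(2m) * sum((X theta - y)^2)
    m = y.size
    residual = X.dot(theta) - y
    return residual.dot(residual) / (2 * m)


def gradient_descent_multi(X, y, theta, alpha, num_iters):
    # Batch gradient descent: every step uses all m examples
    m = y.size
    J_history = np.zeros(num_iters)
    for i in range(num_iters):
        gradient = X.T.dot(X.dot(theta) - y) / m
        theta = theta - alpha * gradient
        J_history[i] = compute_cost(X, y, theta)
    return theta, J_history


# Ten examples from section 2.4: size, bedrooms, price
data = np.array([[2104, 3, 399900], [1600, 3, 329900], [2400, 3, 369000],
                 [1416, 2, 232000], [3000, 4, 539900], [1985, 4, 299900],
                 [1534, 3, 314900], [1427, 3, 198999], [1380, 3, 212000],
                 [1494, 3, 242500]], dtype=float)
X_raw, y = data[:, 0:2], data[:, 2]
X = np.c_[np.ones(y.size), (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)]

theta, J_history = gradient_descent_multi(X, y, np.zeros(3), alpha=0.1, num_iters=2000)
theta_exact, *_ = np.linalg.lstsq(X, y, rcond=None)  # reference least-squares fit
print('gradient descent:', theta)
print('least squares   :', theta_exact)
```

On normalized features the iterates should approach the least-squares solution, which is a convenient way to check an implementation.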

2.8 Plotting the cost over the training iterations

# Plot the convergence graph
plt.figure()
plt.plot(np.arange(J_history.size), J_history)
plt.xlabel('Number of iterations')
plt.ylabel('Cost J')

Text(0,0.5,'Cost J')

2.9 Printing theta after batch gradient descent

# Display gradient descent's result
print('Theta computed from gradient descent : \n{}'.format(theta))

Theta computed from gradient descent :
[340410.91897274 109162.68848142  -6293.24735132]

2.10 Predicting the house price with the learned theta

Normalize the example to predict, using the mu and sigma computed from the training set:

# Estimate the price of a 1650 sq-ft, 3 br house
# ===================== Your Code Here =====================
# Recall that the first column of X is all-ones. Thus, it does
# not need to be normalized.
x_p = np.array([1650, 3])
x_p_nor = (x_p - mu) / sigma

Add the bias unit (1) to the normalized example and predict:

price = np.dot(np.r_[1, x_p_nor], theta)  # dot product of two 1-D arrays gives a scalar

Print the predicted price:

print('Predicted price of a 1650 sq-ft, 3 br house (using gradient descent) : %0.3f' % (price))

Predicted price of a 1650 sq-ft, 3 br house (using gradient descent) : 293142.433

2.11 Computing theta with the normal equation

# Load data
data = np.loadtxt('ex1data2.txt', delimiter=',', dtype=np.int64)
X = data[:, 0:2]
y = data[:, 2]
m = y.size

# Add intercept term to X
X = np.c_[np.ones(m), X]

2.12 Normal equation function

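As long as the number of features is not large, the normal equation is a good alternative for computing the parameters theta. Specifically, when the number of features is below roughly ten thousand, the normal equation is usually preferred over gradient descent. No feature normalization is needed in this case.
The normal equation is:

$\theta = (X^T X)^{-1} X^T y$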

import numpy as np


def normal_eqn(X, y):
    # Closed-form least-squares solution: theta = (X^T X)^(-1) X^T y
    theta = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)

    return theta
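Explicitly inverting X^T X works for this small dataset, but it fails when that matrix is singular and loses precision when it is ill-conditioned. As an optional alternative (not what the article uses), the sketch below solves the same least-squares problem with np.linalg.lstsq and checks that it agrees with the inverse-based version on the ten examples from section 2.4. The name normal_eqn_lstsq is made up for this illustration.

```python
import numpy as np


def normal_eqn(X, y):
    # Version from the article: explicit inverse of X^T X
    return np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)


def normal_eqn_lstsq(X, y):
    # Hypothetical alternative: let NumPy solve the least-squares problem
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta


# Ten examples from section 2.4: size, bedrooms, price
data = np.array([[2104, 3, 399900], [1600, 3, 329900], [2400, 3, 369000],
                 [1416, 2, 232000], [3000, 4, 539900], [1985, 4, 299900],
                 [1534, 3, 314900], [1427, 3, 198999], [1380, 3, 212000],
                 [1494, 3, 242500]], dtype=float)
y = data[:, 2]
X = np.c_[np.ones(y.size), data[:, 0:2]]  # intercept column plus raw features

theta_inv = normal_eqn(X, y)
theta_lstsq = normal_eqn_lstsq(X, y)

# Compare fitted prices rather than raw coefficients
pred_inv = X.dot(theta_inv)
pred_lstsq = X.dot(theta_lstsq)
print('largest difference in fitted prices:', np.max(np.abs(pred_inv - pred_lstsq)))
```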

Compute theta with the normal equation:

theta = normal_eqn(X, y)

# Display normal equation's result
print('Theta computed from the normal equations : \n{}'.format(theta))

Theta computed from the normal equations :
[89597.9095428    139.21067402 -8738.01911233]

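Predicting the price with the theta from the normal equation gives nearly the same result as predicting with the theta from batch gradient descent: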

# Estimate the price of a 1650 sq-ft, 3 br house
# ===================== Your Code Here =====================
price = np.dot(np.array([1, 1650, 3]), theta.T)

# ==========================================================

print('Predicted price of a 1650 sq-ft, 3 br house (using normal equations) : {:0.3f}'.format(price))

Predicted price of a 1650 sq-ft, 3 br house (using normal equations) : 293081.464
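The two theta vectors printed in sections 2.9 and 2.12 look very different even though the predictions nearly match. That is expected: one was fitted on normalized features, the other on raw features. The sketch below illustrates the relationship on the ten examples from section 2.4, using np.linalg.lstsq for both fits so that the comparison does not depend on gradient descent convergence. All variable names are illustrative.

```python
import numpy as np

# Ten examples from section 2.4: size, bedrooms, price
data = np.array([[2104, 3, 399900], [1600, 3, 329900], [2400, 3, 369000],
                 [1416, 2, 232000], [3000, 4, 539900], [1985, 4, 299900],
                 [1534, 3, 314900], [1427, 3, 198999], [1380, 3, 212000],
                 [1494, 3, 242500]], dtype=float)
X_raw, y = data[:, 0:2], data[:, 2]
m = y.size
mu, sigma = X_raw.mean(axis=0), X_raw.std(axis=0)

X_norm = np.c_[np.ones(m), (X_raw - mu) / sigma]  # normalized features + bias
X_plain = np.c_[np.ones(m), X_raw]                # raw features + bias

theta_norm, *_ = np.linalg.lstsq(X_norm, y, rcond=None)
theta_plain, *_ = np.linalg.lstsq(X_plain, y, rcond=None)

# Undo the scaling: slope_j = theta_norm_j / sigma_j,
# intercept = theta_norm_0 - sum_j(slope_j * mu_j)
slopes = theta_norm[1:] / sigma
intercept = theta_norm[0] - np.sum(slopes * mu)
theta_converted = np.r_[intercept, slopes]

# Both parameterizations give the same price for a 1650 sq-ft, 3 br house
x_query = np.array([1650, 3])
price_norm = np.r_[1, (x_query - mu) / sigma].dot(theta_norm)
price_plain = np.r_[1, x_query].dot(theta_plain)
print('theta converted to raw scale:', theta_converted)
print('theta fitted on raw features:', theta_plain)
```

The small gap between the two prices reported in this article is consistent with gradient descent not having fully converged after 400 iterations.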
