
[Machine Learning] Regression Analysis, Overfitting, and Classification

1. Linear Regression

Linear regression is one of the relatively simple models; its prediction is expressed as

$$\hat{y} = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n = \theta^T \mathbf{x}$$
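As a minimal sketch (my own addition, not from the original post), the model can be fit with scikit-learn's LinearRegression on some made-up data; all variable names here are illustrative:

import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative data: y = 4 + 3x plus Gaussian noise
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X[:, 0] + np.random.randn(100)

lin_reg = LinearRegression()
lin_reg.fit(X, y)
print(lin_reg.intercept_, lin_reg.coef_)  # roughly 4 and 3

When the features have been expanded with polynomial terms and scaled (that preprocessing is not shown in this excerpt), training can also be done incrementally with SGDRegressor, keeping the snapshot of the model that achieves the lowest validation error (early stopping):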
from copy import deepcopy
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

# X_train_poly_scaled, y_train, X_val_poly_scaled, y_val are assumed to be
# prepared beforehand (polynomial features + scaling, not shown here).
# Note: newer scikit-learn versions use max_iter instead of n_iter.
sgd_reg = SGDRegressor(n_iter=1, warm_start=True, penalty=None,
                       learning_rate="constant", eta0=0.0005)

minimum_val_error = float("inf")
best_epoch = None
best_model = None
for epoch in range(1000):
    sgd_reg.fit(X_train_poly_scaled, y_train)  # continues where it left off
    y_val_predict = sgd_reg.predict(X_val_poly_scaled)
    val_error = mean_squared_error(y_val, y_val_predict)
    if val_error < minimum_val_error:
        minimum_val_error = val_error
        best_epoch = epoch
        best_model = deepcopy(sgd_reg)  # deepcopy keeps the fitted weights; clone() would discard them
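After the loop, best_model holds the snapshot with the lowest validation error and best_epoch records when it was reached, which is effectively where training should stop before the model starts to overfit.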


5. Logistic Regression (can be used for classification)

The model estimates a probability with the sigmoid function, so the output lies in (0, 1):

$$\hat{p} = \sigma(\theta^T \mathbf{x}) = \frac{1}{1 + e^{-\theta^T \mathbf{x}}}$$


Define the cost function. Since $\hat{p}$ lies in (0, 1), its logarithm is negative, so a minus sign is placed in front to keep the cost positive. The larger $\hat{p}$ is for a positive instance, the smaller the cost, i.e. the more correct the prediction:

$$c(\theta) = \begin{cases} -\log(\hat{p}) & \text{if } y = 1 \\ -\log(1 - \hat{p}) & \text{if } y = 0 \end{cases}$$

Averaged over the whole training set this gives the log loss:

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log\left(\hat{p}^{(i)}\right) + \left(1 - y^{(i)}\right) \log\left(1 - \hat{p}^{(i)}\right) \right]$$
There is no closed-form solution for this cost function, so it is minimized with gradient descent using the partial derivatives:

$$\frac{\partial}{\partial \theta_j} J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( \sigma\left(\theta^T \mathbf{x}^{(i)}\right) - y^{(i)} \right) x_j^{(i)}$$
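As a rough illustration (my own addition), a few lines of NumPy are enough to run batch gradient descent on this loss; the toy data and learning rate below are made up:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: X includes a bias column of ones
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.0]])
y = np.array([0, 0, 1, 1])

theta = np.zeros(X.shape[1])
eta = 0.1   # learning rate (arbitrary choice)
m = len(y)

for _ in range(1000):
    p_hat = sigmoid(X @ theta)         # predicted probabilities
    gradient = X.T @ (p_hat - y) / m   # partial derivatives of the log loss
    theta -= eta * gradient

print(theta)               # learned parameters
print(sigmoid(X @ theta))  # probabilities move toward the labels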

Take the classification of Iris flower species as an example:

import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets

iris = datasets.load_iris()
print(list(iris.keys()))
# ['DESCR', 'data', 'target', 'target_names', 'feature_names']
X = iris["data"][:, 3:]                # petal width
y = (iris["target"] == 2).astype(int)  # 1 if Iris-Virginica, else 0

from sklearn.linear_model import LogisticRegression
log_reg = LogisticRegression()
log_reg.fit(X, y)
X_new = np.linspace(0, 3, 1000).reshape(-1, 1)
# estimated probabilities for flowers with petal widths varying from 0 to 3 cm:
y_proba = log_reg.predict_proba(X_new)

plt.plot(X_new, y_proba[:, 1], "g-", label="Iris-Virginica")
plt.plot(X_new, y_proba[:, 0], "b--", label="Not Iris-Virginica")
plt.show()
# + more Matplotlib code to make the image look pretty
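As a quick check (my own addition; the exact threshold is approximate), querying the model on either side of the point where the two curves cross shows that the decision boundary lies at roughly 1.6 cm of petal width:

print(log_reg.predict([[1.7], [1.5]]))  # expected: [1 0]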




6. Softmax Regression

Softmax Regression can be used for multiclass classification. Each class k gets its own score $s_k(\mathbf{x}) = \theta_k^T \mathbf{x}$, and the softmax function turns the scores into probabilities:

$$\hat{p}_k = \frac{\exp\left(s_k(\mathbf{x})\right)}{\sum_{j=1}^{K} \exp\left(s_j(\mathbf{x})\right)}$$
Training minimizes the cross-entropy cost:

$$J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} y_k^{(i)} \log\left(\hat{p}_k^{(i)}\right)$$
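A small NumPy sketch of these two formulas (my own illustration, with made-up scores and one-hot labels):

import numpy as np

def softmax(scores):
    # subtract the row-wise max for numerical stability; the result is unchanged
    exps = np.exp(scores - scores.max(axis=1, keepdims=True))
    return exps / exps.sum(axis=1, keepdims=True)

# Illustrative scores s_k(x) for 2 instances and 3 classes
scores = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
p_hat = softmax(scores)

# One-hot targets y_k for the same 2 instances
y = np.array([[1, 0, 0],
              [0, 1, 0]])

cross_entropy = -np.mean(np.sum(y * np.log(p_hat), axis=1))
print(p_hat)
print(cross_entropy)

With multi_class="multinomial", scikit-learn's LogisticRegression minimizes this same cross-entropy loss: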

X = iris["data"][:, (2, 3)]  # petal length, petal width
y = iris["target"]
softmax_reg = LogisticRegression(multi_class="multinomial", solver="lbfgs", C=10)
softmax_reg.fit(X, y)

print(softmax_reg.predict([[5, 2]]))
# [2]
print(softmax_reg.predict_proba([[5, 2]]))
# [[6.33134078e-07 5.75276067e-02 9.42471760e-01]]
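So for a flower with a 5 cm petal length and a 2 cm petal width, the model predicts class 2 (Iris-Virginica) with an estimated probability of about 94.2%; the other two entries of predict_proba are the probabilities of the remaining classes.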