
scikit-learn Linear Models 1.1.10: Logistic Regression

Logistic regression, despite its name, is a linear model for classification rather than regression. It is also known in the literature as logit regression, maximum-entropy classification (MaxEnt) or the log-linear classifier. In this model, the probabilities describing the possible outcomes of a single trial are modeled using a logistic function.

The implementation of logistic regression in scikit-learn is the LogisticRegression class. It can fit multiclass logistic regression with an optional L2 or L1 regularization term.
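A minimal sketch of this basic usage, assuming the iris dataset and illustrative parameter values (neither of which comes from the original text):

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# L2-penalized (the default) multiclass logistic regression
clf_l2 = LogisticRegression(penalty='l2', C=1.0).fit(X, y)

# L1-penalized variant; the "liblinear" solver supports the L1 penalty
clf_l1 = LogisticRegression(penalty='l1', solver='liblinear', C=1.0).fit(X, y)

print(clf_l2.predict(X[:3]))
print(clf_l1.coef_)        # with L1, many coefficients may be exactly zero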

As an optimization problem, binary-class L2-penalized logistic regression minimizes the following cost function:

\min_{w, c} \frac{1}{2} w^T w + C \sum_{i=1}^{n} \log\left(\exp\left(-y_i (X_i^T w + c)\right) + 1\right)

Similarly, L1-regularized logistic regression solves the following optimization problem:

\min_{w, c} \|w\|_1 + C \sum_{i=1}^{n} \log\left(\exp\left(-y_i (X_i^T w + c)\right) + 1\right)

The solvers implemented in the LogisticRegression class are "liblinear" (a wrapper around the C++ library LIBLINEAR), "newton-cg", "lbfgs" and "sag".

The "lbfgs" and "newton-cg" solvers only support the L2 penalty, and they converge fast on some high-dimensional data. The L1 penalty yields sparse prediction weights.

The "liblinear" solver uses a coordinate descent (CD) algorithm based on Liblinear. For the L1 penalty, sklearn.svm.l1_min_c allows computing the lower bound of C so as to obtain a non-"null" model (one whose feature weights are not all zero). It relies on the excellent LIBLINEAR library, which ships with scikit-learn. However, the CD algorithm implemented in liblinear cannot learn a true multinomial (multiclass) model; instead, the optimization problem is decomposed in a "one-vs-rest" fashion, so separate binary classifiers are trained for all classes. Since this happens under the hood, LogisticRegression instances using this solver behave as multiclass classifiers out of the box.
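A short, hedged sketch of how l1_min_c can be used to find that lower bound (the binary iris subset here is an illustrative assumption; the regularization-path example at the end of this section uses the same helper):

from sklearn.datasets import load_iris
from sklearn.svm import l1_min_c

X, y = load_iris(return_X_y=True)
X, y = X[y != 2], y[y != 2]   # keep two classes to get a binary problem

# Smallest C for which an L1-penalized logistic regression model
# has at least one non-zero coefficient (i.e. is not a "null" model)
print(l1_min_c(X, y, loss='log'))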

Setting multi_class to "multinomial" with the "lbfgs" or "newton-cg" solver in LogisticRegression learns a true multinomial logistic regression model, which means that its probability estimates should be better calibrated than with the default "one-vs-rest" setting. The "lbfgs", "newton-cg" and "sag" solvers cannot optimize L1-penalized models, though, so the "multinomial" setting does not learn sparse models.
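A hedged sketch of the "multinomial" setting, assuming the iris dataset and an illustrative C value:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# A true multinomial (softmax) model; requires "lbfgs", "newton-cg" or "sag"
clf = LogisticRegression(multi_class='multinomial', solver='lbfgs', C=1.0)
clf.fit(X, y)

# Probabilities come from a single softmax over all classes,
# not from separate one-vs-rest binary models
print(clf.predict_proba(X[:3]))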

The "sag" solver uses Stochastic Average Gradient descent [3]. It does not handle the "multinomial" case and is limited to L2-penalized models, yet it is often faster than the other solvers on very large datasets, when both the number of samples and the number of features are large.

In a nutshell, the solver can be chosen according to the following rules:

Case                          Solver
Small dataset or L1 penalty   "liblinear"
Multinomial loss              "lbfgs" or "newton-cg"
Large dataset                 "sag"
For very large datasets you may also consider using SGDClassifier with the "log" loss.
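A rough sketch of that alternative, assuming the digits dataset for illustration (note that recent scikit-learn versions spell this loss 'log_loss' rather than 'log'):

from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier

X, y = load_digits(return_X_y=True)

# Logistic regression fitted by stochastic gradient descent;
# useful when the dataset is too large for the batch solvers
clf = SGDClassifier(loss='log', penalty='l2')
clf.fit(X, y)
print(clf.score(X, y))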


Examples:
L1 Penalty and Sparsity in Logistic Regression
Path with L1 - Logistic Regression

Differences from liblinear:

There might be a difference in the scores obtained between LogisticRegression with solver=liblinear or LinearSVC and the external liblinear library directly, when fit_intercept=False and the fitted coef_ (or) the data to be predicted are zeroes. This is because for the sample(s) with decision_function zero, LogisticRegression and LinearSVC predict the negative class, while liblinear predicts the positive class. Note that a model with fit_intercept=False and many samples with decision_function zero is likely to be an underfit, bad model, and you are advised to set fit_intercept=True and increase the intercept_scaling.

Note

Feature selection with sparse logistic regression

A logistic regression with an L1 penalty yields sparse models, and can thus be used to perform feature selection, as detailed in L1-based feature selection.
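One hedged way to do this is to combine an L1-penalized LogisticRegression with SelectFromModel; the digits dataset and the C value below are illustrative assumptions:

from sklearn.datasets import load_digits
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression

X, y = load_digits(return_X_y=True)

# The L1 penalty drives many coefficients to exactly zero,
# so the corresponding features can be discarded
lr = LogisticRegression(penalty='l1', solver='liblinear', C=0.1).fit(X, y)
selector = SelectFromModel(lr, prefit=True)
X_reduced = selector.transform(X)
print(X.shape, '->', X_reduced.shape)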

LogisticRegressionCV implements logistic regression with built-in cross-validation to find the optimal C parameter. The "newton-cg", "sag" and "lbfgs" solvers are found to be faster for high-dimensional dense data, due to warm-starting. For the multiclass case, if the multi_class option is set to "ovr", an optimal C is obtained for each class; if the multi_class option is set to "multinomial", an optimal C is obtained by minimizing the cross-entropy loss.
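A minimal sketch of LogisticRegressionCV, where the iris data, the grid of 10 C values and 5-fold cross-validation are illustrative choices rather than part of the original text:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegressionCV

X, y = load_iris(return_X_y=True)

# Try 10 values of C with 5-fold cross-validation and keep the best one
clf = LogisticRegressionCV(Cs=10, cv=5, solver='lbfgs', multi_class='ovr')
clf.fit(X, y)
print(clf.C_)          # best C found for each class (with "ovr")
print(clf.score(X, y))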

Examples:
L1 penalty and sparsity in logistic regression

Comparison of the sparsity (percentage of zero coefficients) of solutions when L1 and L2 penalties are used for different values of C. We can see that large values of C give more freedom to the model; conversely, smaller values of C constrain the model more. In the L1-penalty case, this leads to sparse solutions.

We classify the 8x8 digit images into two classes: 0-4 versus 5-9. The visualization shows the coefficients of the models for varying C.

Script output:

C=100.00
Sparsity with L1 penalty: 6.25%
score with L1 penalty: 0.9104
Sparsity with L2 penalty: 4.69%
score with L2 penalty: 0.9098
C=1.00
Sparsity with L1 penalty: 10.94%
score with L1 penalty: 0.9098
Sparsity with L2 penalty: 4.69%
score with L2 penalty: 0.9093
C=0.01
Sparsity with L1 penalty: 85.94%
score with L1 penalty: 0.8614
Sparsity with L2 penalty: 4.69%
score with L2 penalty: 0.8915


Python source code: 
plot_logistic_l1_l2_sparsity.py


print(__doc__)

# Authors: Alexandre Gramfort <alexandre.gramfort@inria.fr>
#          Mathieu Blondel <mathieu@mblondel.org>
#          Andreas Mueller <amueller@ais.uni-bonn.de>
# License: BSD 3 clause

import numpy as np
import matplotlib.pyplot as plt

from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.preprocessing import StandardScaler

digits = datasets.load_digits()

X, y = digits.data, digits.target
X = StandardScaler().fit_transform(X)

# classify small against large digits
y = (y > 4).astype(int)

# Set regularization parameter
for i, C in enumerate((100, 1, 0.01)):
    # turn down tolerance for short training time
    clf_l1_LR = LogisticRegression(C=C, penalty='l1', tol=0.01)
    clf_l2_LR = LogisticRegression(C=C, penalty='l2', tol=0.01)
    clf_l1_LR.fit(X, y)
    clf_l2_LR.fit(X, y)

    coef_l1_LR = clf_l1_LR.coef_.ravel()
    coef_l2_LR = clf_l2_LR.coef_.ravel()

    # coef_l1_LR contains zeros due to the
    # L1 sparsity inducing norm

    sparsity_l1_LR = np.mean(coef_l1_LR == 0) * 100
    sparsity_l2_LR = np.mean(coef_l2_LR == 0) * 100

    print("C=%.2f" % C)
    print("Sparsity with L1 penalty: %.2f%%" % sparsity_l1_LR)
    print("score with L1 penalty: %.4f" % clf_l1_LR.score(X, y))
    print("Sparsity with L2 penalty: %.2f%%" % sparsity_l2_LR)
    print("score with L2 penalty: %.4f" % clf_l2_LR.score(X, y))

    l1_plot = plt.subplot(3, 2, 2 * i + 1)
    l2_plot = plt.subplot(3, 2, 2 * (i + 1))
    if i == 0:
        l1_plot.set_title("L1 penalty")
        l2_plot.set_title("L2 penalty")

    l1_plot.imshow(np.abs(coef_l1_LR.reshape(8, 8)), interpolation='nearest',
                   cmap='binary', vmax=1, vmin=0)
    l2_plot.imshow(np.abs(coef_l2_LR.reshape(8, 8)), interpolation='nearest',
                   cmap='binary', vmax=1, vmin=0)
    plt.text(-8, 3, "C = %.2f" % C)

    l1_plot.set_xticks(())
    l1_plot.set_yticks(())
    l2_plot.set_xticks(())
    l2_plot.set_yticks(())

plt.show()


Path with L1 - Logistic Regression

Computes path on IRIS dataset.



Script output:

Computing regularization path ...
This took  0:00:00.147946


Python source code: 
plot_logistic_path.py


print(__doc__)

# Author: Alexandre Gramfort <alexandre.gramfort@inria.fr>
# License: BSD 3 clause

from datetime import datetime
import numpy as np
import matplotlib.pyplot as plt

from sklearn import linear_model
from sklearn import datasets
from sklearn.svm import l1_min_c

iris = datasets.load_iris()
X = iris.data
y = iris.target

X = X[y != 2]
y = y[y != 2]

X -= np.mean(X, 0)

###############################################################################
# Demo path functions

cs = l1_min_c(X, y, loss='log') * np.logspace(0, 3)

print("Computing regularization path ...")
start = datetime.now()
clf = linear_model.LogisticRegression(C=1.0, penalty='l1', tol=1e-6)
coefs_ = []
for c in cs:
    clf.set_params(C=c)
    clf.fit(X, y)
    coefs_.append(clf.coef_.ravel().copy())
print("This took ", datetime.now() - start)

coefs_ = np.array(coefs_)
plt.plot(np.log10(cs), coefs_)
ymin, ymax = plt.ylim()
plt.xlabel('log(C)')
plt.ylabel('Coefficients')
plt.title('Logistic Regression Path')
plt.axis('tight')
plt.show()