
Classification Problems: Logistic Regression

2018-03-16 17:45
Logistic regression (LR) maps the output of a linear model into (0, 1) through the sigmoid function: a value above 0.5 is predicted as positive, below 0.5 as negative. The decision boundary is determined by the parameters found by gradient-based optimization.

Why does LR use the sigmoid function?

LR assumes the label follows a Bernoulli distribution. Writing the Bernoulli distribution in exponential-family form, the sigmoid emerges as the canonical response function; equivalently, the posterior probability for this exponential-family model takes the sigmoid form. Two practical properties also matter:
1. The sigmoid is continuous and monotonically increasing.
2. Its derivative is cheap to compute: p' = p * (1 − p), where p is the sigmoid value itself.
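As a quick sanity check on property 2, the identity σ'(x) = σ(x)(1 − σ(x)) can be verified numerically; a minimal sketch comparing it against a central finite difference:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1 + np.exp(-x))

x = np.linspace(-5, 5, 11)
# derivative via the closed-form identity
analytic = sigmoid(x) * (1 - sigmoid(x))
# central finite-difference approximation of the derivative
eps = 1e-6
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
print(np.max(np.abs(analytic - numeric)))  # tiny, on the order of floating-point error
```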
The LR loss function (the negative log-likelihood, i.e. the cross-entropy) can be written as:

J(θ) = -(1/m) · Σᵢ [ yᵢ · log hθ(xᵢ) + (1 − yᵢ) · log(1 − hθ(xᵢ)) ],   where hθ(x) = sigmoid(θᵀx)
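A minimal sketch of evaluating this loss with NumPy (the labels and predicted probabilities below are invented for illustration):

```python
import numpy as np

def cross_entropy(y, h):
    # negative log-likelihood, averaged over the m samples
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

y = np.array([1, 0, 1, 1])          # true labels
h = np.array([0.9, 0.2, 0.8, 0.6])  # predicted probabilities h = sigmoid(theta^T x)
print(cross_entropy(y, h))          # lower is better; confident correct predictions cost little
```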
Python code for LR (batch gradient ascent on the log-likelihood) is as follows:

import numpy as np
import matplotlib.pyplot as plt

def loadDataSet():
    data = np.loadtxt("testSet.txt")
    dataMat = data[:, :-1]
    b = np.ones(len(dataMat))
    dataMat = np.c_[b, dataMat]  # prepend a column of ones for the intercept term
    labelMat = data[:, -1]
    return dataMat, labelMat

def sigmoid(inX):
    return 1.0 / (1 + np.exp(-inX))

def gradAscent(dataMatIN, classLabels):  # dataMatIN is a 100x3 matrix; its first column is all ones
    dataMatIN = np.mat(dataMatIN)
    labelMat = np.mat(classLabels).transpose()  # turn the label vector into a column vector
    m, n = np.shape(dataMatIN)
    alpha = 0.001    # learning rate
    maxCycles = 500  # number of iterations
    weights = np.ones((n, 1))
    for k in range(maxCycles):
        h = sigmoid(dataMatIN * weights)  # h = sigmoid(theta0 + theta1*x1 + theta2*x2)
        error = labelMat - h
        weights = weights + alpha * dataMatIN.transpose() * error
    return weights

def plotBestFit(weights):
    dataMat, labelMat = loadDataSet()
    n = np.shape(dataMat)[0]
    xcord1, ycord1 = [], []
    xcord2, ycord2 = [], []
    for i in range(n):
        if labelMat[i] == 1:
            xcord1.append(dataMat[i][1])
            ycord1.append(dataMat[i][2])
        else:
            xcord2.append(dataMat[i][1])
            ycord2.append(dataMat[i][2])
    plt.scatter(xcord1, ycord1, c="r")
    plt.scatter(xcord2, ycord2, c="g")
    x = np.arange(-3.0, 3.0, 0.1)
    # decision boundary: theta0 + theta1*x + theta2*y = 0, solved for y
    y = (-weights[0] - weights[1] * x) / weights[2]
    plt.plot(x, y)
    plt.show()

if __name__ == "__main__":
    dataMat, labelMat = loadDataSet()
    weights = gradAscent(dataMat, labelMat)
    print(weights)
    plotBestFit(weights.getA())  # getA() converts the np.mat result back to an ndarray
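If testSet.txt is not at hand, gradAscent can still be exercised end to end on synthetic data; a minimal sketch, with two invented, well-separated Gaussian clusters standing in for the file:

```python
import numpy as np

def sigmoid(inX):
    return 1.0 / (1 + np.exp(-inX))

def gradAscent(dataMatIN, classLabels):
    dataMatIN = np.mat(dataMatIN)
    labelMat = np.mat(classLabels).transpose()
    m, n = np.shape(dataMatIN)
    alpha = 0.001
    maxCycles = 500
    weights = np.ones((n, 1))
    for k in range(maxCycles):
        h = sigmoid(dataMatIN * weights)
        error = labelMat - h
        weights = weights + alpha * dataMatIN.transpose() * error
    return weights

rng = np.random.default_rng(0)
# two separable clusters, 50 points each (toy data, not the article's testSet.txt)
neg = rng.normal(loc=[-2, -2], scale=0.5, size=(50, 2))
pos = rng.normal(loc=[2, 2], scale=0.5, size=(50, 2))
X = np.vstack([neg, pos])
X = np.c_[np.ones(len(X)), X]  # prepend the intercept column, as loadDataSet does
y = np.hstack([np.zeros(50), np.ones(50)])

w = gradAscent(X, y)
preds = (sigmoid(np.mat(X) * w) > 0.5).astype(int).A1
accuracy = np.mean(preds == y)
print(accuracy)  # near-perfect on this well-separated toy set
```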
Tags: logistic regression