Classification Problems: Logistic Regression
2018-03-16 17:45
Logistic regression maps the output of a linear model into (0, 1) through the sigmoid function: a sigmoid value greater than 0.5 is predicted as positive, and a value less than 0.5 as negative. The decision boundary is determined by the parameters obtained via gradient-based optimization.
Why does LR use the sigmoid function?
LR assumes the label follows a Bernoulli distribution; writing the Bernoulli distribution in exponential-family form, the sigmoid function falls out of the derivation.
The posterior probability derived from the exponential-family form takes the shape of the sigmoid function.
1. The sigmoid function is continuous and monotonically increasing.
2. Its derivative is very cheap to compute: p′ = p(1 − p).
The LR loss function (the negative log-likelihood) can be written as:
J(θ) = −(1/m) Σᵢ [ yᵢ log hθ(xᵢ) + (1 − yᵢ) log(1 − hθ(xᵢ)) ],  where hθ(x) = sigmoid(θᵀx)
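The derivative identity in point 2 above can be checked numerically; a minimal sketch comparing the closed form p′ = p(1 − p) against a central-difference approximation (the `sigmoid` helper is the same form used in the code below):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# numerical derivative of sigmoid via central differences
x = np.linspace(-5, 5, 11)
eps = 1e-6
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)

# closed form: p' = p * (1 - p)
p = sigmoid(x)
analytic = p * (1 - p)

print(np.max(np.abs(numeric - analytic)))  # maximum discrepancy is tiny
```

This is why the sigmoid is convenient in gradient computations: the derivative reuses the forward value p, with no extra exponentials.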
Python code for LR (gradient ascent on the log-likelihood, which is equivalent to gradient descent on the loss above) follows:

import numpy as np
import matplotlib.pyplot as plt


def loadDataSet():
    data = np.loadtxt("testSet.txt")
    dataMat = data[:, :-1]
    b = np.ones(len(dataMat))
    dataMat = np.c_[b, dataMat]  # prepend a column of ones (bias term)
    labelMat = data[:, -1]
    return dataMat, labelMat


def sigmoid(inX):
    return 1.0 / (1 + np.exp(-inX))


def gradAscent(dataMatIN, classLabels):  # dataMatIN is a 100x3 matrix; its first column is all ones
    dataMatIN = np.mat(dataMatIN)
    labelMat = np.mat(classLabels).transpose()  # turn the label vector into a column vector
    m, n = np.shape(dataMatIN)
    alpha = 0.001    # learning rate
    maxCycles = 500  # number of iterations
    weights = np.ones((n, 1))
    for k in range(maxCycles):
        h = sigmoid(dataMatIN * weights)  # sigmoid(theta0 + theta1*x1 + theta2*x2)
        error = labelMat - h
        weights = weights + alpha * dataMatIN.transpose() * error
    return weights


def plotBestFit(weights):
    dataMat, labelMat = loadDataSet()
    n = np.shape(dataMat)[0]
    xcord1, ycord1 = [], []
    xcord2, ycord2 = [], []
    for i in range(n):
        if labelMat[i] == 1:
            xcord1.append(dataMat[i][1])
            ycord1.append(dataMat[i][2])
        else:
            xcord2.append(dataMat[i][1])
            ycord2.append(dataMat[i][2])
    plt.scatter(xcord1, ycord1, c="r")
    plt.scatter(xcord2, ycord2, c="g")
    x = np.arange(-3.0, 3.0, 0.1)
    y = (-weights[0] - weights[1] * x) / weights[2]  # decision boundary: theta^T x = 0
    plt.plot(x, y)
    plt.show()


if __name__ == "__main__":
    dataMat, labelMat = loadDataSet()
    weights = gradAscent(dataMat, labelMat)
    print(weights)
    plotBestFit(weights.getA())
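Once trained, the weights classify a new point by thresholding the sigmoid at 0.5, exactly as described at the top of the post. A minimal end-to-end sketch of the same update rule on synthetic data (the `testSet.txt` file is not reproduced here, so the data below is made up purely for illustration):

```python
import numpy as np

def sigmoid(inX):
    return 1.0 / (1 + np.exp(-inX))

# synthetic two-class data: class decided by x1 + x2 > 0 (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
Xb = np.c_[np.ones(len(X)), X]  # prepend bias column, as in loadDataSet

# gradient ascent on the log-likelihood (same update as gradAscent above)
w = np.ones(3)
alpha = 0.001
for _ in range(2000):
    h = sigmoid(Xb @ w)
    w += alpha * Xb.T @ (y - h)

# predict by thresholding the sigmoid at 0.5
preds = (sigmoid(Xb @ w) > 0.5).astype(float)
print((preds == y).mean())  # training accuracy; high on this separable data
```

Note this sketch uses plain ndarrays and `@` instead of `np.mat`, which is the more idiomatic NumPy style today; the update `w += alpha * Xb.T @ (y - h)` is the same expression as in `gradAscent`.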