
Gradient Derivation for Logistic Regression Parameters

2015-04-14 18:44


Let $\hat w = [w^T\ b]^T$, folding the bias into the weight vector (with each $x^{(i)}$ correspondingly augmented by a constant 1). The log-likelihood to be maximized is

$$l(\hat w)=\sum_{i=1}^M y^{(i)}\log\sigma(\hat w x^{(i)})+(1-y^{(i)})\log\bigl(1-\sigma(\hat w x^{(i)})\bigr).$$

Differentiating is an application of the chain rule; the key fact is that the sigmoid's derivative is $\sigma'(x)=\sigma(x)(1-\sigma(x))$. The gradient with respect to the $j$-th component of $w$ is therefore

$$\begin{aligned}
\nabla w_j &= \sum_{i=1}^M y^{(i)} \frac{1}{\sigma(\hat w x^{(i)})}\,\sigma(\hat w x^{(i)})\bigl(1-\sigma(\hat w x^{(i)})\bigr)x_j^{(i)}
+(1-y^{(i)})\frac{-1}{1-\sigma(\hat w x^{(i)})}\,\sigma(\hat w x^{(i)})\bigl(1-\sigma(\hat w x^{(i)})\bigr)x_j^{(i)} \\
&= \sum_{i=1}^M y^{(i)}\bigl(1-\sigma(\hat w x^{(i)})\bigr)x_j^{(i)}+(y^{(i)}-1)\,\sigma(\hat w x^{(i)})\,x_j^{(i)} \\
&= \sum_{i=1}^M \bigl(y^{(i)}-\sigma(\hat w x^{(i)})\bigr)x_j^{(i)}.
\end{aligned}$$

Since we are maximizing the likelihood, the update is gradient *ascent* on the $j$-th component: $w_j \leftarrow w_j + \nabla w_j$ (a learning rate can be placed in front of the gradient). In the stochastic version, each step uses a single sample: $w_j \leftarrow w_j + (y^{(i)}-\sigma(\hat w x^{(i)}))x_j^{(i)}$. In the batch version, the entire gradient vector can be written in matrix form, giving the update $\hat w \leftarrow \hat w + X^T\bigl[Y-\sigma(X\hat w^{(\mathrm{old})})\bigr]$.
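The closed-form gradient $X^T[Y-\sigma(X\hat w)]$ and the batch ascent update can be sketched numerically. This is a minimal illustration, not code from the original post: the toy data, the learning rate `lr`, and all function names are assumptions, and the analytic gradient is cross-checked against finite differences of the log-likelihood.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(w_hat, X, y):
    # l(w) = sum_i y*log(sigma) + (1-y)*log(1-sigma)
    p = sigmoid(X @ w_hat)
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def gradient(w_hat, X, y):
    # closed form derived above: X^T (y - sigma(X w))
    return X.T @ (y - sigmoid(X @ w_hat))

# toy data (assumed): X augmented with a bias column of ones
rng = np.random.default_rng(0)
X = np.hstack([rng.normal(size=(100, 2)), np.ones((100, 1))])
true_w = np.array([1.5, -2.0, 0.5])
y = (sigmoid(X @ true_w) > rng.uniform(size=100)).astype(float)

# batch gradient *ascent* on the log-likelihood;
# lr is divided by the sample count to keep steps small
w_hat = np.zeros(3)
lr = 0.1
for _ in range(200):
    w_hat += (lr / len(y)) * gradient(w_hat, X, y)

# sanity check: analytic gradient vs. central finite differences
eps = 1e-6
num_grad = np.array([
    (log_likelihood(w_hat + eps * e, X, y)
     - log_likelihood(w_hat - eps * e, X, y)) / (2 * eps)
    for e in np.eye(3)
])
assert np.allclose(num_grad, gradient(w_hat, X, y), atol=1e-4)
```

Because the log-likelihood is concave in $\hat w$, gradient ascent with a small enough step converges to the global maximum; the finite-difference check confirms that the derivation above matches the numerical slope.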