您的位置:首页 > 其它

吴恩达Machine Learning week 3 review答案: Logistic Regression

2017-09-25 10:50 2823 查看
[Y]Our estimate for P(y=0|x;θ) is
0.6.

[Y]Our estimate for P(y=1|x;θ) is
0.4.

Our estimate for P(y=1|x;θ) is
0.6.

Our estimate for P(y=0|x;θ) is
0.4.

Which of the following are true? Check all that apply.

J(θ) will be a convex function, so gradient descent should converge to the global minimum.

[b]【right】 Adding polynomial features (e.g., instead using hθ(x)=g(θ0+θ1x1+θ2x2+θ3x21+θ4x1x2+θ5x22) ) could increase how well we can fit the training data.[/b]

【right】The positive and negative examples cannot be separated using a straight line. So, gradient descent will fail to converge.

 WRONG Because the positive and negative examples cannot be separated using a straight line, linear regression will perform as well as logistic regression on this data.

θj:=θj−α1m∑mi=1(hθ(x(i))−y(i))x(i) (simultaneously
update for all j).

[Y]
θj:=θj−α1m∑mi=1(11+e−θTx(i)−y(i))x(i)j (simultaneously
update for all j).

θ:=θ−α1m∑mi=1(θTx−y(i))x(i).

θj:=θj−α1m∑mi=1(hθ(x(i))−y(i))x(i)j (simultaneously
update for all j).

[Y]The cost function J(θ) for
logistic regression trained with m≥1 examples
is always greater than or equal to zero.

For logistic regression, sometimes gradient descent will converge to a local minimum (and fail to find the global minimum). This is the reason we prefer more advanced optimization algorithms such as fminunc (conjugate gradient/BFGS/L-BFGS/etc).

Since we train one classifier when there are two classes, we train two classifiers when there are three classes (and we do one-vs-all classification).

[Y]The one-vs-all technique allows you to use logistic regression for problems in which each y(i) comes
from a fixed, discrete set of values.

Figure:



Figure:



Figure:



[Y]Figure:

内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: