
[LA] Lipschitz continuous gradient

2016-01-19 07:15
Definitions

Summary

1. Definitions

Suppose $\nabla f(x)$ is $L$-Lipschitz continuous. This property can be stated in the following ways:

(1) $\|\nabla f(x) - \nabla f(y)\|_2 \le L \|x - y\|_2$ (note that this does not assume convexity of $f$)

(2) $\frac{L}{2} x^T x - f(x)$ is convex (if $\mathrm{dom}(f)$ is convex)

(3) $\nabla^2 f(x) \preceq L \cdot I$ (if $f$ is twice differentiable)

(4) $f(y) \le f(x) + \nabla f(x)^T (y - x) + \frac{L}{2} \|y - x\|_2^2$ (if $f$ is convex)
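As a quick numerical sanity check of conditions (1) and (4), the sketch below samples random pairs of points for one concrete convex function. The choice of the softplus function $f(x) = \log(1 + e^x)$ is an assumption made for illustration and does not come from the text above. Its derivative is the logistic sigmoid and its second derivative is at most $1/4$, so $L = 1/4$. The helper name `check_conditions` is likewise hypothetical.

```python
import math
import random

L = 0.25  # assumed example: sup of f''(x) for softplus is 1/4


def f(x):
    # Softplus log(1 + e^x), written in a numerically stable form.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def grad(x):
    # Derivative of softplus: the logistic sigmoid, computed stably.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def check_conditions(num_pairs=1000, seed=0, tol=1e-12):
    """Return True if (1) and (4) hold on all sampled pairs (up to tol)."""
    rng = random.Random(seed)
    for _ in range(num_pairs):
        x = rng.uniform(-10.0, 10.0)
        y = rng.uniform(-10.0, 10.0)
        # Condition (1): |f'(x) - f'(y)| <= L |x - y|
        if abs(grad(x) - grad(y)) > L * abs(x - y) + tol:
            return False
        # Condition (4): quadratic upper bound around x
        upper = f(x) + grad(x) * (y - x) + 0.5 * L * (y - x) ** 2
        if f(y) > upper + tol:
            return False
    return True


if __name__ == "__main__":
    print(check_conditions())
```

A random check like this cannot prove the inequalities, but a single failing pair would indicate that the chosen $L$ is too small.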

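(i) (1) to (2): Recall the first-order characterization of convexity:

a differentiable $f$ is convex iff $\mathrm{dom}(f)$ is convex and $(\nabla f(x) - \nabla f(y))^T (x - y) \ge 0$ for all $x, y$.

Applying this to $\frac{L}{2} x^T x - f(x)$, whose gradient is $L x - \nabla f(x)$, we just need to prove that

$[(L x - \nabla f(x)) - (L y - \nabla f(y))]^T (x - y) \ge 0$

By the Cauchy-Schwarz inequality and then (1),

$[(L x - \nabla f(x)) - (L y - \nabla f(y))]^T (x - y) = L \|x - y\|_2^2 - (\nabla f(x) - \nabla f(y))^T (x - y) \ge L \|x - y\|_2^2 - \|\nabla f(x) - \nabla f(y)\|_2 \|x - y\|_2 \ge L \|x - y\|_2^2 - L \|x - y\|_2^2 = 0$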
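The monotonicity inequality in this step does not need $f$ to be convex. As a small illustration, the sketch below uses the non-convex function $f(x) = \sin(x)$, whose derivative $\cos(x)$ is 1-Lipschitz. Both this example function and the helper name `min_monotonicity_gap` are assumptions introduced here. The code evaluates $[(L x - f'(x)) - (L y - f'(y))](x - y)$ on random pairs and reports the smallest value found.

```python
import math
import random

L = 1.0  # assumed example: cos is 1-Lipschitz, so L = 1 for f(x) = sin(x)


def grad_f(x):
    # Derivative of the (non-convex) example function f(x) = sin(x).
    return math.cos(x)


def grad_g(x):
    # Derivative of g(x) = (L/2) x^2 - f(x), the function in condition (2).
    return L * x - grad_f(x)


def min_monotonicity_gap(num_pairs=1000, seed=0):
    """Smallest sampled value of (g'(x) - g'(y)) * (x - y)."""
    rng = random.Random(seed)
    worst = math.inf
    for _ in range(num_pairs):
        x = rng.uniform(-10.0, 10.0)
        y = rng.uniform(-10.0, 10.0)
        gap = (grad_g(x) - grad_g(y)) * (x - y)
        worst = min(worst, gap)
    return worst


if __name__ == "__main__":
    # A non-negative result (up to rounding) is consistent with g being convex.
    print(min_monotonicity_gap())
```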

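(ii) (2) to (3):

Recall the second-order characterization of convexity:

a twice differentiable $f$ is convex iff $\nabla^2 f(x) \succeq 0$ for all $x$ in its (convex) domain.

Since $\frac{L}{2} x^T x - f(x)$ is convex by (2),

$\nabla^2 \left( \frac{L}{2} x^T x - f(x) \right) = L \cdot I - \nabla^2 f(x) \succeq 0$

So

$\nabla^2 f(x) \preceq L \cdot I$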

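(iii) (3) to (4) ($f$ does not need to be convex):

From (3), $L \cdot I - \nabla^2 f(x)$ is a positive semidefinite matrix. Then for any $\omega$, we have

$\omega^T (L \cdot I - \nabla^2 f(x)) \omega \ge 0$

i.e.

$\omega^T \nabla^2 f(x) \omega \le L \|\omega\|_2^2$

Then, from the second-order Taylor expansion with remainder, there is a point $z$ on the segment between $x$ and $y$ such that

$f(y) = f(x) + \nabla f(x)^T (y - x) + \frac{1}{2} (y - x)^T \nabla^2 f(z) (y - x) \le f(x) + \nabla f(x)^T (y - x) + \frac{L}{2} \|y - x\|_2^2$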
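To see that this step uses only the upper bound on the Hessian, the sketch below takes a quadratic $f(x) = \frac{1}{2} x^T A x$ with an indefinite $2 \times 2$ matrix $A$, so $f$ is not convex. The matrix and all helper names are hypothetical choices for illustration. For a quadratic the Taylor expansion is exact with $\nabla^2 f = A$, and $L$ is taken as the largest eigenvalue of $A$, which is the smallest constant satisfying (3). Note that this $L$ is only claimed to satisfy (3) and (4) here, not (1).

```python
import math
import random

# Hypothetical indefinite symmetric matrix: eigenvalues are -1 +/- 2*sqrt(2).
A = ((1.0, 2.0), (2.0, -3.0))


def largest_eigenvalue(m):
    # Closed form for a symmetric 2x2 matrix [[a, b], [b, d]].
    (a, b), (_, d) = m
    return 0.5 * (a + d) + math.hypot(0.5 * (a - d), b)


def matvec(m, v):
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def f(x):
    # Quadratic f(x) = 0.5 * x^T A x; its gradient is A x, its Hessian is A.
    return 0.5 * dot(x, matvec(A, x))


def grad(x):
    return matvec(A, x)


def upper_bound_holds(num_pairs=1000, seed=0, tol=1e-9):
    """Check f(y) <= f(x) + grad(x).(y-x) + (L/2)|y-x|^2 on random pairs."""
    L = largest_eigenvalue(A)
    rng = random.Random(seed)
    for _ in range(num_pairs):
        x = (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))
        y = (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))
        d = (y[0] - x[0], y[1] - x[1])
        bound = f(x) + dot(grad(x), d) + 0.5 * L * dot(d, d)
        if f(y) > bound + tol:
            return False
    return True


if __name__ == "__main__":
    print(upper_bound_holds())
```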

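(iv) (4) to (1) ($f$ needs to be convex):

Fix $x$, $y$ and $t > 0$, and apply (4) with $y$ replaced by the point $x + t(\nabla f(y) - \nabla f(x))$:

$f(x + t(\nabla f(y) - \nabla f(x))) \le f(x) + t \nabla f(x)^T (\nabla f(y) - \nabla f(x)) + \frac{L t^2}{2} \|\nabla f(y) - \nabla f(x)\|_2^2$

From the convexity of $f$ (the first-order lower bound at $y$), we have

$f(x + t(\nabla f(y) - \nabla f(x))) \ge f(y) + \nabla f(y)^T (x - y + t(\nabla f(y) - \nabla f(x)))$

Combining these two inequalities,

$f(y) - f(x) + \nabla f(y)^T (x - y) + t \|\nabla f(y) - \nabla f(x)\|_2^2 \le \frac{L t^2}{2} \|\nabla f(y) - \nabla f(x)\|_2^2$

Adding to this the inequality (4) with the roles of $x$ and $y$ exchanged,

$f(x) - f(y) - \nabla f(y)^T (x - y) \le \frac{L}{2} \|y - x\|_2^2$

gives

$t \|\nabla f(y) - \nabla f(x)\|_2^2 \le \frac{L t^2}{2} \|\nabla f(y) - \nabla f(x)\|_2^2 + \frac{L}{2} \|y - x\|_2^2$

Letting $t = \frac{1}{L}$,

$\Rightarrow \|\nabla f(x) - \nabla f(y)\|_2^2 \le L^2 \|x - y\|_2^2$

which is (1) after taking square roots.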
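With $t = 1/L$, the combined inequality from this step reads $f(x) \ge f(y) + \nabla f(y)^T (x - y) + \frac{1}{2L} \|\nabla f(x) - \nabla f(y)\|_2^2$. The sketch below checks this numerically on random pairs. It assumes, purely for illustration, the convex softplus function $f(x) = \log(1 + e^x)$ with $L = 1/4$, and the helper name `min_slack` is hypothetical.

```python
import math
import random

L = 0.25  # assumed example: softplus has f''(x) <= 1/4


def f(x):
    # Softplus log(1 + e^x), written in a numerically stable form.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def grad(x):
    # Derivative of softplus: the logistic sigmoid, computed stably.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def min_slack(num_pairs=1000, seed=0):
    """Smallest sampled value of
    f(x) - f(y) - f'(y)(x - y) - (1 / (2L)) (f'(x) - f'(y))^2.
    """
    rng = random.Random(seed)
    worst = math.inf
    for _ in range(num_pairs):
        x = rng.uniform(-10.0, 10.0)
        y = rng.uniform(-10.0, 10.0)
        d = grad(y) - grad(x)
        slack = f(x) - f(y) - grad(y) * (x - y) - d * d / (2.0 * L)
        worst = min(worst, slack)
    return worst


if __name__ == "__main__":
    # A non-negative result (up to rounding) is consistent with the inequality.
    print(min_slack())
```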

2. Summary

From the previous section, we know that

if $f$ is convex (and twice differentiable, so that (3) makes sense), then conditions (1) to (4) are equivalent: each of them characterizes $\nabla f$ being $L$-Lipschitz continuous.