
Paper Notes: Augmented Lagrange Multiplier Method for Recovery of Low-Rank Matrices

2016-11-04 20:00

Paper title: The Augmented Lagrange Multiplier Method for Exact Recovery of Corrupted Low-Rank Matrices

Abstract

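1. The Robust PCA (RPCA) problem: recovering a low-rank matrix when an unknown fraction of its entries is arbitrarily corrupted.

RPCA can be posed as a convex optimization problem that minimizes a combination of the nuclear norm and the L1 norm.

2. The paper proposes a better and faster algorithm for RPCA, based on the method of augmented Lagrange multipliers (ALM).

Compared with the accelerated proximal gradient (APG) method used before, it is about five times faster, more accurate, and needs less memory.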

3. ALM is also applied to matrix completion.

4. MATLAB code: http://perception.csl.illinois.edu/matrix-rank/home.html

Introduction

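1. PCA assumes that the given high-dimensional data lie near a much lower-dimensional linear subspace.

2. PCA can be derived in several ways. The usual one is the maximum-variance view; the paper takes a different angle and states PCA as a low-rank approximation problem, (1).

In (1), D is the m×n data matrix, A is the low-rank matrix, E is their difference, and r, much smaller than min(m, n), is the dimension of the target subspace. The Frobenius norm in the objective corresponds to the assumption that the data are corrupted by i.i.d. Gaussian noise.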
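Written out from the description above (the symbols are the ones defined there; the layout is my reconstruction), formulation (1) is:

```latex
\[
\min_{A,E}\ \|E\|_F
\quad \text{subject to} \quad
\operatorname{rank}(A) \le r,\quad D = A + E. \tag{1}
\]
```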

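The problem is solved by computing the SVD of D, keeping the first r left singular vectors, and projecting the columns of D onto the subspace they span.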
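The SVD recipe above can be sketched in a few lines of NumPy. This is only an illustration: the function name `pca_low_rank` and the synthetic data are my own, and no mean-centering is done because formulation (1) has none.

```python
import numpy as np

def pca_low_rank(D, r):
    """Rank-r approximation of D: project its columns onto the top-r left singular vectors."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    U_r = U[:, :r]                  # first r left singular vectors
    return U_r @ (U_r.T @ D)        # projection of the columns of D onto span(U_r)

# Synthetic check: a rank-3 matrix plus small i.i.d. Gaussian noise.
rng = np.random.default_rng(0)
A_true = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 40))
D = A_true + 0.01 * rng.standard_normal((50, 40))
A_hat = pca_low_rank(D, 3)
rel_err = np.linalg.norm(A_hat - A_true) / np.linalg.norm(A_true)
print(f"relative error: {rel_err:.2e}")
```

With small dense noise the estimate stays close to `A_true`; replacing the noise with a few very large entries is a quick way to see the breakdown described in the next item.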

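3. PCA works very well when the data are corrupted by additive Gaussian noise, and it still does well in practice as long as the noise magnitude is small.

If A is arbitrarily corrupted (that is, E can be arbitrarily large), the A recovered by PCA can be arbitrarily far from the true one.

4. RPCA:

[1] shows that when E is sufficiently sparse (relative to the rank of A), A can be recovered exactly by solving a convex optimization problem, referred to below as (2).

In (2), the first term is the nuclear norm of A (the sum of its singular values), the second term is the L1 norm of E (the sum of the absolute values of its entries), and λ is a positive weighting parameter.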
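Putting the two terms together with the constraint that A and E add up to the data D, problem (2) can be written as follows (again my reconstruction from the description):

```latex
\[
\min_{A,E}\ \|A\|_* + \lambda\,\|E\|_1
\quad \text{subject to} \quad D = A + E, \tag{2}
\]
\[
\|A\|_* = \sum_i \sigma_i(A), \qquad \|E\|_1 = \sum_{i,j} |E_{ij}| .
\]
```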

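RPCA also copes well with large errors, outliers, and gross corruption. One application is background modeling.

5. Problem (2) can be recast as a standard convex program and solved with any interior-point solver; the CVX toolbox [2] is recommended.

Interior-point methods converge quickly on small matrices but become very slow once the matrix is larger than about 70×70, because they rely on second-order information of the objective function.

Solving with iterative thresholding (IT):

for the L1-norm term of the objective, see [3, 4, 5, 6];

for the nuclear-norm term, see [7].

[1] (there are two versions of [1] with the same title, one using IT and the other using APG) applies IT by combining [3] and [7], but convergence is very slow: for m = 800 it takes about 8 hours.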

[8] proposes two algorithms:

The first is an accelerated proximal gradient (APG) algorithm applied to the primal problem. It is a direct application of the FISTA framework introduced by [4], coupled with a fast continuation technique (similar techniques have been applied to the matrix completion problem by [9]). The second is a gradient-ascent algorithm applied to the dual of problem (2). In simulations with matrices of dimension up to m = 1000, both methods are at least 50 times faster than the iterative thresholding method.

6. This paper uses ALM for matrix recovery. The exact ALM (EALM) has a Q-linear convergence speed, whereas APG is in theory only sub-linear. The inexact ALM (IALM) needs significantly fewer partial SVDs and is at least five times faster than APG, and its precision is also higher.

The number of non-zeros in E computed by IALM is also much more accurate than with APG, which often leaves many small non-zero entries in E.

Previous Algorithms for Matrix Recovery

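The Iterative Thresholding Approach [1]

[1] uses IT to solve a relaxed convex version of (2), in which two small quadratic terms are added to the objective.

τ is a large positive scalar, so the two added terms have little effect on the objective. A Lagrange multiplier Y is then introduced for the constraint, which gives the Lagrangian (4).

The IT approach then updates A, E, and Y iteratively:

1. Fix Y and minimize L(A, E, Y) to update A and E.

2. Use the residual of the constraint A + E = D to update Y (differentiate L with respect to Y and take a gradient step; since the Lagrangian is maximized over Y, this is an ascent step).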
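As a sketch of the formulas behind these two steps (reconstructed from the description above; δ_k is my name for the step size of the Y update):

```latex
% relaxed problem: two small quadratic terms are added to (2)
\[
\min_{A,E}\ \|A\|_* + \lambda\|E\|_1 + \frac{1}{2\tau}\|A\|_F^2 + \frac{1}{2\tau}\|E\|_F^2
\quad \text{subject to} \quad A + E = D,
\]
% Lagrangian with multiplier Y
\[
L(A,E,Y) = \|A\|_* + \lambda\|E\|_1 + \frac{1}{2\tau}\|A\|_F^2 + \frac{1}{2\tau}\|E\|_F^2
         + \frac{1}{\tau}\,\langle Y,\ D - A - E \rangle, \tag{4}
\]
% one IT iteration
\[
(A_{k+1}, E_{k+1}) = \arg\min_{A,E} L(A, E, Y_k), \qquad
Y_{k+1} = Y_k + \delta_k\,(D - A_{k+1} - E_{k+1}).
\]
```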

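The updates rely on the soft-thresholding (shrinkage) operator with threshold ε > 0.

Two results from [7, 3], labeled (6), are important here and are used in many papers: one solves the nuclear-norm subproblem, the other the L1-norm subproblem.

In the nuclear-norm result, USV^T denotes the SVD of W.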
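For reference, the operator and the two results can be written as follows (standard forms, with W = USV^T as above):

```latex
% soft-thresholding (shrinkage), applied entrywise to matrices
\[
\mathcal{S}_\varepsilon[x] =
\begin{cases}
x - \varepsilon, & x > \varepsilon,\\
x + \varepsilon, & x < -\varepsilon,\\
0, & \text{otherwise},
\end{cases}
\]
% closed-form minimizers for the nuclear-norm and L1-norm subproblems
\[
U\,\mathcal{S}_\varepsilon[S]\,V^T = \arg\min_X\ \varepsilon\|X\|_* + \tfrac12\|X - W\|_F^2,
\qquad
\mathcal{S}_\varepsilon[W] = \arg\min_X\ \varepsilon\|X\|_1 + \tfrac12\|X - W\|_F^2. \tag{6}
\]
```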

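To minimize (4), optimize over A and E separately: merge the inner-product term and the Frobenius-norm term into a single Frobenius-norm term (complete the square), then apply (6). I will not write out the derivation; it should be easy to see.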
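The two closed-form solutions in (6) take only a few lines of NumPy. A minimal sketch, where the function names `soft_threshold` and `svt` are my own:

```python
import numpy as np

def soft_threshold(W, eps):
    """Entrywise shrinkage: minimizer of eps*||X||_1 + 0.5*||X - W||_F^2."""
    return np.sign(W) * np.maximum(np.abs(W) - eps, 0.0)

def svt(W, eps):
    """Singular value thresholding: minimizer of eps*||X||_* + 0.5*||X - W||_F^2."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U * soft_threshold(s, eps)) @ Vt   # shrink the singular values, keep U and V

W = np.array([[3.0, -0.5],
              [0.2, -2.0]])
X = soft_threshold(W, 1.0)          # entries with |w| <= 1 become 0, the rest move 1 toward 0
Z = svt(np.diag([3.0, 0.5]), 1.0)   # singular values 3 and 0.5 become 2 and 0
```

Every algorithm in these notes (IT, APG, EALM, IALM) reduces its A and E updates to these two operators; only the matrix being thresholded and the threshold differ.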

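Although IT is very simple and its theory is well established, it needs many iterations to converge, and the step size (learning rate) for Y is hard to choose.

The Accelerated Proximal Gradient Approach

For the theory of APG see [4, 10, 11]; [4] walks through three algorithms, from PG to APG to APGL.

Steps 4 and 7 of the paper's APG algorithm follow from (8). When minimizing over A, the terms that involve only E and Y can be dropped, and the same holds when minimizing over E. The second and third terms of (8) can be merged into the form of the second term of the two formulas in (6), after which (6) gives the solution; this yields the algorithm.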

The Dual Approach

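First, some background on dual norms.

The dual norm of a norm is defined as a supremum of inner products over the unit ball of that norm.

sup{} is the supremum. It can mostly be read as max; the difference is that the supremum need not be attained by any element of the set.

The dual norm of the dual norm is the original norm.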
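In symbols, with the usual matrix inner product (the standard definition, stated here for completeness):

```latex
\[
\|Y\|^{*} = \sup\bigl\{\, \langle X, Y \rangle \;:\; \|X\| \le 1 \,\bigr\},
\qquad
\langle X, Y \rangle = \operatorname{tr}(X^T Y).
\]
```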

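Some common dual norms:

1. The dual norm of the Frobenius norm is the Frobenius norm itself.

2. The dual norm of the L1 norm is the L∞ norm (the largest absolute value among the entries). The dual approach below uses this: the second argument of max(·) in (10), an L∞ norm, corresponds to the L1 norm in (2), and its λ is the same λ as in (2).

3. The dual norm of the spectral norm (the matrix 2-norm induced by the vector 2-norm, equal to the largest singular value) is the nuclear norm. The dual approach below uses this as well: it is the first argument of max(·) in (10).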
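The three pairs above can be checked numerically: for each norm there is a matrix Y of unit norm whose inner product with X equals the dual norm of X. A small sketch (the random test matrix and the variable names are my own):

```python
import numpy as np

def inner(A, B):
    """Matrix inner product <A, B> = trace(A^T B)."""
    return float(np.sum(A * B))

rng = np.random.default_rng(1)
X = rng.standard_normal((6, 4))
U, s, Vt = np.linalg.svd(X, full_matrices=False)

fro = np.linalg.norm(X, "fro")   # Frobenius norm
l1 = np.abs(X).sum()             # entrywise L1 norm
nuc = s.sum()                    # nuclear norm (sum of singular values)

Y_fro = X / fro       # unit Frobenius norm, attains ||X||_F
Y_inf = np.sign(X)    # unit L-infinity norm, attains ||X||_1
Y_spec = U @ Vt       # unit spectral norm, attains ||X||_*

print(inner(X, Y_fro) - fro, inner(X, Y_inf) - l1, inner(X, Y_spec) - nuc)
```

All three differences are zero up to rounding, which is the "sup is attained" case for each pair.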

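Besides APG, the second method in [8] solves the dual problem of (2), labeled (10). (The FaLRTC tensor completion algorithm also uses dual norms.) Note that the first argument of max(·) in (10) is the spectral norm; because the spectral norm is the norm induced by p = 2, it is written with a subscript 2.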
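Based on the description above, the dual problem (10) has the following form (my reconstruction; J is the name I use for the max of the two norms, and the L∞ norm is the largest absolute entry):

```latex
\[
\max_{Y}\ \langle D, Y \rangle
\quad \text{subject to} \quad J(Y) \le 1,
\qquad
J(Y) = \max\bigl( \|Y\|_2,\ \lambda^{-1}\|Y\|_\infty \bigr). \tag{10}
\]
```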

The Methods of Augmented Lagrange Multipliers

ALM in brief: compared with the ordinary Lagrange multiplier method, it adds the third term in (12).
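For a general equality-constrained problem, the augmented Lagrangian and the basic iteration look like this (the standard form of the method, written in the notation of these notes):

```latex
% general equality-constrained problem
\[
\min_X\ f(X) \quad \text{subject to} \quad h(X) = 0,
\]
% augmented Lagrangian: the last (penalty) term is what ALM adds
\[
L(X, Y, \mu) = f(X) + \langle Y, h(X) \rangle + \frac{\mu}{2}\,\|h(X)\|_F^2, \tag{12}
\]
% one ALM iteration
\[
X_{k+1} = \arg\min_X L(X, Y_k, \mu_k), \qquad
Y_{k+1} = Y_k + \mu_k\, h(X_{k+1}), \qquad
\mu_{k+1} \ge \mu_k .
\]
```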

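Advantages of ALM:

1. Convergence:

When {μ_k} is an increasing sequence and f and h are both continuously differentiable:

{μ_k} is bounded: Q-linear convergence

{μ_k} is unbounded: super-Q-linear convergence

Definitions of the various convergence rates: https://en.wikipedia.org/wiki/Rate_of_convergence

2. The optimal step size (learning rate) for the Y_k update in line 4 of Algorithm 3 can be shown to be μ_k, which makes parameter tuning easier.

3. ALM converges to the exact solution, whereas the earlier IT and APG methods only give approximate solutions.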
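To see the generic ALM iteration in action, here is a toy sketch on a problem that is not from the paper: minimize 0.5·‖x − c‖² subject to a·x = b, whose exact solution is known in closed form. The function name `alm_equality_qp`, the starting value of μ, and the growth factor `rho` are all my own choices.

```python
import numpy as np

def alm_equality_qp(c, a, b, mu=1.0, rho=2.0, iters=20):
    """ALM for: min 0.5*||x - c||^2  subject to  a.x = b, with h(x) = a.x - b."""
    n = c.size
    x = np.zeros(n)
    y = 0.0
    for _ in range(iters):
        # x-step: minimize f(x) + y*h(x) + (mu/2)*h(x)^2 exactly.
        # Setting the gradient to zero gives (I + mu*a*a^T) x = c - y*a + mu*b*a.
        H = np.eye(n) + mu * np.outer(a, a)
        x = np.linalg.solve(H, c - y * a + mu * b * a)
        # Multiplier step, using mu itself as the step size.
        y = y + mu * (a @ x - b)
        # Increasing penalty sequence.
        mu *= rho
    return x, y

c = np.array([1.0, 2.0, 3.0])
a = np.array([1.0, 1.0, 1.0])
b = 1.0
x_alm, y_alm = alm_equality_qp(c, a, b)
x_exact = c - a * (a @ c - b) / (a @ a)   # projection of c onto the hyperplane a.x = b
```

The multiplier step uses μ as its step size, which is the point made in item 2 above: there is no separate learning rate to tune.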

Two ALM Algorithms for Robust PCA (Matrix Recovery)

ALM is applied to (2) by taking X = (A, E), f(X) = ‖A‖_* + λ‖E‖_1, and h(X) = D − A − E.
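Substituting these choices into (12) gives the augmented Lagrangian for RPCA:

```latex
\[
L(A, E, Y, \mu) = \|A\|_* + \lambda\|E\|_1
  + \langle Y,\ D - A - E \rangle
  + \frac{\mu}{2}\,\|D - A - E\|_F^2 .
\]
```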

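A few points to note:

1. How the initial value of Y is chosen.

2. Because (2) is non-smooth, the convergence results for ALM cannot be applied directly, but the paper proves that the conclusions still hold.

3. The faster μ_k grows, the faster the method converges. But when μ_k becomes too large, the IT iterations that solve the two subproblems in A and E (lines 3 and 6 of Algorithm 4) slow down, so there is a trade-off.

The exact ALM (EALM) is Algorithm 4 in the paper.

Note the while loop in line 5 of Algorithm 4: it minimizes over A and E repeatedly to obtain the result of line 3. This makes the algorithm run too many SVDs, hence the following improvement.

In fact the while loop is not needed and a single pass is enough (my feeling is that this is essentially the alternating direction method), which gives the inexact ALM (IALM).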
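A minimal NumPy sketch of the single-pass idea follows. It is not the authors' MATLAB code: the function name `ialm_rpca`, the full (rather than partial) SVD, and the defaults for λ, the initial μ, the growth factor `rho`, the cap on μ, and the stopping tolerance are all assumptions made for this illustration.

```python
import numpy as np

def shrink(W, eps):
    """Entrywise soft-thresholding."""
    return np.sign(W) * np.maximum(np.abs(W) - eps, 0.0)

def ialm_rpca(D, lam=None, rho=1.5, tol=1e-7, max_iter=500):
    """Inexact ALM sketch for: min ||A||_* + lam*||E||_1  subject to  D = A + E."""
    m, n = D.shape
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam   # assumed default weight
    norm_D = np.linalg.norm(D, "fro")
    spec = np.linalg.norm(D, 2)
    Y = D / max(spec, np.abs(D).max() / lam)   # scale D so that max(||Y||_2, ||Y||_inf/lam) = 1
    mu = 1.25 / spec                           # assumed initial penalty
    mu_max = 1e7 * mu                          # assumed cap on the penalty
    A = np.zeros_like(D)
    E = np.zeros_like(D)
    for _ in range(max_iter):
        # One pass over A (singular value thresholding), then one pass over E (shrinkage).
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        A = (U * shrink(s, 1.0 / mu)) @ Vt
        E = shrink(D - A + Y / mu, lam / mu)
        R = D - A - E                          # constraint residual
        Y = Y + mu * R                         # multiplier update with step size mu
        mu = min(rho * mu, mu_max)
        if np.linalg.norm(R, "fro") <= tol * norm_D:
            break
    return A, E

# Synthetic test: rank-3 matrix with 5% of the entries grossly corrupted.
rng = np.random.default_rng(0)
n, r = 100, 3
A_true = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
E_true = np.zeros((n, n))
mask = rng.random((n, n)) < 0.05
E_true[mask] = rng.uniform(-10.0, 10.0, size=mask.sum())
D = A_true + E_true
A_hat, E_hat = ialm_rpca(D)
rel_err = np.linalg.norm(A_hat - A_true) / np.linalg.norm(A_true)
print(f"relative error in A: {rel_err:.2e}")
```

Each iteration costs one SVD, compared with one SVD per inner iteration of the while loop in the exact version.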

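The paper proves that IALM also converges to the optimal solution. Its convergence rate is hard to establish in theory, but experiments indicate that it still converges Q-linearly. When μ_k grows too fast, however, convergence to the optimal solution is no longer guaranteed, so choosing how fast μ_k grows is a trade-off; see Theorem 2 on p. 9 of the paper.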

An ALM Algorithm for Matrix Completion

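Everything above was about matrix recovery; the same approach also works for matrix completion.

The matrix completion problem minimizes the nuclear norm of A subject to A agreeing with D on the observed entries.

It is rewritten in a form similar to (2), which gives (15),

and solved through the partial augmented Lagrangian function (16).

When updating E, the constraint (the second constraint in (15)) should be enforced while minimizing (16).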
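Written out, the three formulas read as follows. This is a reconstruction: Ω denotes the set of observed index pairs and π_Ω keeps the entries in Ω and sets the others to zero, both of which are notation I am assuming here.

```latex
% matrix completion
\[
\min_A\ \|A\|_* \quad \text{subject to} \quad A_{ij} = D_{ij}\ \ \text{for all } (i,j) \in \Omega,
\]
% the same problem in a form similar to (2): E absorbs the unobserved entries
\[
\min_{A,E}\ \|A\|_* \quad \text{subject to} \quad A + E = D,\quad \pi_\Omega(E) = 0, \tag{15}
\]
% partial augmented Lagrangian: only the constraint A + E = D is dualized
\[
L(A, E, Y, \mu) = \|A\|_* + \langle Y,\ D - A - E \rangle + \frac{\mu}{2}\,\|D - A - E\|_F^2 . \tag{16}
\]
```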

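Everything else is exactly the same as before, and the resulting algorithm has the same structure as IALM.

Because of line 7 of the algorithm, Y is always 0 at the positions that need to be filled in.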
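A minimal NumPy sketch of this matrix completion variant is given below. The function name `ialm_mc`, the boolean mask `observed`, and the values of the initial μ, the growth factor `rho`, and the cap on μ are assumptions for illustration, not the paper's settings.

```python
import numpy as np

def ialm_mc(D, observed, rho=1.2, tol=1e-7, max_iter=1000):
    """Inexact ALM sketch for: min ||A||_*  s.t.  A + E = D,  E = 0 on observed entries."""
    D = np.where(observed, D, 0.0)        # unobserved entries of D are set to 0
    norm_D = np.linalg.norm(D, "fro")
    mu = 1.0 / np.linalg.norm(D, 2)       # assumed initial penalty
    mu_max = 1e7 * mu                     # assumed cap on the penalty
    A = np.zeros_like(D)
    E = np.zeros_like(D)
    Y = np.zeros_like(D)
    for _ in range(max_iter):
        # A-step: singular value thresholding with threshold 1/mu.
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        A = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # E-step: minimize (16) over E while enforcing E = 0 on the observed entries.
        E = np.where(observed, 0.0, D - A + Y / mu)
        R = D - A - E
        Y = Y + mu * R                    # R vanishes off the observed set, so Y stays 0 there
        mu = min(rho * mu, mu_max)
        if np.linalg.norm(R, "fro") <= tol * norm_D:
            break
    return A, E, Y

# Synthetic test: rank-2 matrix with half of the entries observed.
rng = np.random.default_rng(0)
n, r = 80, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
observed = rng.random((n, n)) < 0.5
A_hat, E_hat, Y_hat = ialm_mc(M, observed)
rel_err = np.linalg.norm(A_hat - M) / np.linalg.norm(M)
print(f"relative error: {rel_err:.2e}")
```

The returned `Y_hat` is zero on every unobserved position, which matches the remark about Y above.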

References

[1] Wright, J., Ganesh, A., Rao, S., Ma, Y.: Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization. Submitted to Journal of the ACM (2009)

[2] Grant, M., Boyd, S.: CVX: Matlab software for disciplined convex programming (web page and software). http://stanford.edu/~boyd/cvx (2009)

[3] Yin, W., Hale, E., Zhang, Y.: Fixed-point continuation for L1-minimization: methodology and convergence. Preprint (2008)

[4] Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences 2(1), 183-202 (2009)

[5] Yin, W., Osher, S., Goldfarb, D., Darbon, J.: Bregman iterative algorithms for L1-minimization with applications to compressed sensing. SIAM Journal on Imaging Sciences 1(1), 143-168 (2008)

[6] Cai, J.F., Osher, S., Shen, Z.: Linearized Bregman iterations for compressed sensing. Math. Comp. 78, 1515-1536 (2009)

[7] Cai, J., Candès, E., Shen, Z.: A singular value thresholding algorithm for matrix completion. Preprint, code available at http://svt.caltech.edu/code.html (2008)

[8] Lin, Z., Ganesh, A., Wright, J., Wu, L., Chen, M., Ma, Y.: Fast convex optimization algorithms for exact recovery of a corrupted low-rank matrix. SIAM J. Optimization

[9] Toh, K.C., Yun, S.: An accelerated proximal gradient algorithm for nuclear norm regularized least squares problems. Preprint (2009)

[10] Tseng, P.: On accelerated proximal gradient methods for convex-concave optimization. Submitted to SIAM Journal on Optimization (2008)

[11] Wright, J., Ganesh, A., Rao, S., Peng, Y., Ma, Y.: Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization. In: Proceedings of Advances in Neural Information Processing Systems (2009)
