A Gentle Introduction to Singular-Value Decomposition for Machine Learning
2018-02-26 15:59
https://machinelearningmastery.com/singular-value-decomposition-for-machine-learning/

Matrix decomposition, also called matrix factorization, describes a given matrix in terms of its constituent elements.
The best-known matrix decomposition method is the singular-value decomposition, or SVD for short.
Every matrix has a singular-value decomposition, and it is commonly used for compression, denoising, and data reduction.
The following sections show how to compute and use the SVD.
Singular-Value Decomposition
A = U.Sigma.V^T
where A is the given m x n matrix,
U is an m x m square matrix,
Sigma is an m x n "diagonal" matrix whose diagonal entries are called the singular values of A,
and V is an n x n square matrix (V^T is its transpose).
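As a quick sketch of these definitions (using NumPy and SciPy, which are also used throughout this article), we can check the shapes of the factors and verify that U and V are orthogonal:

```python
# Sketch: check the shapes and orthogonality of the SVD factors
from numpy import array, allclose, eye
from scipy.linalg import svd

A = array([[1, 2], [3, 4], [5, 6]])  # m=3, n=2
U, s, VT = svd(A)                    # scipy returns V already transposed
print(U.shape, s.shape, VT.shape)    # (3, 3) (2,) (2, 2)
# U and V are orthogonal: U^T U = I and V^T V = I
print(allclose(U.T.dot(U), eye(3)))  # True
print(allclose(VT.dot(VT.T), eye(2)))  # True
```

Note that `s` is returned as a vector of singular values, not as the full m x n Sigma matrix.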
The SVD is widely used in matrix computations, such as computing the matrix inverse, and in machine learning for data reduction.
It can also be used for least-squares linear regression, image compression, and data denoising.
Calculating the SVD
```python
# Singular-value decomposition
from numpy import array
from scipy.linalg import svd

# define a 3x2 matrix
A = array([[1, 2], [3, 4], [5, 6]])
print('A:', A)
# factorize; note that svd() returns V already transposed
U, s, V = svd(A)
print('U:', U)
print('Sigma:', s)
print('V:', V)
```
```
A: [[1 2]
 [3 4]
 [5 6]]
U: [[-0.2298477   0.88346102  0.40824829]
 [-0.52474482  0.24078249 -0.81649658]
 [-0.81964194 -0.40189603  0.40824829]]
Sigma: [ 9.52551809  0.51430058]
V: [[-0.61962948 -0.78489445]
 [-0.78489445  0.61962948]]
```
Reconstructing the Matrix from the SVD
The original matrix can be reconstructed from U, Sigma, and V.

```python
# reconstruct a matrix from its SVD
from numpy import array, diag, dot, zeros
from scipy.linalg import svd

# define a 3x2 matrix
A = array([[1, 2], [3, 4], [5, 6]])
print('A:', A)
# singular-value decomposition
U, s, V = svd(A)
# create an m x n Sigma matrix of zeros
Sigma = zeros((A.shape[0], A.shape[1]))
# populate Sigma with the n x n diagonal matrix of singular values
Sigma[:A.shape[1], :A.shape[1]] = diag(s)
# reconstruct the matrix
B = U.dot(Sigma.dot(V))
print('B:', B)
```
```
A: [[1 2]
 [3 4]
 [5 6]]
B: [[ 1.  2.]
 [ 3.  4.]
 [ 5.  6.]]
```
If A is a square matrix, then Sigma is an n x n diagonal matrix and can be constructed directly as Sigma = diag(s), with no zero-padding required.
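A minimal sketch of the square case (with a 2x2 matrix chosen here for illustration):

```python
# reconstruction for a square matrix: Sigma is simply diag(s)
from numpy import array, diag, allclose
from scipy.linalg import svd

A = array([[1, 2], [3, 4]])  # square 2x2 matrix
U, s, V = svd(A)             # V is returned already transposed
B = U.dot(diag(s)).dot(V)    # no zero-padding needed
print(allclose(A, B))        # True
```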
SVD for the Pseudoinverse
The pseudoinverse generalizes matrix inversion to matrices that are not square. It is also known as the Moore-Penrose inverse and is denoted A^+.
It can be computed from the SVD as

A^+ = V . D^+ . U^T

where D^+ is the pseudoinverse of the diagonal matrix Sigma, obtained by taking the reciprocal of each non-zero singular value on the diagonal and transposing the result.
```python
# Pseudoinverse
from numpy import array
from numpy.linalg import pinv

# define a 4x2 matrix
A = array([
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
    [0.7, 0.8]])
print('A:', A)
# calculate the pseudoinverse
B = pinv(A)
print('B:', B)
```
```
A: [[ 0.1  0.2]
 [ 0.3  0.4]
 [ 0.5  0.6]
 [ 0.7  0.8]]
B: [[ -1.00000000e+01  -5.00000000e+00   9.04831765e-15   5.00000000e+00]
 [  8.50000000e+00   4.50000000e+00   5.00000000e-01  -3.50000000e+00]]
```
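One of the uses mentioned earlier, least-squares linear regression, can be sketched with the pseudoinverse (the toy data below is invented for illustration): for an overdetermined system X b ≈ y, the coefficients b = X^+ . y minimize the squared error.

```python
# sketch: least-squares fit via the pseudoinverse (toy data)
from numpy import array
from numpy.linalg import pinv

# fit y ≈ b0 + b1*x; the first column of ones models the intercept
X = array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = array([6.0, 8.0, 10.0, 12.0])  # exactly y = 4 + 2x
coeffs = pinv(X).dot(y)
print(coeffs)  # approximately [4. 2.]
```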
Computing the Pseudoinverse via the SVD
```python
# Pseudoinverse via SVD
from numpy import array, zeros, diag
from numpy.linalg import svd

# define a 4x2 matrix
A = array([
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
    [0.7, 0.8]])
print('A:', A)
# calculate the svd (numpy's svd also returns V already transposed)
U, s, V = svd(A)
# reciprocals of the singular values
d = 1.0 / s
# create an m x n matrix D and fill its diagonal
D = zeros(A.shape)
D[:A.shape[1], :A.shape[1]] = diag(d)
# A^+ = V . D^+ . U^T  (D.T is the n x m pseudoinverse of Sigma)
B = V.T.dot(D.T).dot(U.T)
print('B:', B)
```
```
A: [[ 0.1  0.2]
 [ 0.3  0.4]
 [ 0.5  0.6]
 [ 0.7  0.8]]
B: [[ -1.00000000e+01  -5.00000000e+00   9.04831765e-15   5.00000000e+00]
 [  8.50000000e+00   4.50000000e+00   5.00000000e-01  -3.50000000e+00]]
```
SVD for Dimensionality Reduction
When data has a large number of features, that is, many more columns (features) than rows (observations), it can be reduced to a smaller subset of features most relevant to the problem.

```python
# dimensionality reduction with the SVD
from numpy import array, diag, zeros
from scipy.linalg import svd

# define a 3x10 matrix
A = array([
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
    [21, 22, 23, 24, 25, 26, 27, 28, 29, 30]])
print('A:', A)
# singular-value decomposition
U, s, V = svd(A)
# create an m x n Sigma matrix
Sigma = zeros((A.shape[0], A.shape[1]))
# populate Sigma with the m x m diagonal matrix (m < n here)
Sigma[:A.shape[0], :A.shape[0]] = diag(s)
# select the top k singular values and vectors
n_elements = 2
Sigma = Sigma[:, :n_elements]
V = V[:n_elements, :]
# reconstruct the rank-2 approximation of A
B = U.dot(Sigma.dot(V))
print('B:', B)
# transform: two equivalent ways to get the reduced representation
T = U.dot(Sigma)
print('T:', T)
T = A.dot(V.T)
print('T:', T)
```
scikit-learn provides the TruncatedSVD class that implements this transform directly.

```python
# dimensionality reduction with scikit-learn
from numpy import array
from sklearn.decomposition import TruncatedSVD

# define a 3x10 matrix
A = array([
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
    [21, 22, 23, 24, 25, 26, 27, 28, 29, 30]])
print('A:', A)
# create the transform, fit it, and apply it
svd = TruncatedSVD(n_components=2)
svd.fit(A)
result = svd.transform(A)
print('result:', result)
```
```
A: [[ 1  2  3  4  5  6  7  8  9 10]
 [11 12 13 14 15 16 17 18 19 20]
 [21 22 23 24 25 26 27 28 29 30]]
result: [[ 18.52157747   6.47697214]
 [ 49.81310011   1.91182038]
 [ 81.10462276  -2.65333138]]
```