Principal Components Analysis (PCA)

2014-05-19 17:03

First, I recommend an article on eigenvalues/eigenvectors that helps in understanding the theory behind PCA: http://blog.sina.com.cn/s/blog_49a1f42e0100fvdu.html

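The idea of PCA is to find the principal axes of the data set and use them to form a new coordinate system with fewer dimensions than the original data. Projecting the original data onto this new coordinate system is the dimensionality reduction step, and the projected data are what we take, somewhat subjectively, to be the features of the data.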

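A drawback of PCA is that it ignores the fact that the order of the vector components in the original data matrix is meaningful: a different order represents completely different information. As a result, PCA cannot tell that a local region of one matrix actually corresponds to a local region at a different position in another matrix.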
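One way to see this insensitivity to component order is the following minimal sketch. It assumes a small made-up data matrix X and an arbitrary reordering p of its columns (both are hypothetical, chosen only for illustration). If the same reordering is applied to every sample, the covariance eigenvalues do not change, so PCA treats the components as an unordered set.

```matlab
% Sketch: PCA does not see the order of the vector components.
% X and p are hypothetical illustration values, not from the post.
X = [1 2 4; 2 1 3; 3 5 1; 4 3 6; 5 6 2; 6 4 5];  % 6 samples x 3 features
p = [3 1 2];                  % arbitrary reordering of the features
Xp = X(:, p);                 % same reordering applied to every sample

% Eigenvalues of the covariance matrix, before and after reordering
lambda  = sort(eig(cov(X)),  'descend');
lambdaP = sort(eig(cov(Xp)), 'descend');

% Largest difference between the two spectra (rounding error only)
maxDiff = max(abs(lambda - lambdaP));
disp(maxDiff);
```

Because the spectrum is unchanged, any information carried by which components are neighbours (for example adjacent pixels in an image) is invisible to PCA.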

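Steps of the PCA algorithm:

(1) Compute the sample feature mean meanX and the covariance matrix covX.

(2) Diagonalize covX to obtain the transformation matrix W and the corresponding eigenvalues. W defines the new coordinate system, and its eigenvectors are ordered by their eigenvalues from largest to smallest.

(3) Project the original data X onto the coordinate system W to obtain the reduced-dimension data Y.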
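The three steps can be sketched compactly as follows. This is only an illustrative sketch under stated assumptions: the toy matrix X and the number of retained components k are hypothetical, and in this sketch the principal axes are stored as the columns of W.

```matlab
% Minimal PCA sketch: rows of X are samples, columns are features.
X = [2 0; 0 2; 4 2; 2 4; 3 3; 1 1];   % hypothetical toy data
k = 1;                                % components to keep (assumed)

% (1) Mean and covariance of the samples
meanX = mean(X);
covX  = cov(X);

% (2) Eigen-decomposition, sorted by eigenvalue in descending order
[V, D] = eig(covX);
[lambda, idx] = sort(diag(D), 'descend');
W = V(:, idx(1:k));                   % columns of W = top-k principal axes

% (3) Project the centered data onto W
Xc = bsxfun(@minus, X, meanX);        % subtract the mean from every row
Y  = Xc * W;                          % m-by-k reduced data
```

The explicit sort is there because the order in which eig returns eigenvalues should not be relied on. The variance of the first column of Y equals the largest eigenvalue, which is a quick sanity check on the projection.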

Here is a MATLAB implementation of PCA:

%PCA

%Feature matrix cx. Each column represents a feature and
%each row a data sample

cx = [1.4000 1.5500
3.0000 3.2000
0.6000 0.7000
2.2000 2.3000
1.8000 2.1000
2.0000 1.6000
1.0000 1.1000
2.5000 2.4000
1.5000 1.6000
1.2000 0.8000
2.1000 2.5000];
[m, n] = size(cx);

%Data Graph
figure(1);
plot(cx(:,1),cx(:,2),'k+');    hold on;    %Data
plot(([0,0]),([-1,4]),'k-');   hold on;    %Y axis (line x = 0)
plot(([-1,4]),([0,0]),'k-');               %X axis (line y = 0)
axis([-1,4,-1,4]);
xlabel('Feature 1');
ylabel('Feature 2');
title('Original Data');

%Covariance Matrix
covX=cov(cx);
%Covariance Matrix using the matrix definition
meanX=mean(cx);
cx1=cx(:,1)-meanX(1);
cx2=cx(:,2)-meanX(2);
Mcx=[cx1 cx2];
covX=(transpose(Mcx)*(Mcx))/(m-1);

pause();

%Covariance Matrix using alternative definition
meanX=mean(cx);
covX=((transpose(cx)*(cx))/(m-1))-((transpose(meanX)*meanX)*(m/(m-1)));

%Compute Eigenvalues and Eigenvector
[V, L]=eig(covX);   %columns of V = eigenvectors, diag(L) = eigenvalues
W=transpose(V);     %rows of W = eigenvectors, the convention used below

%Eigenvector Graph
figure(2);
plot(cx(:,1), cx(:,2), 'k+');  hold on;
plot(([0,W(1,1)*4]), ([0,W(1,2)*4]),'k-'); hold on;
plot(([0,W(2,1)*4]), ([0,W(2,2)*4]),'k-');
axis([-4,4,-4,4]);
xlabel('Feature 1');
ylabel('Feature 2');
title('Eigenvectors');

%Transform Data
cy=cx*transpose(W);

%Graph Transformed Data
figure(3);
plot(cy(:,1),cy(:,2),'k+');    hold on;
plot(([0,0]),([-1,5]),'k-');   hold on;
plot(([-1,5]),([0,0]),'k-');
axis([-1,5,-1,5]);
xlabel('Feature 1');
ylabel('Feature 2');
title('Transformed Data');

%Classification example
meanY=mean(cy);

%Graph of classification example
figure(4);
plot(([-5,5]),([meanY(2),meanY(2)]),'k:');    hold on;
plot(([0,0]),([-1,5]),'k-');   hold on;
plot(([-1,5]),([0,0]),'k-');   hold on;
plot(cy(:,1),cy(:,2),'k+');    hold on;
axis([-1,5,-1,5]);
xlabel('Feature 1');
ylabel('Feature 2');
title('Classification Example');
legend('Mean','Location','NorthWest');

%Compression example
%Discard the component with the smallest variance
[~, k]=min(var(cy));
cy(:,k)=0;
xr=transpose(transpose(W)*transpose(cy));

%Graph of compression example
figure(5);
plot(xr(:,1),xr(:,2),'k+');    hold on;
plot(([0,0]),([-1,4]),'k-');   hold on;
plot(([-1,4]),([0,0]),'k-');   hold on;
plot(cx(:,1),cx(:,2),'r+');    hold on;
axis([-1,4,-1,4]);
xlabel('Feature 1');
ylabel('Feature 2');
title('Compression Example');
