演示如何使用偏最小二乘回归方法
2014-03-04 07:51
274 查看
http://www.cdpcenter.org/files/plsr/ 点击打开链接
here. The M-file implementing the NIPALS algorithm may be downloaded
here (html version).
Spectra plots
Create stock solutions
Create sample solutions
Create sample spectra
Construct the raw inputs (X) and outputs (Y).
Write the outputs to a text file for analysis in Simca-P
Apply PLSR to the data
Analysis of predictive ability
Analysis of the Y loadings
Analysis of the X loadings
Analysis of the scores
Training the PLSR model mostly with blue solutions
Stock 1 consists of 67% red1, 27% red2, and 6% yellow
Stock 2 consists of 36% red1, 56% red2, and 8% yellow
Stock 3 consists of 95% blue and 5% yellow
Dilutions of stock 1 (samples1)
Dilutions of stock 2 (samples2)
Dilutions of stock 3 (samples3)
Mixtures of stocks 1 and 3 (samples13)
Mixtures of stocks 2 and 3 (samples23)
The two outputs (Yraw) are
the total concentration of red dyes, and
the concentratinon blue dye.
Then, let's extract three principal components
significant.
your data. So if most of your samples are just "blue", then PC1 will be a blue spectrum. If most of your samples are mixtures, then PC1 will be a mixture of the spectra.
Let's confirm this by adding a bunch more blue solutions to our samples.
Published with MATLAB® 7.4
PLSR Demonstration
This is an example of PLSR on simulated spectroscopic data to determine the concentration of red and blue dye in various solutions. The M-file may be downloadedhere. The M-file implementing the NIPALS algorithm may be downloaded
here (html version).
Contents
Constructing the dataSpectra plots
Create stock solutions
Create sample solutions
Create sample spectra
Construct the raw inputs (X) and outputs (Y).
Write the outputs to a text file for analysis in Simca-P
Apply PLSR to the data
Analysis of predictive ability
Analysis of the Y loadings
Analysis of the X loadings
Analysis of the scores
Training the PLSR model mostly with blue solutions
Constructing the data
Let's contruct spectra for 4 dyes (2 red, 1 blue, and 1 yellow) at 40 points (x, that presumabely correspond to wavelengths):randn('state', 0); numInputs = 40; x = linspace(-8,12,numInputs); red1Spectrum = 10*evpdf(-x,-2,2); red2Spectrum = 9.8*evpdf(-x,-1.5,2); blueSpectrum = 4*evpdf(-x,4,1.); yellowSpectrum = 5*evpdf(-x,1,1.2);
Spectra plots
Combine the spectra into one array and plot themdyeSpectra = [ red1Spectrum red2Spectrum blueSpectrum yellowSpectrum]; set(gcf,'DefaultAxesColorOrder',[1 0 0; 0.5 0 0; 0 0 1; 1 1 0]); plot(1:numInputs,dyeSpectra); xlabel('Wavelength') ylabel('Optical Density') legend('red1','red2','blue','yellow') title('Spectra for dyes')
Create stock solutions
Let's have 3 stock solutions:Stock 1 consists of 67% red1, 27% red2, and 6% yellow
Stock 2 consists of 36% red1, 56% red2, and 8% yellow
Stock 3 consists of 95% blue and 5% yellow
stocks= [ 0.67 0.27 0.00 0.06 0.36 0.56 0.00 0.08 0.00 0.00 0.95 0.05];
Create sample solutions
Let's make samples of each stock at different concentrations, and a few mixtures:Dilutions of stock 1 (samples1)
Dilutions of stock 2 (samples2)
Dilutions of stock 3 (samples3)
Mixtures of stocks 1 and 3 (samples13)
Mixtures of stocks 2 and 3 (samples23)
samples1 = [0.01 0.02 0.04 0.08]'*stocks(1,:); samples2 = [0.03 0.06 0.09 0.12]'*stocks(2,:); samples3 = [0.01 0.02 0.04 0.08 0.16]'*stocks(3,:); samples13 = ([0.03 0.05 0.08]'*stocks(1,:) + [0.07 0.05 0.02]'*stocks(3,:)); samples23 = ([0.02 0.03 0.04]'*stocks(2,:) + [0.03 0.02 0.01]'*stocks(3,:)); samples = [samples1; samples2; samples3; samples13; samples23];
Create sample spectra
sampleSpectra = samples*dyeSpectra; sampleSpectra = max(1e-6, sampleSpectra + 0.002*randn(size(sampleSpectra)) ); plot(sampleSpectra(end-2,:),'k-') xlabel('Wavelength') ylabel('Optical Density') title('Spectrum for a mixture of red + blue')
Construct the raw inputs (X) and outputs (Y).
The inputs (Xraw) are just the sample spectra.The two outputs (Yraw) are
the total concentration of red dyes, and
the concentratinon blue dye.
Xraw = sampleSpectra; Yraw = [samples(:,1)+samples(:,2), samples(:,3)];
Write the outputs to a text file for analysis in Simca-P
csvwrite('spec.txt',[Yraw Xraw])
Apply PLSR to the data
First mean-center the data. However, we should not variance scale the data because all the are all in the same units, and larger data points really should be weighted more heavily (because they are more informative).Then, let's extract three principal components
numSamples = size(Xraw,1); numOutputs = size(Yraw,2); Xmean = mean(Xraw,1); X = Xraw - repmat(Xmean, [numSamples 1]); % Xmean = mean(Xraw,2); % X = Xraw - repmat(Xmean, [1 numInputs]); Ymean = mean(Yraw,1); Y = Yraw - repmat(Ymean, [numSamples 1]); % Ymean = mean(Yraw,2); % Y = Yraw - repmat(Ymean, [1 numOutputs]); [P,Q,W,B] = nipals(X,Y,3); T = X*W; U = Y*Q; Ypred = X*W*B*Q'; YrawPred = Ypred + repmat(Ymean, [numSamples 1]); % YrawPred = Ypred + repmat(Ymean, [1 numOutputs]);
Analysis of predictive ability
plot(Yraw(:,1),Yraw(:,2),'kx',YrawPred(:,1),YrawPred(:,2),'bo') xlabel('Red Conc.') ylabel('Blue Conc.') legend('Expt.', 'Pred.') title('Comparison of Predictive Ability')
Analysis of the Y loadings
PC1 is has a positive "red" weight and a negative "blue" weight. PC2 is mostly "blue", but some "red."clf colormap([1 0 0; 0 0 1]); bar(Q') set(gca,'XTickLabel',{'PC1','PC2','PC3'}) legend('Red','Blue'); title('Y Loadings')
Analysis of the X loadings
The X loadings are linear combinations of "red" and "blue" spectra. As indicated by the Y loadings, PC1 is "red" minus "blue", and PC2 is "blue" plus a little "red". PC3 looks like a lot of noise. As we'll later see using Simca-P, PC3 is not statisticallysignificant.
clf plot(1:numInputs,P); legend('PC1','PC2','PC3') xlabel('Arbitrary Wavelength') title('X Loadings');
Analysis of the scores
s = {samples1, samples2, samples3, samples13, samples23}; legendText = {'1','2','3','1+2','2+3'}; symbols = 'xo+s^'; colors = [1 0 0; 0.5 0 0; 0 0 1; 0.7 0 0.7; 0.3 0 0.3]; clf hold on idx1 = 1; for i=1:numel(s) idx2 = idx1 + size(s{i},1) - 1; Ti = T(idx1:idx2,:); plot(Ti(:,1),Ti(:,2),symbols(i),'MarkerFaceColor',colors(i,:),'MarkerEdgeColor',colors(i,:)); idx1 = idx2+1; end xlabel('PC1'); ylabel('PC2'); title('X Scores') legend(legendText{:}); hold off
Training the PLSR model mostly with blue solutions
At first, I was surprised to find that each principal component doesn't correspond to a single color. Then I realized it has a lot to do with the data you use to train the model. The first principal component is the one that explains the most variance inyour data. So if most of your samples are just "blue", then PC1 will be a blue spectrum. If most of your samples are mixtures, then PC1 will be a mixture of the spectra.
Let's confirm this by adding a bunch more blue solutions to our samples.
n = 100; blueConc = 0.5*randn(n,1); samplesB = zeros(n,4); samplesB(:,3) = blueConc; samplesB = [samples; samplesB]; YrawB = [samplesB(:,1)+samplesB(:,2), samplesB(:,3)]; sampleSpectraB = blueConc*blueSpectrum; sampleSpectraB = sampleSpectraB + max(1e-6,0.002*randn(size(sampleSpectraB))); sampleSpectraB = [sampleSpectra; sampleSpectraB]; XrawB = sampleSpectraB; csvwrite('specB.txt',[YrawB XrawB]) numSamplesB = size(XrawB,1); numOutputsB = size(YrawB,2); XmeanB = mean(XrawB,1); XB = XrawB - repmat(XmeanB, [numSamplesB 1]); YmeanB = mean(YrawB,1); YB = YrawB - repmat(YmeanB, [numSamplesB 1]); [PB,QB,WB,BB] = nipals(XB,YB,3); clf colormap([1 0 0; 0 0 1]); bar(QB') set(gca,'XTickLabel',{'PC1','PC2','PC3'}) legend('Red','Blue'); title('Y Loadings')
Published with MATLAB® 7.4
相关文章推荐
- 下面的示例演示如何使用传递到事件处理方法的 GridViewCommandEventArgs 对象确定引发事件的按钮的命令名。
- 如何得知自己正在使用的linux是什么版本呢,下面的几种方法将给你带来答案!
- JSTL演示如何使用
- iOS UIAlertController使用方法如何使用
- handler的使用---如何实现Android计时与倒计时的几种方法
- 如何正确的使用Timer的schedule()方法?
- react路由的使用方法以及通过路由如何传参传递私有属性--【基于最新版本的react-router-dom(4.2.2)】
- [VB.NET]如何在第二个窗体里使用第一个窗体的控件方法
- 如何使用Android隐藏类和隐藏方法
- 如何使用struts2中的request、response对象的方法
- android HTTP post方法时,如何使用cookies
- Linux上的文件管理类命令都有哪些,其常用的使用方法及其相关示例演示(待补全)
- (android 源码下开发应用程序) 如何在 Android 各 level ( 包含 user space 與 kernel space ) 使用dump call stack的方法
- 你最喜欢的图片替换方法是什么,你如何选择使用?
- 如何在低api中使用View的属性设置方法如setAlpha等
- iOS代码注释方法大全,图文介绍如何使用快速注释
- H.264 无参考视频质量评价方法 (使用了基于遗传编程方法的符号回归)
- [原]CentOS 6.2(32位)如何使用YUM源更新的方法
- dede使用方法---如何调用指定栏目
- 【Java面试题】17 如何把一个逗号分隔的字符串转换为数组? 关于String类中split方法的使用,超级详细!!!