How to project a new point onto the new PCA basis?

Evgheny on 9 Nov 2012
Answered: Jun Ming Soh on 8 Mar 2022
For example, I have 9 variables and 362 cases. I ran PCA and found that the first 3 principal components are enough for my purposes.
Now I have a new point in the original 9-dimensional space, and I want to project it onto the principal component coordinate system. How do I get its new coordinates?
% here is data (362x9)
load SomeData
[W, Y] = pca(data, 'VariableWeights', 'variance', 'Centered', true);
% orthonormal coefficient matrix
W = diag(std(data))\W;
% Getting mean and weights of data (for future data)
[data, mu, sigma] = zscore(data);
sigma(sigma==0) = 1;
% New point in original 9dim system
% For example, it is the first point of our input data
x = data(1,:);
x = bsxfun(@minus,x, mu);
x = bsxfun(@rdivide, x, sigma);
% New coordinates as principal components
y0 = Y(1,:); % point we should get in result
y = (W*x')'; % our result
% error
sum(abs(y0 - y)) % 142 => they are not the same point
% plot
figure()
plot(y0,'g'); hold on;
plot(y,'r');
How do I get the coordinates of a new point projected onto the principal component basis?

Answers (3)

Wei Wang on 9 Nov 2012
When you specify variable weights, the coefficient matrix (W in your code) is not orthonormal, but the reconstruction rule is still Xcentered = score*coeff'. To get the score, you therefore have to compute Xcentered/coeff' instead of Xcentered*coeff.
See the example below:
load hald;
data = ingredients;
% Weight and Mean:
wt = 1./var(data);
mu = mean(data);
% PCA, W is coefficient and Y is the score
[W,Y]=pca(data,'VariableWeights',wt,'centered',true);
% First observation of the centered data and its score
x1 = data(1,:)-mu;
y1 = Y(1,:)
% According to the reconstruction rule, we should have x1=y1*W'
% therefore, y1 = x1/W'
y = x1/W'
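An equivalent route, sketched here under the assumption that the 'variance' weighting from the question is used, is to convert the weighted coefficients to an orthonormal basis and multiply the standardized point by it. The names coefforth, xnew, ynew and k are made up for this illustration, and the first row of the hald data stands in for a "new" point so the result can be checked against the stored score.

```matlab
load hald;                          % sample data shipped with the Statistics Toolbox
data = ingredients;                 % 13 observations x 4 variables

% Keep the training mean and standard deviation for future points
mu    = mean(data);
sigma = std(data);

% Weighted PCA; wcoeff is not orthonormal when variable weights are used
[wcoeff, score] = pca(data, 'VariableWeights', 'variance');

% Undo the weighting to recover an orthonormal basis
coefforth = diag(sigma) \ wcoeff;

% Stand-in for a new point: standardize the raw values exactly once
xnew = data(1, :);
k    = 3;                           % number of components to keep
ynew = ((xnew - mu) ./ sigma) * coefforth(:, 1:k);

% Should be at rounding-error level if the projection matches pca
err = max(abs(ynew - score(1, 1:k)))
```

Note that the point is standardized once, from its raw values, and is multiplied on the right by the coefficient matrix. Standardizing data that is already z-scored, or multiplying as W*x', gives a different result.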

Jun Ming Soh on 8 Mar 2022
Given that X is an n observations x p variables matrix, calling pca with 'Economy' set to false gives size(coeff) = p x p, size(score) = n x p, and size(percentage_variance) = p x 1. The sixth output, mu, is the 1 x p vector of column means that pca subtracts from X by default.
[coeff, score, ~, ~, percentage_variance, mu] = pca(X, 'Economy', false);
For a new data point of size 1 x p, centered with the same mu, the corresponding principal component scores are
score_new = (new_datapoint - mu) * coeff;
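Because pca centers the data by default, a new point has to be shifted by the training mean before it is multiplied by coeff. The sketch below checks this on synthetic data. The random matrix X and the names x_new, score_wrong, err and err_uncentered are assumptions made up for the illustration. A training row is reused as the "new" point so the stored score is available for comparison.

```matlab
rng(0);                                   % reproducible synthetic data
X = randn(362, 9) * randn(9, 9);          % hypothetical 362 x 9 data set

% The sixth output of pca is the estimated mean of each variable
[coeff, score, ~, ~, ~, mu] = pca(X, 'Economy', false);

x_new = X(1, :);                          % stand-in for a new 1 x 9 point

score_new   = (x_new - mu) * coeff;       % centered projection
score_wrong = x_new * coeff;              % projection without centering

% The first should be at rounding-error level; the second generally is not
err            = max(abs(score_new   - score(1, :)))
err_uncentered = max(abs(score_wrong - score(1, :)))
```

If pca was called with 'Centered' set to false, mu is returned as zeros and the two projections coincide.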

Nikos Mp on 31 Aug 2017
To project the old data onto PC3, should we project it onto PC1, then PC2, then PC3? Or can we do it in one step with data(:,1:3)*coeffs(:,1:3)?
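As far as the matrix algebra goes, each component score is just the centered data times one column of the coefficient matrix, so no sequential projection should be needed. Note also that data(:,1:3) selects the first three variables, not the first three components. A minimal sketch, assuming unweighted pca with default centering and made-up random data (the names centered, pc3_direct and first3 are hypothetical):

```matlab
rng(1);                                   % reproducible synthetic data
data = randn(50, 9);                      % hypothetical 50 x 9 data set

[coeffs, score, ~, ~, ~, mu] = pca(data);

% Implicit expansion needs R2016b or later; use bsxfun(@minus, data, mu) before that
centered = data - mu;

% Third component on its own: all 9 variables times the third column
pc3_direct = centered * coeffs(:, 3);

% First three components in a single product
first3 = centered * coeffs(:, 1:3);

% Both differences should be at rounding-error level
d_pc3    = max(abs(pc3_direct - score(:, 3)))
d_first3 = max(max(abs(first3 - score(:, 1:3))))
```

The columns of score can be computed in any order or all at once, since each one depends only on its own column of coeffs.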
