PCA in MATLAB to reduce dimensionality

I just want a simple PCA to reduce the dimensionality of my data from, say, 400 * 5000 to 400 * 4,
meaning reduce the number of variables from 5000 to 4.
I am not sure where I can set the number of dimensions to keep.
coeff = pca(X)
I am trying to follow this example:
load hald
The ingredients dataset is 13 * 4.
coeff = pca(ingredients)
Output:
coeff = 4×4
-0.0678 -0.6460 0.5673 0.5062
-0.6785 -0.0200 -0.5440 0.4933
0.0290 0.7553 0.4036 0.5156
0.7309 -0.1085 -0.4684 0.4844
I am wondering whether I can change it to give an output of 13 * 2.

6 Comments

coeff gives you the principal component vectors as columns. score maps your data onto these, giving you your data re-aligned onto the principal component axes rather than the original variable axes (x, y, z, etc.).
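To make the relationship between coeff and score concrete, here is a minimal sketch on the hald data. It assumes the Statistics and Machine Learning Toolbox and the default pca settings (columns are centered); centered, manualScore and maxDiff are illustrative names only.

```matlab
load hald                                   % ships with the Statistics and Machine Learning Toolbox
[coeff, score, ~, ~, ~, mu] = pca(ingredients);

% By default pca subtracts the column means, so the scores are the
% centered data projected onto the principal component vectors.
centered    = ingredients - mu;             % 13x4 (implicit expansion, R2016b or later)
manualScore = centered * coeff;             % 13x4

% Largest absolute difference between the manual projection and score
maxDiff = max(abs(manualScore(:) - score(:)));

disp(size(coeff))                           % coeff is variables x components
disp(size(score))                           % score is observations x components
disp(maxDiff)
```

So coeff describes the new axes, and score holds the coordinates of each observation along those axes.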
Thanks!
Let's say I have a dataset that is a 13 * 8 matrix.
coeff is an 8 * 8 matrix
score is a 13 * 8 matrix
Makes sense.
So let's say I want the final output to be a 13 * 2 matrix only, or even a 13 * 6 matrix.
How can I do that?
Just do as Elysi Cochin's answer shows and index into the scores. These are ordered from the 1st principal component onwards, so just throw away those you don't want.
Thanks.
I did that. However, it seems to throw away the columns I do not want. Does that mean I lose some information by throwing them away?
For example:
load hald
[coeff, score] = pca(ingredients);
reducedDimension = score(:,1:3);
The result in score is a 13 * 4 matrix.
The result in reducedDimension is a 13 * 3 matrix.
It looks like the 4th column is thrown away. Is that what dimension reduction using PCA means?
It looks like throwing away the 4th column will lose some information?
Adam on 20 Feb 2019
Edited: Adam on 20 Feb 2019
Dimension reduction is 'throwing some information away'. It isn't magic, unfortunately. Unless you have perfectly correlated redundant variables, if you have 8 variables and you want to reduce down to 3 dimensions then you will obviously lose some information.
Of course, doing it without PCA you would lose a huge amount of information if you just chopped off 5 variables.
Because you have used PCA, though, you are throwing away the dimensions that contain the least information about the data.
Looking at the explained output from pca will help you see what you are throwing away. This is the percentage of the total variance in the data captured by each principal component. You will usually see a large number (between 0 and 100, e.g. 80) for the first, then progressively smaller numbers. Unless your data is very random you will often find that after the first few principal components the values in the explained vector are < 1 (i.e. that dimension holds less than 1% of the variance, so that is all you lose if you throw that dimension away).
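As a rough sketch of how the explained output could guide the choice, the snippet below accumulates the percentages and picks the smallest number of components that reaches a target. The 95% threshold and the names cumExplained, threshold and numKeep are assumptions for illustration, not a recommendation.

```matlab
load hald
[~, score, ~, ~, explained] = pca(ingredients);

% explained holds the percentage of total variance per component,
% sorted from largest to smallest; it sums to 100.
cumExplained = cumsum(explained);

threshold = 95;                              % hypothetical target, in percent
numKeep   = find(cumExplained >= threshold, 1);

reduced = score(:, 1:numKeep);               % 13 x numKeep

disp(explained')
disp(numKeep)
```

Whatever threshold you pick, 100 minus the cumulative value at numKeep is the share of variance you are discarding.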
Thanks for your reply.
Yes, I checked the PCA output and you are correct: usually a large number for the first row and progressively smaller numbers after.
Thanks once again.
Do you have any idea how we can use Linear Discriminant Analysis (LDA), aka Fisher Discriminant Analysis (FDA), in MATLAB? It does not seem to have this function.


 Accepted Answer

[coeff, score] = pca(ingredients);
requiredResult = score(:,1:2);
or if you want to change coeff to a 13 x 2 matrix, you'll have to use the reshape function, but to use reshape your variable coeff must have at least 13 x 2 elements (exactly 26, in fact, and the 4 x 4 coeff here has only 16)
or you can use repmat, which repeats copies of the array coeff (this adds no new information)
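For the 400 * 5000 case in the question, another option is to ask pca for only the leading components through its 'NumComponents' name-value argument. A minimal sketch, assuming the Statistics and Machine Learning Toolbox; X is random placeholder data standing in for the real matrix, and k is an illustrative name.

```matlab
rng(0);                                      % fixed seed so the sketch is repeatable
X = rand(400, 5000);                         % placeholder: 400 observations x 5000 variables
k = 4;                                       % number of dimensions to keep

% Rows of X are observations and columns are variables.
% 'NumComponents' limits coeff and score to the first k components.
[coeff, score] = pca(X, 'NumComponents', k);

disp(size(coeff))                            % 5000 x 4
disp(size(score))                            % 400 x 4
```

Here score is the reduced 400 * 4 data. It gives the same result as computing all components and then taking score(:,1:k).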

2 Comments

Thanks!
Do you mind explaining what the difference is between "coeff" and "score"?
I did read the documentation, but was unable to understand it.
load hald
[coeff, score] = pca(ingredients);
requiredResultscore = score(:,1:3);
requiredResultcoeff = coeff(:,1:3);
The original "ingredients" is a 13 * 4 matrix
coeff is a 4 * 4 matrix
score is a 13 * 4 matrix
requiredResultscore is a 13 * 3 matrix
requiredResultcoeff is a 4 * 3 matrix
The original dataset, 'ingredients', is a 13 * 4 matrix:
>> ingredients
ingredients =
7 26 6 60
1 29 15 52
11 56 8 20
11 31 8 47
7 52 6 33
11 55 9 22
3 71 17 6
1 31 22 44
2 54 18 22
21 47 4 26
1 40 23 34
11 66 9 12
10 68 8 12
After PCA:
load hald
coeff = pca(ingredients)
The output coeff is a 4 * 4 matrix.
>> coeff
coeff =
-0.0678 -0.6460 0.5673 0.5062
-0.6785 -0.0200 -0.5440 0.4933
0.0290 0.7553 0.4036 0.5156
0.7309 -0.1085 -0.4684 0.4844
I am wondering how I can get a 13 * 2 matrix as output.
In your answer you say "to use reshape your variable coeff must have at least 13 x 2 elements". How can I get at least 13 * 2 elements?
Thanks
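For reference, a 13 * 2 output comes from the columns of score rather than from coeff. The sketch below, which assumes the Statistics and Machine Learning Toolbox and default centering, also maps the reduced data back to the original 4 variables to show how much is lost; reduced, approx and relErr are illustrative names.

```matlab
load hald
[coeff, score, ~, ~, ~, mu] = pca(ingredients);

k = 2;
reduced = score(:, 1:k);                     % 13x2: the dimension-reduced data

% Map the reduced data back to the original variable space.
% Only the first k component vectors are used, so this is an approximation.
approx = reduced * coeff(:, 1:k)' + mu;      % 13x4

% Relative size of what was discarded, measured on the centered data
relErr = norm(ingredients - approx, 'fro') / norm(ingredients - mu, 'fro');

disp(size(reduced))
disp(relErr)
```

coeff stays 4 rows tall because it has one row per original variable; only score has one row per observation.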

Asked: 19 Feb 2019

Commented: 21 Feb 2019
