How to perform principal component analysis

Hi,
I have 10 variables, and the pairwise correlation between them is very poor. I want to use PCA to look at the correlation after grouping the variables by similar behaviour (similar R-square or similar correlation coefficient). Could someone please help?
My input data (each column represents a variable: column 1 --> Variable 1, column 2 --> Variable 2, ..., column 10 --> Variable 10; each variable has 25 observations):
0.74 0.83 0.85 0.63 0.15 0.62 0.56 0.18 0.46 0.53
0.39 0.77 0.56 0.66 0.19 0.57 0.85 0.21 0.10 0.73
0.68 0.17 0.93 0.73 0.04 0.05 0.35 0.91 1.00 0.71
0.70 0.86 0.70 0.89 0.64 0.93 0.45 0.68 0.33 0.78
0.44 0.99 0.58 0.98 0.28 0.73 0.05 0.47 0.30 0.29
0.02 0.51 0.82 0.77 0.54 0.74 0.18 0.91 0.06 0.69
0.33 0.88 0.88 0.58 0.70 0.06 0.66 0.10 0.30 0.56
0.42 0.59 0.99 0.93 0.50 0.86 0.33 0.75 0.05 0.40
0.27 0.15 0.00 0.58 0.54 0.93 0.90 0.74 0.51 0.06
0.20 0.20 0.87 0.02 0.45 0.98 0.12 0.56 0.76 0.78
0.82 0.41 0.61 0.12 0.12 0.86 0.99 0.18 0.63 0.34
0.43 0.75 0.99 0.86 0.49 0.79 0.54 0.60 0.09 0.61
0.89 0.83 0.53 0.48 0.85 0.51 0.71 0.30 0.08 0.74
0.39 0.79 0.48 0.84 0.87 0.18 1.00 0.13 0.78 0.10
0.77 0.32 0.80 0.21 0.27 0.40 0.29 0.21 0.91 0.13
0.40 0.53 0.23 0.55 0.21 0.13 0.41 0.89 0.53 0.55
0.81 0.09 0.50 0.63 0.56 0.03 0.46 0.07 0.11 0.49
0.76 0.11 0.90 0.03 0.64 0.94 0.76 0.24 0.83 0.89
0.38 0.14 0.57 0.61 0.42 0.30 0.82 0.05 0.34 0.80
0.22 0.68 0.85 0.36 0.21 0.30 0.10 0.44 0.29 0.73
0.79 0.50 0.74 0.05 0.95 0.33 0.18 0.01 0.75 0.05
0.95 0.19 0.59 0.49 0.08 0.47 0.36 0.90 0.01 0.07
0.33 0.50 0.25 0.19 0.11 0.65 0.06 0.20 0.05 0.09
0.67 0.15 0.67 0.12 0.14 0.03 0.52 0.09 0.67 0.80
0.44 0.05 0.08 0.21 0.17 0.84 0.34 0.31 0.60 0.94
Many thanks in advance.

 Accepted Answer

If you have the Statistics and Machine Learning Toolbox, you can use the pca function.

More Answers (1)

See plotmatrix(). It is part of base MATLAB; the grouped variant, gplotmatrix(), is in the Statistics and Machine Learning Toolbox.
To "see the correlation":
plotmatrix(yourMatrix);
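To put numbers next to the scatter plots, here is a minimal sketch using corrcoef from base MATLAB. The matrix X is random stand-in data with the same shape as the question's data (25 observations, 10 variables), not the actual values from the post.

```matlab
% Stand-in data: 25 observations (rows) of 10 variables (columns).
rng(0);                          % fixed seed so the run is repeatable
X = rand(25, 10);

R  = corrcoef(X);                % 10-by-10 matrix of pairwise correlation coefficients
R2 = R.^2;                       % squared correlations (pairwise R-square)

% Find the most strongly correlated pair, ignoring the diagonal (always 1).
A = abs(R);
A(logical(eye(size(A)))) = 0;
[maxAbsR, idx] = max(A(:));
[row, col] = ind2sub(size(A), idx);
fprintf('Strongest pair: variables %d and %d, r = %.2f\n', row, col, R(row, col));
```

Replace X with your own matrix. If every off-diagonal entry of R2 is small, the scatter plots from plotmatrix will look like shapeless clouds as well.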

5 Comments

Hi,
I can compute the R-square matrix, but the correlation between any two individual variables is very poor. My idea is as follows: because the individual variables are poorly correlated, I expect that by grouping variables, the correlation between the groups may be worthwhile. Can this be done using PCA or any other tool? For example: variables 1, 3 & 4 form group 1; variables 2 & 6 form group 2; variables 5 & 9 form group 3; variables 7, 8 & 10 form group 4; then look at the correlation between groups 1 through 4.
You can do
m = [...
0.74 0.83 0.85 0.63 0.15 0.62 0.56 0.18 0.46 0.53
0.39 0.77 0.56 0.66 0.19 0.57 0.85 0.21 0.10 0.73
0.68 0.17 0.93 0.73 0.04 0.05 0.35 0.91 1.00 0.71
0.70 0.86 0.70 0.89 0.64 0.93 0.45 0.68 0.33 0.78
0.44 0.99 0.58 0.98 0.28 0.73 0.05 0.47 0.30 0.29
0.02 0.51 0.82 0.77 0.54 0.74 0.18 0.91 0.06 0.69
0.33 0.88 0.88 0.58 0.70 0.06 0.66 0.10 0.30 0.56
0.42 0.59 0.99 0.93 0.50 0.86 0.33 0.75 0.05 0.40
0.27 0.15 0.00 0.58 0.54 0.93 0.90 0.74 0.51 0.06
0.20 0.20 0.87 0.02 0.45 0.98 0.12 0.56 0.76 0.78
0.82 0.41 0.61 0.12 0.12 0.86 0.99 0.18 0.63 0.34
0.43 0.75 0.99 0.86 0.49 0.79 0.54 0.60 0.09 0.61
0.89 0.83 0.53 0.48 0.85 0.51 0.71 0.30 0.08 0.74
0.39 0.79 0.48 0.84 0.87 0.18 1.00 0.13 0.78 0.10
0.77 0.32 0.80 0.21 0.27 0.40 0.29 0.21 0.91 0.13
0.40 0.53 0.23 0.55 0.21 0.13 0.41 0.89 0.53 0.55
0.81 0.09 0.50 0.63 0.56 0.03 0.46 0.07 0.11 0.49
0.76 0.11 0.90 0.03 0.64 0.94 0.76 0.24 0.83 0.89
0.38 0.14 0.57 0.61 0.42 0.30 0.82 0.05 0.34 0.80
0.22 0.68 0.85 0.36 0.21 0.30 0.10 0.44 0.29 0.73
0.79 0.50 0.74 0.05 0.95 0.33 0.18 0.01 0.75 0.05
0.95 0.19 0.59 0.49 0.08 0.47 0.36 0.90 0.01 0.07
0.33 0.50 0.25 0.19 0.11 0.65 0.06 0.20 0.05 0.09
0.67 0.15 0.67 0.12 0.14 0.03 0.52 0.09 0.67 0.80
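0.44 0.05 0.08 0.21 0.17 0.84 0.34 0.31 0.60 0.94 ];
% plotmatrix(m)   % uncomment to see scatter plots of every pair of columns
% Rows are observations, columns are variables. pca centres the columns by default.
% coeff: loadings, score: data in component space, latent: component variances,
% explained: percentage of total variance per component, mu: column means.
[coeff,score,latent,tsquared,explained,mu] = pca(m);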
I didn't group columns together. You can concatenate columns to do your groupings.
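One possible way to do the grouping, sketched below, is to reduce each group of columns to a single summary column and then correlate the summaries. This is an assumption about what "grouping" could mean here, not necessarily what was intended above. The group memberships are the example from the question, X is random stand-in data, and the summary is the first principal component score of each standardized group (zscore and pca need the Statistics and Machine Learning Toolbox).

```matlab
% Stand-in data: 25 observations (rows) of 10 variables (columns).
rng(0);
X = rand(25, 10);

% Example groups taken from the question; change these to suit your data.
groups  = {[1 3 4], [2 6], [5 9], [7 8 10]};
nGroups = numel(groups);

% One summary column per group: the first principal component score.
G = zeros(size(X, 1), nGroups);
for k = 1:nGroups
    Z = zscore(X(:, groups{k}));   % standardize so no column dominates
    [~, score] = pca(Z);           % columns of score are ordered by variance
    G(:, k) = score(:, 1);
end

Rg = corrcoef(G);                  % 4-by-4 correlation between the group summaries
disp(Rg);
```

Because every group collapses to one column, groups of different sizes can be compared directly. The sign of a principal component is arbitrary, so only the magnitude of each entry in Rg is meaningful here.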
Sir,
I understand, but PCA should have an option so that PCA itself performs this grouping automatically.
I don't know what that means. There is no question mark, so is that a question? What do you mean by automatically as opposed to manually in this situation?
But how can you group different numbers of variables (columns) together? If you do, how can you compare a new column built from 3 grouped columns with another one built from only 2 grouped columns?
Mekala, PCA is a specific technique that has a specific use. It seems like you need a deeper understanding of the technique. It is difficult to teach you all of PCA in this forum.
What PCA "automatically" does is calculate linear combinations of the variables (the principal components) that explain the most variance in the data; no separate target variable is involved. There is no "manual" grouping in the function.
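For readers who want to see what the pca function computes, here is a minimal sketch in base MATLAB that uses the eigendecomposition of the covariance matrix. X is random stand-in data, and the variable names mirror the pca outputs only for readability. The results should match pca up to the sign of each component. The last step, assigning each variable to the component on which it loads most heavily, is only a rough heuristic for spotting variables that vary together, not a feature of pca.

```matlab
% Stand-in data: 25 observations (rows) of 10 variables (columns).
rng(0);
X  = rand(25, 10);

Xc = X - mean(X, 1);                  % centre each column (needs R2016b or later)
C  = cov(X);                          % 10-by-10 covariance matrix
[V, D] = eig(C);                      % eigenvectors and eigenvalues of C

% Sort components from largest to smallest variance.
[latent, order] = sort(diag(D), 'descend');
coeff     = V(:, order);              % loadings: one column per component
score     = Xc * coeff;               % data expressed in component coordinates
explained = 100 * latent / sum(latent);  % percent of total variance per component

% Heuristic: for each variable, which of the first 3 components
% carries its largest absolute loading?
[~, dominantPC] = max(abs(coeff(:, 1:3)), [], 2);
disp([(1:10)' dominantPC]);
```

If the first few entries of explained are not much larger than the rest, the variables do not share much common variation and PCA will not reveal strong groups.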
