Correlation and regression between matrices with NaN values
I want to calculate the regression and correlation coefficients between two matrices (temperature and sea level pressure) that have the same dimensions, 241 x 81, but contain some NaN values.
The final goal is a two-dimensional matrix that I can plot (see attached image), i.e. for every point on the map I have a value for my correlation and regression coefficients.
Thank you a lot!
MarKf on 5 Sep 2023
I see, "array1" has some islands of values in a sea of NaNs.
ar1 = load(websave('rd', "https://nl.mathworks.com/matlabcentral/answers/uploaded_files/1473551/array1.mat"));
ar2 = load(websave('rd', "https://nl.mathworks.com/matlabcentral/answers/uploaded_files/1473556/array2.mat"));
a1 = ar1.array3; a2 = ar2.d;
ar1_0s = a1; ar1_0s(isnan(ar1_0s)) = 0;  % replace NaNs with 0 for display
imagesc(ar1_0s*10^2 + a2);               % here to visualize what I mean
So you have only sum(sum(~isnan(a1))) = 1719 non-NaN values to correlate. You cannot make a map of correlations at the 2D locations of those islands, as I mention in the comment above, unless you have a pair of vectors to correlate for each of those locations. I just thought that you could also use normxcorr2, but again that's probably not what you want given that these are geo/meteorological data.
You could still correlate the values over all the locations that you have, that is a1(:) and a2(:) (converting each input into its vector representation); corrcoef does that conversion automatically:
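To illustrate what "a pair of vectors for each location" would make possible, here is a minimal sketch that builds a correlation map and a regression-slope map from 3-D stacks. Everything in it is an assumption for illustration: the posted files contain a single 2-D field each, so the time dimension nt, the synthetic fields T and P, and the choice to regress T on P are all hypothetical.

```matlab
rng(3);                                         % reproducible synthetic data
nt = 50;                                        % hypothetical number of time steps
P  = 1e5 + 500*randn(241, 81, nt);              % hypothetical pressure series
T  = 15 + 2e-3*(P - 1e5) + randn(241, 81, nt);  % hypothetical temperature series
T(repmat(rand(241, 81) > 0.1, 1, 1, nt)) = NaN; % sparse islands, same mask every step

% Ignore any time step where either series is missing.
bad = isnan(T) | isnan(P);
T(bad) = NaN;  P(bad) = NaN;

% Anomalies about the time mean at each grid point.
Ta = T - mean(T, 3, 'omitnan');
Pa = P - mean(P, 3, 'omitnan');

% Sums of products along the time dimension.
Sxy = sum(Ta .* Pa, 3, 'omitnan');
Sxx = sum(Pa.^2,    3, 'omitnan');
Syy = sum(Ta.^2,    3, 'omitnan');

rmap = Sxy ./ sqrt(Sxx .* Syy);   % 241 x 81 map of correlation coefficients
bmap = Sxy ./ Sxx;                % 241 x 81 map of regression slopes (T on P)
```

Grid points with no data come out as NaN (0/0), so imagesc(rmap) or pcolor would show the islands only.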
To get rho = 0.5341
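A minimal sketch of this kind of NaN-aware overall correlation, run on synthetic stand-ins rather than the posted files, so it will not reproduce the value above. The synthetic fields are assumptions; with the real data, a1 and a2 would come from the load calls shown earlier.

```matlab
rng(0);                                  % reproducible synthetic data
a2 = 1e5 + 500*randn(241, 81);           % hypothetical pressure-like field
a1 = 1e-3*(a2 - 1e5) + randn(241, 81);   % hypothetical field correlated with a2
a1(rand(241, 81) > 0.1) = NaN;           % keep only sparse islands of values

% Option 1: keep only positions where both arrays are finite.
ok  = isfinite(a1) & isfinite(a2);
R   = corrcoef(a1(ok), a2(ok));          % 2 x 2 correlation matrix
rho = R(1, 2);                           % overall correlation coefficient

% Option 2: let corrcoef drop the pairs that contain a NaN.
R2   = corrcoef(a1(:), a2(:), 'Rows', 'complete');
rho2 = R2(1, 2);
```

Without the mask or the 'Rows' option, a single NaN in either input makes the off-diagonal entries of the result NaN.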
dpb on 5 Sep 2023
Edited: dpb on 5 Sep 2023
"...regression and correlation coefficent between two matrixes (temperature and sea level pressure), ... to have a .... for every point in the map .. value for my correlations and regression coefficents"
whos -file array1
whos -file array2
Pretty meaningless variable names; one presumes the one of order 10E5 must be P and the other, by elimination, T?
However, for each point in the 2D array there is only one value each for T, P, so there is no "regression" or "correlation" of the two on a pointwise basis. You can look at the overall correlation between the two variables, but there's nothing to regress against or compare pointwise.
Computing the correlation coefficient over only the locations that are finite in both arrays gives the overall correlation between the two; that's about all there is to be gained from these data in that regard.
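For the regression half of the question, the same finite-in-both mask can feed a single overall straight-line fit. This is a sketch under stated assumptions: T is taken as the response and P as the predictor (the thread does not say which way round), and the fields are synthetic with a known slope of 2e-3.

```matlab
rng(1);                                        % reproducible synthetic data
P = 1e5 + 500*randn(241, 81);                  % hypothetical pressure field
T = 15 + 2e-3*(P - 1e5) + 0.1*randn(241, 81);  % hypothetical temperature field
T(rand(241, 81) > 0.1) = NaN;                  % sparse islands of values

ok = isfinite(T) & isfinite(P);                % positions valid in both arrays

% The three-output form of polyfit centers and scales x, which avoids
% conditioning warnings for predictor values near 1e5.
[p, ~, mu] = polyfit(P(ok), T(ok), 1);

% Convert back to original units: T = slope*P + intercept.
slope     = p(1) / mu(2);
intercept = p(2) - p(1)*mu(1)/mu(2);
```

This gives one slope and one intercept for the whole field, not a value per grid point.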
What might be interesting would be
Indeed...there are some pretty clear correlations amongst given sets of data; the various columns are heavily correlated in having a definite set of trends but it is the relationship from one observation to another that is correlated, not that the two variables are highly (linearly) correlated.
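A sketch of the kind of column-wise plot that shows such traces, assuming the figure plotted one variable against the other with one line per column. The plotting command from the post is not shown, so the axis assignment and the synthetic data here are assumptions.

```matlab
rng(4);                                   % reproducible synthetic data
P = 1e5 + cumsum(randn(241, 81), 1);      % hypothetical smooth-ish pressure columns
T = 1e-2*(P - 1e5) + (1:81);              % hypothetical T: same trend, offset per column
T(:, 2:2:end) = NaN;                      % leave every other column empty

% With matrix inputs, plot draws one line per column pair;
% all-NaN columns still create a line object but show nothing.
h = plot(P, T, '.-');
xlabel('P (hypothetical units)');
ylabel('T (hypothetical units)');
```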
Wonder how many columns contain at least one observation...
So, 41 out of the 81 columns have at least one observation so there are 41 separate traces above...
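One way to make that count, shown on a small made-up matrix (the variable A and its values are illustrative only); with the real data, A would be the array containing the NaNs.

```matlab
% Small illustrative matrix: columns 1, 3 and 4 hold at least one value.
A = [ 1   NaN NaN  4   NaN NaN
      NaN NaN  3   NaN NaN NaN
      2   NaN NaN  5   NaN NaN ];

hasData = any(isfinite(A), 1);   % 1 x 6 logical, true where a column has data
nCols   = nnz(hasData);          % number of columns with at least one observation
nVals   = nnz(isfinite(A));      % total number of finite values
```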
What this means, I dunno, but it is pretty interesting -- and it indicates that the overall correlation coefficient doesn't really tell you much and is probably of no practical value.