Correlation computation using a window of 3

Question

Hello

I have two columns, x and y:

x = 5, 8, 9, 4, 9, 6, 0 ,7 ,8 , 5, 4

y = 6, 4, 8, 7, 3, 7, 8 ,7 ,6 , 4, 7

I want to compute the correlation over consecutive windows of size 3.

For instance, the first window is the correlation of x = 5, 8, 9 and y = 6, 4, 8.

If the last window has fewer than 3 values, the correlation should be computed from the values that are present. In this case, that is the correlation of x = 5, 4 and y = 4, 7.

This should give a new column with 4 rows.

I need a single value for each correlation, but the corrcoef function is giving me matrices.

Thanks for your help in advance.

Answer 1

Dana on 17 Aug 2020

I don't entirely understand what you're trying to do, but you may want to use corr(a,b) instead of corrcoef(a,b) or corrcoef(C).

For a matrix C, corrcoef(C) returns a correlation matrix, i.e., a matrix whose (i,j) element is the correlation coefficient between the i-th and j-th columns of C. For column vectors a and b, the syntax corrcoef(a,b) is the same thing as corrcoef([a,b]) (i.e., MATLAB just puts the two vectors together into a single matrix, and then finds the correlation matrix).

On the other hand, corr(a,b) simply returns the correlation coefficient between the vectors a and b. Note, however, that corr([a,b]) = corrcoef(a,b) = corrcoef([a,b]), i.e., that syntax will also return the correlation matrix. So if you just want the one correlation coefficient, you need to use corr(a,b).
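
If corr is not available (it ships with the Statistics and Machine Learning Toolbox, whereas corrcoef is part of base MATLAB), the single coefficient can also be read off the matrix that corrcoef returns. A minimal sketch, using the first window from the question as example data:

```matlab
% Sketch: pull the single correlation coefficient out of corrcoef's 2x2 matrix.
a = [5; 8; 9];       % first window of x from the question
b = [6; 4; 8];       % first window of y from the question
R = corrcoef(a, b);  % 2x2 matrix: ones on the diagonal, r off the diagonal
r = R(1, 2);         % scalar correlation between a and b (about 0.2402)
```

R(1,2) and R(2,1) hold the same value, so either index works.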

Dana on 17 Aug 2020

Edited: Dana on 17 Aug 2020

I see now what you're trying to do. There are any number of different approaches you could take. Here's one:

x = [5, 8, 9, 4, 9, 6, 0 ,7 ,8 , 5, 4];
y = [6, 4, 8, 7, 3, 7, 8 ,7 ,6 , 4, 7];
winsz = 3;  % window size
xy = [x;y]; % combine data
nxy = size(xy,2);  % number of observations
ngr = ceil(nxy/winsz); % number of groups of size winsz
pdsz = ngr*winsz; % we will pad the data with extra elements so that the 
                                % total # of elements is evenly divisible by winsz; 
                                % pdsz is the size of the padded array
xy(:,nxy+1:pdsz) = NaN;    % pad to desired size with NaN
xy = reshape(xy,2,winsz,ngr); % reshape into a 3-D array, where 1st and 2nd row correspond
                                % to x and y, columns to winsz observations, and the 3rd 
                                % dimension to different groupings of size winsz
                                
% dv is a 1x1xngr array whose j-th element will be the number of observations in the j-th
% group; this will be equal to winsz in all but the last group
dv = winsz*ones(1,1,ngr);   
dv(ngr) = winsz-(pdsz-nxy);
xymeans = sum(xy,2,'omitnan')./dv;      % compute means of x and y for each group
xyc = xy - xymeans;                     % de-mean the observations
xystds = sqrt(sum(xyc.^2,2,'omitnan')./dv); % compute s.d.'s of x and y for each group
xycovs = sum(prod(xyc,1,'omitnan'),2)./dv; % compute covariances of x and y for each group
xycorr = reshape(xycovs./prod(xystds,1),1,ngr); % get correlation coefficients, and then
                                            % reshape 3-D result to a row vector
                                            

EDIT to say: the above uses the sample mean from each group of 3 as the mean estimate for that group. This is what would be done if you just ran a loop and called the corr function for each grouping of 3. You could substitute some other mean estimate if you wanted, though, e.g., use the same mean from the entire vectors x and y for each group. To do that, you'd instead use xyc = xy-mean([x;y],2).

Also, in hindsight, it's probably an easier option to just run a loop here. That would be noticeably slower for large arrays, but in this case it won't make an appreciable difference. So:

x = [5, 8, 9, 4, 9, 6, 0 ,7 ,8 , 5, 4].';
y = [6, 4, 8, 7, 3, 7, 8 ,7 ,6 , 4, 7].';
winsz = 3;  % window size
nxy = numel(x);  % number of observations
ngr = ceil(nxy/winsz); % number of groups of size winsz
xycorr = zeros(ngr,1);
for j = 1:ngr
    indsj = ((j-1)*winsz+1:min(j*winsz,nxy)).';
    xycorr(j) = corr(x(indsj),y(indsj));
end

As a last note, this loop method delivers the same answer as the vectorized method above, except in the last group. That last group has only two observations, and in that scenario you need to be more careful in calculating the correlation coefficient. In particular, +/- 1 are the only possible correlations when you have only two observations, and the vectorized method won't give you that answer: as far as I can tell, prod(...,'omitnan') returns 1 for the NaN-padded column, which adds a spurious term to the covariance sum.
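
To see why only +/- 1 is possible with two observations, here is a short derivation from the usual definition of the sample correlation coefficient (the symbols x_1, x_2, y_1, y_2 are just labels for the two points in the group):

```latex
r = \frac{\sum_{i=1}^{2} (x_i-\bar{x})(y_i-\bar{y})}
         {\sqrt{\sum_{i=1}^{2} (x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{2} (y_i-\bar{y})^2}},
\qquad
x_1-\bar{x} = \tfrac{1}{2}(x_1-x_2) = -(x_2-\bar{x})

\Rightarrow\quad
r = \frac{\tfrac{1}{2}(x_1-x_2)(y_1-y_2)}{\tfrac{1}{2}\,|x_1-x_2|\,|y_1-y_2|}
  = \operatorname{sign}\bigl((x_1-x_2)(y_1-y_2)\bigr)
```

For the last group here, x = 5, 4 and y = 4, 7, so (5-4)(4-7) = -3 and r = -1. If x_1 = x_2 or y_1 = y_2, the denominator is zero and the coefficient is undefined.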

Furthermore, if you were to apply either of these methods in a situation where the last group has only 1 observation, it's not going to work at all.
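
One way to handle that case is to skip any group with fewer than two observations and leave NaN in its place. The sketch below is a variant of the loop above, not part of the original method: it drops the last observation from the question's data so that the final group has a single point (an illustrative assumption), and it uses corrcoef so that it runs in base MATLAB.

```matlab
% Sketch: windowed correlation that returns NaN for windows with < 2 points.
% Data is the question's x and y with the last value removed (illustrative),
% so the final window contains a single observation.
x = [5, 8, 9, 4, 9, 6, 0, 7, 8, 5].';
y = [6, 4, 8, 7, 3, 7, 8, 7, 6, 4].';
winsz = 3;                 % window size
nxy = numel(x);            % number of observations
ngr = ceil(nxy/winsz);     % number of windows, including a short last one
xycorr = NaN(ngr, 1);      % NaN marks windows where r is not computed
for j = 1:ngr
    indsj = (j-1)*winsz+1 : min(j*winsz, nxy);
    if numel(indsj) >= 2   % need at least two points for a correlation
        R = corrcoef(x(indsj), y(indsj));
        xycorr(j) = R(1, 2);
    end
end
```

Whether NaN is the right placeholder depends on what you do with the result afterwards; dropping the short group entirely would be another reasonable choice.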

Tino on 18 Aug 2020

Thank you very much, I am really grateful.
