How do I calculate a sliding correlation between two time series?

I would like to compute the sliding Pearson's correlation coefficient between two columns of data which represent physiologic measurements over time, at 5-second intervals. For each point in time, I want to calculate the correlation between the two columns over the preceding 5 minutes.

Answers (1)

dpb on 23 Mar 2018
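Presuming no missing observations and a series that begins on a whole minute, a 5-minute window at one sample every 5 seconds holds (5 min * 60 sec/min) / (5 sec/sample) = 60 samples. Given the two signals X and Y, and also assuming they cover an integer number of 5-minute windows (i.e., the length is a multiple of 60):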
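X=reshape(X,60,[]);   % one 5-minute epoch (60 samples) per column
Y=reshape(Y,60,[]);
R=diag(corr(X,Y));    % correlation of each X epoch with the matching Y epoch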
If mod(length(X),60) ~= 0 before reshaping, reshape will error, so either discard the leftover elements at the end or treat that partial section separately.
The above is somewhat wasteful in that corr computes the full matrix of pairwise correlations between every column of X and every column of Y and then discards everything except the diagonal elements, but unless the series are quite long, performance is probably not an issue.
Otherwise, one can of course loop over the columns and correlate each matching pair.
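A minimal sketch of the trimming and pairwise ideas above, using only base MATLAB. The example data, the variable names w, n, Xe, Ye, Xc and Yc, and the choice to discard the trailing partial epoch are assumptions for illustration, not part of the original answer. It also assumes R2016b or later, since the subtraction relies on implicit expansion.

```matlab
% Hypothetical example data: two column vectors sampled every 5 seconds.
rng(1);
X = randn(1000,1);
Y = 0.5*X + randn(1000,1);

w = 60;                          % samples per 5-minute epoch
n = w*floor(numel(X)/w);         % largest multiple of w that fits
Xe = reshape(X(1:n), w, []);     % discard the trailing partial epoch
Ye = reshape(Ye_placeholder(Y, n), w, []);

% Column-wise Pearson r: center each epoch, then normalize the cross product.
Xc = Xe - mean(Xe,1);
Yc = Ye - mean(Ye,1);
R  = (sum(Xc.*Yc,1) ./ sqrt(sum(Xc.^2,1).*sum(Yc.^2,1))).';

function v = Ye_placeholder(Y, n)
% Hypothetical helper, kept only to make the trimming step explicit.
v = Y(1:n);
end
```

This returns one coefficient per complete epoch (16 for the 1000-sample example) and never forms the full matrix of cross-column correlations.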
  3 Comments
dpb on 23 Mar 2018
Edited: dpb on 23 Mar 2018
That's long to look at in a text editor, granted, but not particularly large for data; I don't think compute time will be an issue.
As for missing values (MVs): hmm... that needs some consideration with corr. Using straight logical addressing with isfinite would break the reshape approach above, because the number of valid samples would vary from column to column.
Do you have a preferred way to handle MVs? If you were to infill them, the above would work; how much that affects the result would depend heavily on how prevalent the MVs are and what the signals look like.
The other simple alternative I see at the moment would be to revert to the loop on paired columns that could look something like...
nc=size(X,2);     % number of columns (after reshape())
R=zeros(nc,1);    % preallocate for output
for i=1:nc        % loop over columns (epochs)
  ix=all(isfinite([X(:,i) Y(:,i)]),2);   % rows where both X and Y are finite
  R(i)=corr(X(ix,i),Y(ix,i));
end
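As a possible alternative to the loop, corr in the Statistics and Machine Learning Toolbox accepts a 'Rows','pairwise' option that drops NaN rows separately for each column pair. The sketch below uses hypothetical example data and checks the result against the loop. Note that it only skips NaN (the loop above also skips Inf) and that it still computes the full matrix before taking the diagonal.

```matlab
% Hypothetical example: 5 epochs of 60 samples with a few missing values.
rng(0);
X = randn(60,5);
Y = X + randn(60,5);
X(3,2)  = NaN;
Y(10,4) = NaN;

% NaN rows are omitted per column pair, so each diagonal entry uses
% only the rows where both X(:,i) and Y(:,i) are present.
Rp = diag(corr(X, Y, 'Rows', 'pairwise'));

% Reference: the explicit loop over paired columns.
nc = size(X,2);
Rloop = zeros(nc,1);
for i = 1:nc
  ix = all(isfinite([X(:,i) Y(:,i)]),2);
  Rloop(i) = corr(X(ix,i), Y(ix,i));
end
```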
Michael Wolf on 4 Apr 2018
Thank you. This has been very helpful so far. I selected a segment of data such that each array had a number of elements divisible by 60, and used:
X=reshape(X,60,[]);
Y=reshape(Y,60,[]);
R=diag(corr(X,Y));
It appears that this divides my data into epochs of 60 elements, and gives me the correlation between those 60 elements for each epoch.
Is it possible to use a similar technique to generate a correlation over the previous 60 elements of X and Y for every X,Y pair, rather than just for non-overlapping epochs?
In other words, if X and Y each contain 61560 elements which represent two variables measured simultaneously at 5-second intervals over time, the above method gives an R containing 1026 elements. Is there a way to obtain correlations at each point, which would return an R with 61560 elements, representing the correlation at each 5-second interval of the previous 5 minutes of data?
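One possible way to get a coefficient at every sample, sketched here under assumptions rather than taken from the thread: write Pearson's r in terms of moving averages over a trailing window, using movmean (base MATLAB, R2016a or later). The example data and the names w, win, mx, my, mxy, mxx and myy are hypothetical. The sketch assumes no missing values and marks the first w-1 outputs as NaN because a full 5-minute history is not yet available there.

```matlab
% Hypothetical example data; replace with the measured column vectors.
rng(2);
N = 500;
X = cumsum(randn(N,1));
Y = 0.3*X + randn(N,1);

w   = 60;            % 5 minutes at one sample per 5 seconds
win = [w-1 0];       % current sample plus the w-1 preceding samples

% Trailing-window means of x, y, x*y, x^2 and y^2.
mx  = movmean(X,    win);
my  = movmean(Y,    win);
mxy = movmean(X.*Y, win);
mxx = movmean(X.^2, win);
myy = movmean(Y.^2, win);

% r = cov(x,y) / (std(x)*std(y)); the 1/n factors cancel in the ratio.
R = (mxy - mx.*my) ./ sqrt((mxx - mx.^2) .* (myy - my.^2));
R(1:w-1) = NaN;      % windows with fewer than w samples
```

R has the same length as X and Y, so the 61560-element series would give 61560 outputs, the first 59 of them NaN. Because the formula subtracts products of means, it can lose precision when the signals have a large offset relative to their variation; subtracting the overall mean from X and Y first is one way to reduce that.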
