9 views (last 30 days)

I have a N x 6 matrix and of interest are columns 1 and 5. Column 1 is time data in Excel's serial date format (e.g. 4.264400001156250e+04 = 01-Oct-2016 00:00:00) and Column 5 is Measurement data.

The time data spans across M days and has interval in seconds (not regular), so the value of N is quite large (~40,000).

What I would like to do is this: For a Measurement data on day X, calculate the standard deviation(σ) for that day based on the Measurement data for past Y days. How can I script this out? I can do this in Excel, but due to the sheer number of rows, it tends to get somewhat slow.

John Chilleri
on 19 Jan 2017

Hello,

I created the following code which should do as you wanted. An explanation of the code is below.

%

% function standard_deviations = pickfcnname(data, Y)

%

% Inputs:

%

% data N x 6 matrix where column 1 is time, column 5 is measurements

% Y Number of days (if available) to use in std calculation

%

% Outputs:

%

% stdevs column of length N containing associated standard deviations

%

function stdevs = pickfcnname(data, Y)

%

% Initialize

N = size(data,1);

stdevs = zeros(N,1);

%

% Note: the first few standard deviations will be computed from few points

% and will thus be weird.

stdevs(1) = 0; % NaN probably more truthful

stdevs(2) = 0;

%

Dates = floor(data(:,1));

Values = data(:,5);

for i = 3:N

%

% Acquire starting point

if (Dates(1) < Dates(i)-Y)

j = 1;

while (Dates(j) ~= Dates(i)-Y)

j = j+1;

end

lbound = j;

else

lbound = 1;

end

%

% Compute standard deviation from (starting point:previous point)

stdevs(i) = std(Values(lbound:i-1));

end

end

Explanation:

The code takes in the N x 6 data and an integer Y number of days (i.e. Y = 3 uses the past 3 days to compute the standard deviation).

The code sets the first two standard deviations to 0 as there aren't enough data points to compute a standard deviation. You should also recognize that the first few points after that have standard deviations made from few points.

The code then loops from the third to final data point, setting a lower bound lbound at Y days prior if possible, 1 otherwise (for initial stds). The standard deviations are then computed using the Y days ago to previous point's measurements and stored in the output vector stdevs, which has the same number of rows as your input data.

Hope this helps!

John Chilleri
on 23 Jan 2017

Yes, you can definitely start at 3 days delayed, which would require a check for which point indicates the fourth day.

As it is now, it computes the standard deviation based on what's available, so on day 3 it uses days 1 and 2 only since those are all that are available.

If you want to make it so no standard deviations are computed until day 4, we can change that pretty easily:

start = 1;

while (Dates(start) < Dates(1)+3)

start = start + 1;

end

for i = start:N

% etc

The above finds the first data point corresponding to day 4 and labels that as start, where the for loop can begin.

I believe that would do the job!

Opportunities for recent engineering grads.

Apply Today
## 2 Comments

## Direct link to this comment

https://se.mathworks.com/matlabcentral/answers/320963-calculating-moving-standard-deviation-bounds-for-time-series#comment_421348

⋮## Direct link to this comment

https://se.mathworks.com/matlabcentral/answers/320963-calculating-moving-standard-deviation-bounds-for-time-series#comment_421348

## Direct link to this comment

https://se.mathworks.com/matlabcentral/answers/320963-calculating-moving-standard-deviation-bounds-for-time-series#comment_421582

⋮## Direct link to this comment

https://se.mathworks.com/matlabcentral/answers/320963-calculating-moving-standard-deviation-bounds-for-time-series#comment_421582

Sign in to comment.