Drop data from datetime by condition

Hello,
I have data recorded over time (see below). I would like to drop the data wherever the y-value stays equal to zero for a specific duration (such as 24 hours). How could I do this in the easiest way?
Thank you in advance

Accepted Answer

Mathieu NOE on 1 Dec 2022
Edited: Mathieu NOE on 1 Dec 2022
Hello,
Try this.
This code removes runs of contiguous (near-)zero samples whose length lies within given bounds (here 24 hours minimum and 36 hours maximum, at one sample per hour).
The results are stored in newtime and newdata.
% dummy data
samples=15*24; % 15 days by 24 hours
time= (0:samples-1);
data = max(0,randn(size(time)));
% create some y = 0 zones
data(25:60) = 0;
data(125:145) = 0;
data(225:260) = 0;
%% parameters
min_contiguous_samples = 24; % remove zero runs only if they are at least this long (in samples)
max_contiguous_samples = 36; % remove zero runs only if they are at most this long (in samples)
threshold = eps; % tolerance for treating a sample as zero
%% main loop %%%%
ind = (abs(data)<threshold); % select "almost" zero data (more robust than == 0)
% now define start and end points of the "invalid" segments
[begin,ends] = find_start_end_group(ind);
length_ind = ends - begin + 1; % number of samples in each segment
ind2 = (length_ind>=min_contiguous_samples) & (length_ind<=max_contiguous_samples); % check if their length is valid
begin = begin(ind2); % selected points
ends = ends(ind2); % selected points
% collect the indices of the invalid segments to be removed later
idx = [];
for ci = 1:length(begin)
    idx = [idx (begin(ci):ends(ci))];
end
t = time(idx); % time stamps of the samples to remove
d = data(idx); % values of the samples to remove
figure(1),
plot(time,data,'k',t,d,'*r');
% we can remove the unwanted data
newtime = time;
newdata = data;
newtime(idx) = [];
newdata(idx) = [];
figure(2),
plot(time,data,'k',newtime,newdata,'*b');
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [begin,ends] = find_start_end_group(ind)
% This locates the beginning /ending points of data groups
D = diff([0,ind,0]);
begin = find(D == 1);
ends = find(D == -1) - 1;
end
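To make the diff-based run detection easier to follow, here is a minimal sketch on a tiny hand-made vector. It is not the code above, only an illustration of the same idea with a logical mask instead of an index list. The input values, `minRun`, `tol`, and the variable names `isZero`, `runStart`, `runEnd`, `runLen`, and `drop` are all assumptions chosen for the example.

```matlab
% Toy input (made up for illustration): zeros mark "no signal" samples
data = [3 0 0 5 0 0 0 0 2 0 7];
time = 0:numel(data)-1;           % sample index used as time (assumption)
minRun = 3;                       % drop zero runs of at least this many samples
tol = eps;                        % tolerance for "almost zero"

isZero = abs(data) < tol;         % logical mask of near-zero samples
D = diff([0, isZero, 0]);         % +1 where a run starts, -1 just past its end
runStart = find(D == 1);
runEnd   = find(D == -1) - 1;
runLen   = runEnd - runStart + 1; % number of samples in each run

drop = false(size(data));         % mask of samples to delete
for k = find(runLen >= minRun)    % loop only over runs that are long enough
    drop(runStart(k):runEnd(k)) = true;
end

newtime = time(~drop);
newdata = data(~drop);
disp(newdata)                     % 3 0 0 5 2 0 7
```

Only the run of four zeros is removed; the shorter runs of two and one zero samples are kept. Padding the mask with a zero on each side is what lets the same `diff` call catch runs that touch the first or last sample.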

More Answers (1)

MarKf on 1 Dec 2022
I see now that this has already been answered, but if I understand the question, the y-values may need to equal zero *for* a specific duration (or more), and they may still want to work with datetime arrays.
At the end of July some y data appears to have been removed already, so some data sanitization may be needed first.
You may also want to use a threshold instead of Yvals==0, since I see some lone spikes (already pointed out above too).
XDates = datetime(2022,6,1):hours(1):datetime(2022,7,30);
Yvals = (randi(30,[1,numel(XDates)])>29)*300; Yvals([1,end])=300;
figure, plot (XDates,Yvals)
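Ydiff = [0 diff(Yvals==0)]; % +1 at the first zero of a run, -1 at the first nonzero sample after it
Xdiff = [XDates(Ydiff==1);XDates(Ydiff==-1)]; % row 1: run starts, row 2: first nonzero sample after each run
XDiffs = Xdiff(2,:)-Xdiff(1,:); % duration of each zero run
XDiff24 = XDiffs>=hours(24); % runs lasting 24 hours or more
XD2del = Xdiff(:,XDiff24);
XDates2delete = NaT(1,0); % empty datetime row vector
for xdi = 1:size(XD2del,2)
    % stop one sample before the first nonzero value so that it is kept
    XDates2delete = [XDates2delete XD2del(1,xdi):hours(1):XD2del(2,xdi)-hours(1)];
end
[XDates_new, idx] = setdiff(XDates,XDates2delete);
Yvals_new = Yvals(idx);
figure, plot (XDates_new,Yvals_new)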
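If the samples are not guaranteed to be exactly one hour apart, a variant that measures each run from the timestamps themselves and deletes with a logical mask may be more robust than rebuilding the dates with an hours(1) step. The following is only a sketch under stated assumptions: the series, the cutoff `minDuration`, the tolerance `tol`, and all variable names are made up for illustration, and the duration of a run is taken here as the time from its first to its last zero sample.

```matlab
% Hypothetical hourly series; dates and values are made up for illustration
t = datetime(2022,6,1) + hours(0:71);  % 72 timestamps
y = 300 * ones(size(t));
y(5:10)  = 0;                          % short zero run (5 h first to last): kept
y(20:50) = 0;                          % long zero run (30 h first to last): dropped
minDuration = hours(24);               % assumed cutoff
tol = 1e-9;                            % assumed tolerance for "zero"

isZero = abs(y) < tol;
D = diff([0, isZero, 0]);              % +1 at run start, -1 just past run end
runStart = find(D == 1);
runEnd   = find(D == -1) - 1;

% Duration of each run, measured between its first and last zero timestamp
runDur = t(runEnd) - t(runStart);

drop = false(size(y));                 % mask of samples to delete
for k = find(runDur >= minDuration)
    drop(runStart(k):runEnd(k)) = true;
end
t_new = t(~drop);
y_new = y(~drop);
figure, plot(t_new, y_new)
```

Because the mask indexes t and y directly, no setdiff call is needed and the original sample order is preserved even if timestamps repeat or are unevenly spaced.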
