Drop data from datetime by condition
Hello,
I have data recorded over time (see below). I would like to drop the data wherever the y-value stays equal to zero for a specific duration (for example 24 hours). What is the easiest way to do this?
Thank you in advance
Accepted Answer
Mathieu NOE
on 1 Dec 2022
Edited: Mathieu NOE
on 1 Dec 2022
Hello,
try this.
This code removes runs of contiguous zero data whose length lies within given bounds (here 24 hours minimum and 36 hours maximum).
The results are stored in newtime and newdata.
% dummy data
samples=15*24; % 15 days by 24 hours
time= (0:samples-1);
data = max(0,randn(size(time)));
% create some y = 0 zones
data(25:60) = 0;
data(125:145) = 0;
data(225:260) = 0;
%% parameters
min_contiguous_samples = 24; % remove zero runs only if they are at least this long (in samples)
max_contiguous_samples = 36; % remove zero runs only if they are at most this long (in samples)
threshold = eps; % values smaller than this are treated as zero
%% main loop %%%%
ind = (abs(data)<threshold); % select "almost" zero data (more robust than == 0)
% now define the start and end points of the "invalid" segments
[begin,ends] = find_start_end_group(ind);
length_ind = ends - begin + 1; % number of samples in each segment
ind2 = (length_ind>=min_contiguous_samples) & (length_ind<=max_contiguous_samples); % check if their length is valid
begin = begin(ind2); % selected points
ends = ends(ind2); % selected points
% define the invalid segments to be removed later
idx = [];
for ci = 1:length(begin)
    idx = [idx (begin(ci):ends(ci))]; % collect the indices of every selected segment
end
% build the flagged points once, after the loop, so that no point is duplicated
t = time(idx);
d = data(idx);
figure(1),
plot(time,data,'k',t,d,'*r');
% we can remove the unwanted data
newtime = time;
newdata = data;
newtime(idx) = [];
newdata(idx) = [];
figure(2),
plot(time,data,'k',newtime,newdata,'*b');
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
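function [begin,ends] = find_start_end_group(ind)
% This locates the beginning/ending points of data groups
% (runs of true values) in the logical vector ind
D = diff([0,ind(:).',0]); % force a row vector; the zero padding catches runs touching either edge
begin = find(D == 1); % first index of each run
ends = find(D == -1) - 1; % last index of each run
end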
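To see what find_start_end_group returns, here is a small worked example. The input vector is made up for illustration, and the helper's three lines are repeated inline so that the snippet runs on its own.

% made-up logical vector with two runs of true values
ind = logical([0 1 1 0 0 1 1 1 0]);
D = diff([0, ind, 0]); % +1 where a run starts, -1 just after it ends
begin = find(D == 1); % first index of each run: [2 6]
ends = find(D == -1) - 1; % last index of each run: [3 8]
run_length = ends - begin + 1; % samples per run: [2 3]

Note that the number of samples in a run is ends - begin + 1, not ends - begin.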
More Answers (1)
MarKf
on 1 Dec 2022
I see now that this was already answered, but if I understand the question, the y-values may need to equal zero *for* a specific duration (or more), and they may still want to work with datetime arrays.
I see that some y data at the end of July was already removed, so some data sanitization may be needed beforehand.
You may also want to use a threshold instead of Yvals==0, since I see some lone spikes (already pointed out above, I see).
XDates = datetime(2022,6,1):hours(1):datetime(2022,7,30);
Yvals = (randi(30,[1,numel(XDates)])>29)*300; Yvals([1,end])=300;
figure, plot (XDates,Yvals)
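Ydiff = [0 diff(Yvals==0)]; % +1 at the first zero of a run, -1 at the first non-zero sample after it
Xdiff = [XDates(Ydiff==1);XDates(Ydiff==-1)]; % row 1: run starts, row 2: first sample after each run
XDiffs = XDates(Ydiff==-1)-XDates(Ydiff==1); % duration of each zero run
XDiff24 = XDiffs>=hours(24); % select runs lasting 24 hours or more
XD2del = Xdiff(:,XDiff24);
XDates2delete = [];
for xdi = 1:size(XD2del,2)
    % stop one step before the first non-zero sample so that it is kept
    XDates2delete = [XDates2delete XD2del(1,xdi):hours(1):XD2del(2,xdi)-hours(1)];
end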
[XDates_new, idx] = setdiff(XDates,XDates2delete);
Yvals_new = Yvals(idx);
figure, plot (XDates_new,Yvals_new)
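A possible variant, as a rough sketch only: it combines the threshold idea with a logical mask instead of setdiff, so it does not rely on the timestamps being exactly one hour apart. The example data and the names thr, minDur, isZero and toDrop are made up here, and the duration of a run is measured from its first to its last zero sample, which is one possible definition and not necessarily the one you need.

% assumed example data: hourly samples with one long and one short zero run
XDates = datetime(2022,6,1):hours(1):datetime(2022,6,10);
Yvals = 300*ones(size(XDates));
Yvals(10:40) = 0; % 31 consecutive zeros spanning 30 hours (should be dropped)
Yvals(100:105) = 0; % 6 consecutive zeros (should be kept)

thr = 1e-6; % hypothetical tolerance below which a value counts as zero
minDur = hours(24); % drop runs that span at least this duration

isZero = abs(Yvals) < thr;
D = diff([0, isZero, 0]);
first = find(D == 1); % index of the first zero in each run
last = find(D == -1) - 1; % index of the last zero in each run
runDur = XDates(last) - XDates(first); % time spanned by each run

toDrop = false(size(Yvals));
for k = find(runDur >= minDur)
    toDrop(first(k):last(k)) = true; % mark every sample of a long run
end
XDates_new = XDates(~toDrop);
Yvals_new = Yvals(~toDrop);

With this made-up data only the long run is removed; the short run and all non-zero samples stay in XDates_new and Yvals_new.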