How can I check overlap between event markers?

Hello! I have a question about indexing and finding overlapping events within a time series. I have an array rain_data and two index arrays, event_start and event_end, which store the row index in rain_data where each event starts and ends.
I would like to exclude events that may be affected by an overlap effect, which can last up to three days. I therefore have to eliminate all events where 1. the end time of event i plus 3 days is later than the start time of event i+1; 2. the start time of event i minus 3 days is earlier than the end time of event i-1.
The rain_data array consists of a first column in datenum with the respective times, and the second column is the rain data, although the rain data is irrelevant for this procedure.
Essentially, if I call rain_data(event_start(i),1), I get the starting time of an event, and the ending time can be obtained with rain_data(event_end(i),1).
I have tried using different loops but I never end up getting the right eliminations in the indexes. Is there a more efficient way to do this?
An example of what I have tried:
rain_start=rain_data(start_event,1);
rain_end=rain_data(end_event,1);
for i=1:length(rain_start)
    for k=1:length(rain_end)-1
        if k>1
            if rain_start(i)-3 > rain_end(k-1)
                event_marker(i,:)=event_marker(i,:);
            end
        end
    end
end
Where event_marker is simply the start and end index arrays put together. This doesn't work though.

5 Comments

It would help to see a small sample of the data; it is hard to understand what format your data is in.
Also this line:
event_marker(i,:)=event_marker(i,:);
the only one in the nested for loops, does nothing at all so that surely isn't what you intended to write?
Actually yes, that was what I intended to write. It would only save the data back into event_marker if it does not overlap, i.e. there are more than three days between each event.
The rain_data:
736369.590277778 0
736369.590972222 0
736369.591666667 0
736369.592361111 0.660000000000000
736369.593055556 0.630000000000000
736369.593750000 0
736369.594444444 0.700000000000000
736369.595138889 0.280000000000000
736369.595833333 0.440000000000000
736369.596527778 0.270000000000000
736369.597222222 0
736369.597916667 0
736369.598611111 0
736369.599305556 0
736369.600000000 0
start_event (gives indexes in rain_data where an event starts):
830
4880
7086
8836
15665
20087
21743
28289
end_event (gives the corresponding index of the end of the event in rain_data):
1196
5207
7387
9365
16539
20488
22977
29937
I simply put both the event_start and event_end index arrays together into event_marker to make the elimination simpler.
What puzzles me is that I have used a similar for-loop to sort other things. If I change the event_marker(i,:)=event_marker(i,:) to load into another variable, I still don't eliminate events that are clearly within the 3 day overlap period of each other. Any help is greatly appreciated!
@Nina: Do you see that event_marker(i,:)=event_marker(i,:) is meaningless? It does nothing but assign the contents of event_marker(i,:) to event_marker(i,:). No command in your posted code does anything like "eliminating".
I still cannot follow your descriptions.
"I simply put both the event_start and event_end index arrays together into event_marker to make the elimination simpler."
What is "event_start"? I only find "rain_start" and "start_event".
I'm deeply confused. Please rephrase the question as concisely as possible. It does not matter that the problem concerns rain. What exactly are the inputs? Can you post them so that readers can use them by copy&paste? What is the wanted output and which values should be "eliminated" - and what does "elimination" mean here (what is set to 0?)?
The rain series I have is about 900000 rows long, and the event_marker array has start_event in the first column and end_event in the second column. I cannot post the entire dataset here. I made a mistake when writing my response: start_event is the same as event_start. It was a typo. All I want to do is find a way to filter out events that do not have a minimum of 3 days between them. I stated that in the second paragraph of my question, following the paragraph on my data format. I am aware my code does not work, but it also does not work if I try to save the result to a variable other than event_marker. That is why I am asking.
start_event and end_event store the indexes within rain_data at the time an event starts and ends.


Answers (1)

According to your description, you would eliminate both overlapping events. I'm not sure why you could not just eliminate one and leave the other one, since it would then no longer be overlapping.
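For comparison, here is a minimal sketch of that "eliminate only one" alternative. It is not what the question asks for: it keeps the first event and then drops any later event that starts less than 3 days after the end of the last kept event. The time axis is an assumption: the posted sample looks like 1-minute steps, so a synthetic datenum column with that spacing stands in for the real rain_data. The names min_gap, keep, last_end and kept_events are made up for the illustration.

```matlab
% Indices copied from the question.
start_event = [830; 4880; 7086; 8836; 15665; 20087; 21743; 28289];
end_event   = [1196; 5207; 7387; 9365; 16539; 20488; 22977; 29937];

% ASSUMPTION: synthetic 1-minute time axis in datenum units (days),
% standing in for the real data, which was not posted in full.
n = 30000;
rain_data = [736369 + (0:n-1)'/1440, zeros(n, 1)];

rain_start = rain_data(start_event, 1);
rain_end   = rain_data(end_event, 1);

min_gap = 3;                       % required gap in days (hypothetical name)
keep = false(size(start_event));   % logical mask of events to keep
keep(1) = true;                    % always keep the first event
last_end = rain_end(1);            % end time of the most recently kept event

for i = 2:numel(start_event)
    % Keep event i only if it starts at least min_gap days after
    % the end of the last event that was kept.
    if rain_start(i) >= last_end + min_gap
        keep(i) = true;
        last_end = rain_end(i);
    end
end

kept_events = [start_event(keep), end_event(keep)];
disp(kept_events)
```

Under the assumed 1-minute spacing, this keeps events 1, 3, 5, 7 and 8. The loop is needed here because whether an event is kept depends on which earlier events were kept.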
Obviously, if event i overlaps with event i+1, then i+1 overlaps with i, so there's no point searching for both your 1) and your 2). Loops are not needed in any case:
rain_start=rain_data(start_event,1);
rain_end=rain_data(end_event,1);
% find where an event ends less than 3 days before the next one starts
isoverlap = rain_end(1:end-1) + 3 > rain_start(2:end);
% true in 1st column: delete because it overlaps with next.
% true in 2nd column: delete because it overlaps with previous.
% Combine both columns with OR (any)
todelete = any([[isoverlap; false], [false; isoverlap]], 2);
start_event(todelete) = []; %actually delete
end_event(todelete) = []; %start and end
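To check the approach against the indices posted in the comments, here is a small self-contained sketch. The real rain_data was not posted in full, so the time column is an assumption: a synthetic datenum axis with 1-minute steps, which is what the posted sample appears to use. The names min_gap, kept_start and kept_end are introduced only for this example.

```matlab
% Indices copied from the question.
start_event = [830; 4880; 7086; 8836; 15665; 20087; 21743; 28289];
end_event   = [1196; 5207; 7387; 9365; 16539; 20488; 22977; 29937];

% ASSUMPTION: synthetic 1-minute time axis in datenum units (days),
% standing in for the real data, which was not posted in full.
n = 30000;
rain_data = [736369 + (0:n-1)'/1440, zeros(n, 1)];

rain_start = rain_data(start_event, 1);
rain_end   = rain_data(end_event, 1);

min_gap = 3;  % required gap in days (hypothetical name)

% isoverlap(k) is true when event k ends less than min_gap days
% before event k+1 starts.
isoverlap = rain_end(1:end-1) + min_gap > rain_start(2:end);

% An event is deleted if it overlaps with the next or the previous event.
todelete = [isoverlap; false] | [false; isoverlap];

kept_start = start_event(~todelete);
kept_end   = end_event(~todelete);
disp([kept_start, kept_end])
```

Under the assumed 1-minute spacing, only the last event (28289 to 29937) survives, because every other event has a neighbour within 3 days. Using a logical mask with `~todelete` leaves start_event and end_event untouched, which makes it easier to compare before and after.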

Asked:

on 8 May 2017

Answered:

on 8 May 2017
