MATLAB Answers


Comparing Duration Arrays is time consuming - How to improve my script?

Asked by Robin Schäfer on 4 Sep 2019
Latest activity: Answered by Steven Lord on 12 Sep 2019
EDIT 12.9.2019: Following the first comments, I have tried to explain my problem in more detail. I will try your suggestions and respond in a comment. EDIT END
Hello everyone,
I have a problem with my script: comparing duration arrays takes a huge amount of time.
What I want to do with my script: dataG is read from an .xls file and contains the start and end times (format 2 x 1: 03-Aug-2019 10:06:12). data is read from a .csv file and contains two time vectors (format 1 x n: 9:31:52.000 ...). The time vector from data fits into the time vector from dataG. I want to cut the array from data corresponding to the given start and end times.
Here are some excerpts from my script:
dataG=readtable(X{posG}); % data with start and end time
...
data=readtable(N{1}); % continuous time vector
...
tdur=table2array(dataG(posPG,7:8)); % is read as datetime format
[y, m, d]= ymd(tdur);
tdur=tdur-datetime(y,m,d,0,0,0); % conversion to duration format
clock=table2array(data(:,2)); % is read as duration format
try
    for post1=1:length(clock) % Comparing durations - here is where about 90% of the time is consumed
        if tdur(1)<clock(post1)
            break
        end
    end
    for post2=post1:length(clock)
        if tdur(2)<clock(post2) || isnan(clock(post2 + 1))
            break
        end
    end
catch
end
data(post2:end,:)=[]; % cut data
data(1:post1,:)=[];
MATLAB recognizes my variables automatically with 'readtable'. 'tp' contains the start time and end time.
In order to compare both variables, I chose to convert both of them to duration arrays. For some reason, this slows down my whole script, which has about 300 more lines and many for-loops and if-conditions. The function is called about 800 times, so time does matter.
The Profiler says table2array is the most-called function, and 80-90 % of the time is spent in the first for-loop. I would appreciate any suggestion to make my script faster! Thank you in advance =)
Kind Regards
Rob

  3 Comments

I can’t follow what you’re doing.
Consider using table2timetable, then do what you want with the timetable.
If you describe at a higher level what you're trying to do with this code, we may be able to offer some guidance to help improve your code to accomplish your goal. Right now you've shown us a low-level view that makes me suspect you want to discretize your duration data, but I can't be certain.
@Steven Lord: You are completely right, I failed to say anything about what this script should do and what kind of data is available. I have tried to make it clearer now!
@Star Strider: I've never used timetables before. I will give it a try! Thank you!


2 Answers

Answer by Fabio Freschi on 4 Sep 2019
 Accepted Answer

I'm not sure I understand the problem correctly without the data. My attempt:
idx = find(clock > tdur(1),1,'first')
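A minimal sketch of how this one-liner stands in for the first loop from the question, run on made-up data. The values are assumptions for illustration only, and `clk` is a hypothetical name used in place of the question's `clock` so that MATLAB's built-in `clock` function is not shadowed.

```matlab
% Synthetic stand-ins for the question's variables (values are made up).
clk  = seconds(0:10:100)';        % duration column: 0 s, 10 s, ..., 100 s
tdur = seconds([25 65]);          % start and end of the window as durations

% Loop version from the question: stop at the first sample after the start.
for post1 = 1:numel(clk)
    if tdur(1) < clk(post1)
        break
    end
end

% Vectorized version: one comparison over the whole array, then find.
idx = find(clk > tdur(1), 1, 'first');

sameIndex = isequal(idx, post1);  % true when both approaches agree
```

The vectorized call compares the whole duration array once instead of indexing into it element by element, which is probably where most of the loop's time went.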

  2 Comments

Incredible! This looks much better than my approach and is dramatically faster (12 min for my whole dataset compared to >3 h before)! It solved my problem!
One last question: I now use:
post1=find(clock>tdur(1),1,'first');
post2=find(clock>tdur(2),1,'first');
Is it possible to use the find function in one line and get an array (e.g. a 1 x 2 post) as the result? I only get the first value when I use tdur without the (1):
post=find(clock>tdur,1,'first');
Not with find alone. We could try it as an exercise, but I am not sure it is worth the hassle.
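For what it's worth, one possible way to get both indices in a single statement is to wrap find in arrayfun. This is only a sketch on made-up data (`clk` is a hypothetical stand-in for `clock`), and it assumes every bound in tdur has at least one later sample; if find returns empty for any bound, arrayfun errors unless 'UniformOutput' is set to false.

```matlab
% Synthetic stand-ins for the question's variables (values are made up).
clk  = seconds(0:10:100)';        % duration column: 0 s, 10 s, ..., 100 s
tdur = seconds([25 65]);          % start and end of the window as durations

% One find call per bound, collected into a 1 x 2 index array.
post = arrayfun(@(k) find(clk > tdur(k), 1, 'first'), 1:numel(tdur));
```

Under the hood this is still two find calls, so it is unlikely to be faster than the two separate lines; it only saves typing.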



Answer by Steven Lord on 12 Sep 2019

So you want to extract only the subset of rows of your data that fall into a certain time span? If so, and if you choose to store your data in a timetable array, use a timerange as your row index.
Depending on what you want to do with the data from that smaller time span, some of the other timetable related functions may be of use (I'm thinking specifically of retime or some of the data preprocessing functions like the grouping functions.)
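A rough sketch of the timetable and timerange idea on made-up data. The variable names `Time` and `Value` and all of the values are assumptions for illustration; the real table from the question would come from readtable.

```matlab
% Synthetic table standing in for the question's data (made-up values).
Time  = seconds(0:10:100)';       % duration column: 0 s, 10 s, ..., 100 s
Value = (1:11)';
data  = table(Time, Value);

% Convert to a timetable, using the duration column as the row times.
tt = table2timetable(data, 'RowTimes', 'Time');

% Select the rows whose times fall in the window; by default timerange
% includes the start time and excludes the end time.
tr  = timerange(seconds(25), seconds(65));
sub = tt(tr, :);
```

With this approach no explicit indices are needed; the row selection replaces both searches and both deletions.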
