# Calculating duration in large dataset based on conditions

2 views (last 30 days)
Matt R on 17 Jun 2021
Answered: Divija Aleti on 22 Jun 2021
Hi,
I have a large dataset with 3 columns (datetime, Var1 & Var2). The data is over a month with a datapoint every 2 seconds.
I want to determine the duration when Var2 is above 1000 and the average value of Var1 when this is the case. The problem I have is not counting the duration from one true datapoint to the next true point (Var2 >1000).
Can someone help me with some code to determine this? Alternatively, a different way of determining the same thing would be appreciated.
Thanks!
##### 2 CommentsShowHide 1 older comment
Matt R on 17 Jun 2021
I'm not sure that's what I'm looking for. In other words, there are sections of my dataset when Var2 = 0 and other sections when Var2 >1000. The time duration when Var2>1000 is what I am looking for.
For example, in the 28 day period, Var2 > 1000 for X days, Y hours and Z mins.
I'm not sure how your suggestion would achieve that?
Thanks!

Divija Aleti on 22 Jun 2021
Hi Matt,
As far as I understand, the code given by Salah will help you solve your issue. I can try to explain it more clearly with a slight modification.
I created a small table with 3 columns, with a datapoint every 2 seconds, for explanation purposes. The first line, which is,
index = find(Var2 > 1000)
identifies the datapoints where Var2 > 1000. It can be seen clearly that the 1st, 4th and 5th datapoints have Var2 > 1000. Sure enough the output of the above line is: The second line (with a slight change), which is,
duration = 2*length(index)
first calculates the number of datapoints where Var2 > 1000, which is nothing but the number of elements in index (length(index)), which is 3. As each datapoint is of 2 seconds duration, by multiplying the obtained length with 2, we get the total duration in seconds. In this case it is 6 seconds.
As your dataset is over a period of one month, you will have to convert the obtained duration (in seconds) to days, hours and minutes using the corresponding conversion rates. (1 sec = (1/60) min = (1/3600) hours = (1/86400) days)
The third line, which is,
Var1_Avg = mean(Var1(index))
returns the average values of Var1 at those datapoints where Var2 > 1000. In this case, it is (100+250+300)/3 = 216.6667.
Hope this helps!
Regards,
Divija

R2020b

### Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!