Calculating duration in large dataset based on conditions

Question

Matt R on 17 Jun 2021

Direct link to this question

https://se.mathworks.com/matlabcentral/answers/858745-calculating-duration-in-large-dataset-based-on-conditions

Answered: Divija Aleti on 22 Jun 2021

Hi,

I have a large dataset with 3 columns (datetime, Var1 & Var2). The data is over a month with a datapoint every 2 seconds.

I want to determine the duration when Var2 is above 1000 and the average value of Var1 when this is the case. The problem I have is not counting the duration from one true datapoint to the next true point (Var2 >1000).

Can someone help me with some code to determine this? Alternatively, a different way of determining the same thing would be appreciated.

Thanks!

2 Comments

SALAH ALRABEEI on 17 Jun 2021


Do you mean the total duration, i.e. the sum over all datapoints where Var2 > 1000?

Assuming I've understood you correctly:

index = find(Var2 > 1000);         % indices of the datapoints where Var2 exceeds 1000
duration = sum(2 * length(index)); % length(index) is the number of such datapoints; each one represents 2 seconds
Var1_Avg = mean(Var1(index));      % mean of Var1 over those datapoints

Matt R on 17 Jun 2021

I'm not sure that's what I'm looking for. In other words, there are sections of my dataset where Var2 = 0 and other sections where Var2 > 1000. The total time for which Var2 > 1000 is what I am looking for.

For example, in the 28 day period, Var2 > 1000 for X days, Y hours and Z mins.

I'm not sure how your suggestion would achieve that?

Thanks!


Answer 1

Divija Aleti on 22 Jun 2021


Direct link to this answer

https://se.mathworks.com/matlabcentral/answers/858745-calculating-duration-in-large-dataset-based-on-conditions#answer_730620


Hi Matt,

As far as I understand, the code given by Salah will help you solve your issue. I can try to explain it more clearly with a slight modification.

I created a small table with 3 columns, with a datapoint every 2 seconds, for explanation purposes.

The first line, which is,

index = find(Var2 > 1000)

identifies the datapoints where Var2 > 1000. In my example table, the 1st, 4th and 5th datapoints have Var2 > 1000, so the output of the above line is a vector containing the indices 1, 4 and 5.

The second line (with a slight change), which is,

duration = 2*length(index)

counts the datapoints where Var2 > 1000, i.e. the number of elements in index (length(index)), which is 3 here. Since each datapoint represents 2 seconds, multiplying that count by 2 gives the total duration in seconds: 6 seconds in this case.

As your dataset is over a period of one month, you will have to convert the obtained duration (in seconds) to days, hours and minutes using the corresponding conversion rates. (1 sec = (1/60) min = (1/3600) hours = (1/86400) days)
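The conversion described above could be sketched as follows. This is only an illustration: totalSeconds is set to an arbitrary example value (in practice it would be 2*length(index)), and the variable names nDays, nHours, nMins and nSecs are my own, chosen so they do not shadow MATLAB's days/hours/minutes functions.

```matlab
% Example total in seconds (arbitrary value for illustration only).
totalSeconds = 200000;

% Split the total into whole days, hours and minutes plus leftover seconds.
nDays  = floor(totalSeconds / 86400);              % 86400 s per day
nHours = floor(mod(totalSeconds, 86400) / 3600);   % hours left after whole days
nMins  = floor(mod(totalSeconds, 3600) / 60);      % minutes left after whole hours
nSecs  = mod(totalSeconds, 60);                    % seconds left after whole minutes

fprintf('Var2 > 1000 for %d days, %d hours, %d mins and %d s\n', ...
        nDays, nHours, nMins, nSecs);
```

For the example value this splits 200000 seconds into 2 days, 7 hours, 33 minutes and 20 seconds.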

The third line, which is,

Var1_Avg = mean(Var1(index))

returns the average value of Var1 over those datapoints where Var2 > 1000. In this case, it is (100+250+300)/3 = 216.6667.
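Putting the three lines together, here is a minimal sketch on a five-row table. Please note the assumptions: only the Var1 values in rows 1, 4 and 5 (100, 250, 300) come from my example above; the timestamps, the remaining Var1 values and all Var2 values are made up for illustration, as are the names T, mask and samplePeriod. It also uses a logical mask in place of find, which gives the same result.

```matlab
% Hypothetical example data: one datapoint every 2 seconds.
% Rows 1, 4 and 5 are the ones with Var2 > 1000; other values are invented.
t    = datetime(2021, 6, 1, 0, 0, 0) + seconds(0:2:8)';
Var1 = [100; 150; 200; 250; 300];
Var2 = [1200; 0; 0; 1500; 1100];
T    = table(t, Var1, Var2);

samplePeriod = 2;                  % seconds represented by each datapoint

mask  = T.Var2 > 1000;             % logical vector, true where the condition holds
index = find(mask);                % same indices the find-based line would return

totalSeconds = samplePeriod * nnz(mask);   % number of true datapoints times 2 s
Var1_Avg     = mean(T.Var1(mask));         % mean of Var1 over the true datapoints

fprintf('Duration: %d s, mean Var1: %.4f\n', totalSeconds, Var1_Avg);
```

This counts only the datapoints where the condition is true, so the stretches where Var2 = 0 between two true sections do not contribute to the duration. It does assume the sampling interval really is a constant 2 seconds throughout the dataset.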
