I need to break down the flow data into groups of data with months for every year.

11 views (last 30 days)
For example: All the entries for the month of Jan for all the years (suppose 2010-2022) are stored in one group. Similarly, all the entries for the month of Feb for all the years.(suppose 2010-2022) are stored in second group and so on. My end goal is to calculate the percentile of monthly data for all the years.
Date Flow
12/1/2010 100
12/2/2010 200
12/3/2010 150
12/4/2010 176
.....
.....
12/1/2022 145
I tried to create the vlookup function to create the matrix but i wasn't able to. So, I decided to group all months together and then perform operations on them.

Accepted Answer

Cris LaPierre
Cris LaPierre on 20 Jun 2022
Look into groupsummary, specifically the groupbin option 'monthofyear'.
  6 Comments
Cris LaPierre
Cris LaPierre on 25 Jun 2022
I think the best approach here is to create a second table that is a subset of the full data set, but just contains the date ranges of interest. Then you can use groupsummary to compute the desired stats grouped by year. I don't know what you want to use for percentile, so I use 42.
T = readtable('https://www.mathworks.com/matlabcentral/answers/uploaded_files/1043800/Example.xlsx',...
'ReadVariableNames',false)
T = 1826×2 table
Var1 Var2 ___________ ____ 01-Jan-2001 10 02-Jan-2001 20 03-Jan-2001 30 04-Jan-2001 40 05-Jan-2001 50 06-Jan-2001 60 07-Jan-2001 70 08-Jan-2001 80 09-Jan-2001 90 10-Jan-2001 100 11-Jan-2001 110 12-Jan-2001 120 13-Jan-2001 130 14-Jan-2001 140 15-Jan-2001 150 16-Jan-2001 160
% Use 'day of year' to identify dates within the desired range for all years
drng = day(datetime(2004,9,27):caldays(1):datetime(2004,10,27),'dayofyear');
d = day(T.Var1,'dayofyear');
% Create subtable
idx= ismember(d,drng);
Tsub = T(idx,:)
Tsub = 155×2 table
Var1 Var2 ___________ ____ 28-Sep-2001 2710 29-Sep-2001 2720 30-Sep-2001 2730 01-Oct-2001 2740 02-Oct-2001 2750 03-Oct-2001 2760 04-Oct-2001 2770 05-Oct-2001 2780 06-Oct-2001 2790 07-Oct-2001 2800 08-Oct-2001 2810 09-Oct-2001 2820 10-Oct-2001 2830 11-Oct-2001 2840 12-Oct-2001 2850 13-Oct-2001 2860
% compute stats for each year
Tstats = groupsummary(Tsub,'Var1','year',{@(A) prctile(A,42),'max','min'},'Var2');
Tstats.Properties.VariableNames = ["Tyr","GroupCnt","Tmin","Tmax","Tpct"]
Tstats = 5×5 table
Tyr GroupCnt Tmin Tmax Tpct ____ ________ ______ _____ _____ 2001 31 2835.2 3010 2710 2002 31 6485.2 6660 6360 2003 31 10135 10310 10010 2004 31 13785 13960 13660 2005 31 17445 17620 17320
Cris LaPierre
Cris LaPierre on 26 Jun 2022
Leap years are going to cause some issues with my previous answer. Unfortunately, I couldn't find a slick way to do this, so here's a brute force approach.
T = readtable('https://www.mathworks.com/matlabcentral/answers/uploaded_files/1043800/Example.xlsx',...
'ReadVariableNames',false);
% Define range of interest
startD = datetime(2004,9,27);
stopD = datetime(2004,10,27);
% find data for all years between the indicated month/days
y = unique(year(T.Var1));
idx = zeros(height(T),1);
for i = 1:length(y)
idx = idx + isbetween(T.Var1,datetime(y(i),month(startD),day(startD)),...
datetime(y(i),month(stopD),day(stopD)));
end
Tsub = T(logical(idx),:)
Tsub = 155×2 table
Var1 Var2 ___________ ____ 27-Sep-2001 2700 28-Sep-2001 2710 29-Sep-2001 2720 30-Sep-2001 2730 01-Oct-2001 2740 02-Oct-2001 2750 03-Oct-2001 2760 04-Oct-2001 2770 05-Oct-2001 2780 06-Oct-2001 2790 07-Oct-2001 2800 08-Oct-2001 2810 09-Oct-2001 2820 10-Oct-2001 2830 11-Oct-2001 2840 12-Oct-2001 2850
% Compute summary stats
Tstats = groupsummary(Tsub,'Var1','year',{@(A) prctile(A,42),'max','min'},'Var2');
Tstats.Properties.VariableNames = ["Tyr","GroupCnt","Tmin","Tmax","Tpct"]
Tstats = 5×5 table
Tyr GroupCnt Tmin Tmax Tpct ____ ________ ______ _____ _____ 2001 31 2825.2 3000 2700 2002 31 6475.2 6650 6350 2003 31 10125 10300 10000 2004 31 13785 13960 13660 2005 31 17435 17610 17310

Sign in to comment.

More Answers (0)

Categories

Find more on Data Preprocessing in Help Center and File Exchange

Products


Release

R2018a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!