You are now following this question
- You will see updates in your followed content feed.
- You may receive emails, depending on your communication preferences.
Summing elements in an array between uneven date intervals
1 view (last 30 days)
Show older comments
I have a dataset of daily rainfall. I also have a list of 28 days on which environmental sampling was carried out. I'm trying to figure out how to sum the total precipitation that fell between each environmental sampling (represented by "sample_dates")
The environmental samplings are mostly about 30 days apart, but not exactly. I already know how to find the monthly precipitation, but what I want is the exact amount of rainfall that fell between each sampling.
Any advice on what functions to use would be so appreciated. I've read documentation for cumsum, ismember, movsum, and a bunch of previously asked questions but I can't find anything that's helped me come up with a solution.
T = readtable("precip_test.csv", "VariableNamingRule","preserve");
A = table2array(T);
%sequential dates from Sept 2019 to Dec 2022
rain_dates = A(:,1);
%daily precipitation in mm for each day
rain_mm = A(:,2);
%dates that environmental samples were taken
sample_dates = A(:,3);
sample_dates(any(isnan(sample_dates), 2), :) = [];
Accepted Answer
Star Strider
on 6 Oct 2023
Edited: Star Strider
on 6 Oct 2023
I am not certain what you want. If I am readiing the file correctly and converting the dates correctly (both 'excel' and 'posixtime' give reasonable results for the dates, however the ‘Date Time’ and ‘Sample Dates’ do not make sense with 'excel') the ‘Sample Dates’ seem to give appropriate results with ‘Precip (mm)’ so it would seem that with that we are finished and nothing further need be done.
T = readtable("precip_test.csv", "VariableNamingRule","preserve")
DateTime = datetime(T.Date, 'ConvertFrom','excel');
SampleDates = datetime(T.('Sample dates'), 'ConvertFrom','excel');
T2 = table(DateTime, T.('Precip (mm)'), SampleDates, 'VariableNames',{'Date Time','Precip (mm)','Sample Dates'})
T2 = rmmissing(T2); % Remove 'NaT' Rows
LastLine = T2(end,:)
VN = T2.Properties.VariableNames;
figure
stairs(T2.('Date Time'), T2.('Precip (mm)'))
grid
xlabel(VN{3})
ylabel(VN{2})
return % Stop Here
A = table2array(T);
%sequential dates from Sept 2019 to Dec 2022
rain_dates = A(:,1);
%daily precipitation in mm for each day
rain_mm = A(:,2);
%dates that environmental samples were taken
sample_dates = A(:,3);
sample_dates(any(isnan(sample_dates), 2), :) = [];
What else do you want to do with these resullts?
.
15 Comments
Emily
on 6 Oct 2023
Dear Star Strider,
Thank you for your reply. I should have been clearer, the dates are in excel datenumber format.
The end result that I am after is the sum of precipitation (measured daily – that's what the long list of dates is for) that fell between each sampling date (the shorter list of dates, they are all contained on the long list of dates).
in other words, in case that's still not clear:
every day we measure how much it rains. every approximately (not exactly) ~30 days we take a sample. how much rain falls in total between the samples? the sample interval is uneven, so that's why I'm having issues figuring out how to approach this.
thanks again!
Star Strider
on 6 Oct 2023
Edited: Star Strider
on 6 Oct 2023
O.K. I changed that in my earlier code, so they now convert from 'excel' however now the 'Sample Dates' do not make sense in the context of 'Date Time'.
What do we do with the date information?
T = readtable("precip_test.csv", "VariableNamingRule","preserve")
T = 1338×3 table
Date Precip (mm) Sample dates
_____ ___________ ____________
43709 6.7 43773
43710 0 43802
43711 0 43838
43712 0.4 43865
43713 15.9 43899
43714 5 43927
43715 4.8 43959
43716 2 43984
43717 0.1 43984
43718 0 44018
43719 0 44075
43720 0 44102
43721 0 44144
43722 0.4 44166
43723 4.6 44204
43724 12.8 44235
DateTime = datetime(T.Date, 'ConvertFrom','excel');
SampleDates = datetime(T.('Sample dates'), 'ConvertFrom','excel');
T2 = table(DateTime, T.('Precip (mm)'), SampleDates, 'VariableNames',{'Date Time','Precip (mm)','Sample Dates'})
T2 = 1338×3 table
Date Time Precip (mm) Sample Dates
___________ ___________ ____________
01-Sep-2019 6.7 04-Nov-2019
02-Sep-2019 0 03-Dec-2019
03-Sep-2019 0 08-Jan-2020
04-Sep-2019 0.4 04-Feb-2020
05-Sep-2019 15.9 09-Mar-2020
06-Sep-2019 5 06-Apr-2020
07-Sep-2019 4.8 08-May-2020
08-Sep-2019 2 02-Jun-2020
09-Sep-2019 0.1 02-Jun-2020
10-Sep-2019 0 06-Jul-2020
11-Sep-2019 0 01-Sep-2020
12-Sep-2019 0 28-Sep-2020
13-Sep-2019 0 09-Nov-2020
14-Sep-2019 0.4 01-Dec-2020
15-Sep-2019 4.6 08-Jan-2021
16-Sep-2019 12.8 08-Feb-2021
T2 = rmmissing(T2); % Remove 'NaT' Rows
LastLine = T2(end,:)
LastLine = 1×3 table
Date Time Precip (mm) Sample Dates
___________ ___________ ____________
28-Sep-2019 0 07-Jun-2022
VN = T2.Properties.VariableNames;
figure
stairs(T2.('Date Time'), T2.('Precip (mm)'))
grid
xlabel(VN{3})
ylabel(VN{2})
TT = table2timetable(T2(:,[1 2]))
TT = 28×1 timetable
Date Time Precip (mm)
___________ ___________
01-Sep-2019 6.7
02-Sep-2019 0
03-Sep-2019 0
04-Sep-2019 0.4
05-Sep-2019 15.9
06-Sep-2019 5
07-Sep-2019 4.8
08-Sep-2019 2
09-Sep-2019 0.1
10-Sep-2019 0
11-Sep-2019 0
12-Sep-2019 0
13-Sep-2019 0
14-Sep-2019 0.4
15-Sep-2019 4.6
16-Sep-2019 12.8
TTrainfall = retime(TT, 'monthly','sum')
TTrainfall = timetable
Date Time Precip (mm)
___________ ___________
01-Sep-2019 80
% return % Stop Here
%
% A = table2array(T);
%
% %sequential dates from Sept 2019 to Dec 2022
% rain_dates = A(:,1);
%
% %daily precipitation in mm for each day
% rain_mm = A(:,2);
%
% %dates that environmental samples were taken
% sample_dates = A(:,3);
% sample_dates(any(isnan(sample_dates), 2), :) = [];
Since there is only one month, there is only one result, showing that for the data available, September 2019 had 80 mm of liquid precipitation.
.
Emily
on 6 Oct 2023
Edited: Emily
on 6 Oct 2023
I tried your code and everything is loaded correctely.
So, now I want to sum up precipitation from between the first two 'Sample Dates' in the list, which are:
'04-Nov-2019'
'03-Dec-2019'
'DateTime' has dates for every single day, this date range from the 'Sample Dates' is represented by
DateTime(65,1) through DateTime(94,1)
My goal is to sum up the precipitation corresponding to the date range.
The precip corresponding to the these ranges is represented as:
T.('Precip (mm)')(65,1) through T.('Precip (mm)')(94,1)
Then I want to repeat this process for all of the dates in the 'Sample Dates' list
Does this make sense?
Thank you for your continued help.
Emily
on 6 Oct 2023
I cannot use the retime function to resample at the 1 month interval, because what I am trying to calculate is not exactly the same as monthly precipitation. I need to calculate the precipitation between each sampling date which is never exactly 1 month, it varies each time. So I need to sum over a custom interval using the sampling dates as inputs.
Star Strider
on 6 Oct 2023
I do not understand the sampling dates. They make no sense with respect to the first column dates.
How are they supposed to be used?
Emily
on 6 Oct 2023
These are the sample dates:
'04-Nov-2019'
'03-Dec-2019'
'08-Jan-2020'
'04-Feb-2020'
'09-Mar-2020'
'06-Apr-2020'
'08-May-2020'
'02-Jun-2020'
'02-Jun-2020'
'06-Jul-2020'
'01-Sep-2020'
'28-Sep-2020'
'09-Nov-2020'
'01-Dec-2020'
'08-Jan-2021'
'08-Feb-2021'
'08-Mar-2021'
'09-Apr-2021'
'05-Jun-2021'
'08-Jul-2021'
'06-Sep-2021'
'01-Nov-2021'
'08-Dec-2021'
'06-Jan-2022'
'04-Mar-2022'
'01-Apr-2022'
'04-May-2022'
'07-Jun-2022'
They are all within the DateTime dates (daily dates from Sept. 2019 - Nov 2023)
The sample dates are needed to create the intervals over which I need to sum precipitation (measured daily).
For example I want to sum up daily precip from:
'04-Nov-2019' to '03-Dec-2019'
'03-Dec-2019' to '08-Jan-2020'
'08-Jan-2020' to '04-Feb-2020'
...
'04-May-2022' to '07-Jun-2022'
The reason I can't just use monthly precip is because the amount of time between these intervals is slightly different.
Hopefully this provides some clarity.
Emily
on 6 Oct 2023
In case its not clear, the daily precipitation corresponds to the DateTime dates. The Sampling Dates is the list of intervals over which to sum.
Star Strider
on 6 Oct 2023
The file contains only data for September 2019, and the first sampling interval is betweeen 4 November and 3 December 2019.
That makes absolutely no sense.
Emily
on 6 Oct 2023
I see the issue. Your T2 has truncated the dates. Just use this and you should be able to see that the file contains data from Sept 2019 - Nov 2023.
T = readtable("precip_test.csv", "VariableNamingRule","preserve")
DateTime = datetime(T.Date, 'ConvertFrom','excel');
SampleDates = datetime(T.('Sample dates'), 'ConvertFrom','excel');
Star Strider
on 6 Oct 2023
Edited: Star Strider
on 7 Oct 2023
The dates are truncated because several of the 'Sample Dates' column were listed as NaT.
Separating out 'Sample Dates' to a separate table permits logical indexing based on its elements.
Try this —
T = readtable("precip_test.csv", "VariableNamingRule","preserve")
T = 1338×3 table
Date Precip (mm) Sample dates
_____ ___________ ____________
43709 6.7 43773
43710 0 43802
43711 0 43838
43712 0.4 43865
43713 15.9 43899
43714 5 43927
43715 4.8 43959
43716 2 43984
43717 0.1 43984
43718 0 44018
43719 0 44075
43720 0 44102
43721 0 44144
43722 0.4 44166
43723 4.6 44204
43724 12.8 44235
DateTime = datetime(T.Date, 'ConvertFrom','excel');
SampleDates = datetime(T.('Sample dates'), 'ConvertFrom','excel');
T2 = table(DateTime, T.('Precip (mm)'), SampleDates, 'VariableNames',{'Date Time','Precip (mm)','Sample Dates'})
T2 = 1338×3 table
Date Time Precip (mm) Sample Dates
___________ ___________ ____________
01-Sep-2019 6.7 04-Nov-2019
02-Sep-2019 0 03-Dec-2019
03-Sep-2019 0 08-Jan-2020
04-Sep-2019 0.4 04-Feb-2020
05-Sep-2019 15.9 09-Mar-2020
06-Sep-2019 5 06-Apr-2020
07-Sep-2019 4.8 08-May-2020
08-Sep-2019 2 02-Jun-2020
09-Sep-2019 0.1 02-Jun-2020
10-Sep-2019 0 06-Jul-2020
11-Sep-2019 0 01-Sep-2020
12-Sep-2019 0 28-Sep-2020
13-Sep-2019 0 09-Nov-2020
14-Sep-2019 0.4 01-Dec-2020
15-Sep-2019 4.6 08-Jan-2021
16-Sep-2019 12.8 08-Feb-2021
VN = T2.Properties.VariableNames;
T3 = T2(:,3); % Separate 'Sample Dates'
T3 = rmmissing(T3)
T3 = 28×1 table
Sample Dates
____________
04-Nov-2019
03-Dec-2019
08-Jan-2020
04-Feb-2020
09-Mar-2020
06-Apr-2020
08-May-2020
02-Jun-2020
02-Jun-2020
06-Jul-2020
01-Sep-2020
28-Sep-2020
09-Nov-2020
01-Dec-2020
08-Jan-2021
08-Feb-2021
T2 = T2(:,[1 2])
T2 = 1338×2 table
Date Time Precip (mm)
___________ ___________
01-Sep-2019 6.7
02-Sep-2019 0
03-Sep-2019 0
04-Sep-2019 0.4
05-Sep-2019 15.9
06-Sep-2019 5
07-Sep-2019 4.8
08-Sep-2019 2
09-Sep-2019 0.1
10-Sep-2019 0
11-Sep-2019 0
12-Sep-2019 0
13-Sep-2019 0
14-Sep-2019 0.4
15-Sep-2019 4.6
16-Sep-2019 12.8
LastLine = T2(end,:)
LastLine = 1×2 table
Date Time Precip (mm)
___________ ___________
30-Nov-2023 0
PrecipStats = table('Size',[size(T3,1)-1,3], 'VariableTypes',{'datetime','datetime','double'});
for k = 1:size(T3,1)-1
Lv = T2{:,1} >= T3{k,1} & T2{:,1} < T3{k+1,1};
PrecipStats(k,:) = {T3{k,1}, T3{k+1,1}, sum(T2{Lv,2})};
end
PrecipStats.Properties.VariableNames = {'Start Date','End Date','Precip Total'}
PrecipStats = 27×3 table
Start Date End Date Precip Total
___________ ___________ ____________
04-Nov-2019 03-Dec-2019 48.6
03-Dec-2019 08-Jan-2020 103.6
08-Jan-2020 04-Feb-2020 119
04-Feb-2020 09-Mar-2020 47.9
09-Mar-2020 06-Apr-2020 44
06-Apr-2020 08-May-2020 71.1
08-May-2020 02-Jun-2020 56.8
02-Jun-2020 02-Jun-2020 0
02-Jun-2020 06-Jul-2020 99
06-Jul-2020 01-Sep-2020 115
01-Sep-2020 28-Sep-2020 73.1
28-Sep-2020 09-Nov-2020 97.1
09-Nov-2020 01-Dec-2020 66.6
01-Dec-2020 08-Jan-2021 155
08-Jan-2021 08-Feb-2021 68.2
08-Feb-2021 08-Mar-2021 65.4
LastLine = PrecipStats(end,:)
LastLine = 1×3 table
Start Date End Date Precip Total
___________ ___________ ____________
04-May-2022 07-Jun-2022 61.9
figure
stairs(T2.('Date Time'), T2.('Precip (mm)'), 'DisplayName',VN{2})
hold on
stairs(PrecipStats.('Start Date'), PrecipStats.('Precip Total'), 'DisplayName','Interim Totals (mm)')
hold off
grid
xlabel(VN{3})
ylabel(VN{2})
legend('Location','NW')
.
Emily
on 7 Oct 2023
Yes! this is it!
Can I ask a clarifying question so i better understand your approach ––
What is the purpose of this line of code?
Lv = T2{:,1} >= T3{k,1} & T2{:,1} < T3{k+1,1}
The for loop approach intuitively makes a lot of sense to me but I want to understand the nuances clearly so I could do it on my own
Thank you for your patience and help. much appreciated.
Star Strider
on 7 Oct 2023
My pleasure!
That line creates the logical vector, ‘Lv’. (I could have used find, to determine the numerical indices, and while that would work and in some situations is necessary, it is unnecessary here and inefficient to do the same operation, that being to find the rows in ‘T2’ that meet that specific condition.)
A somewhat simpler and more compact illustration —
T9 = array2table([1:10; randn(1,10)]')
T9 = 10×2 table
Var1 Var2
____ ________
1 -1.3207
2 -0.4068
3 -0.93179
4 -0.89652
5 -0.45122
6 0.54392
7 -0.61332
8 1.8858
9 1.3137
10 0.55674
Lv = T9{:,1} >= 3 & T9{:,1} < 7
Lv = 10×1 logical array
0
0
1
1
1
1
0
0
0
0
Out = T9(Lv,:)
Out = 4×2 table
Var1 Var2
____ ________
3 -0.93179
4 -0.89652
5 -0.45122
6 0.54392
That’s essentially all there is to it.
.
Emily
on 7 Oct 2023
thank you! this makes a lot of sense and after reading through the documentation and looking at the examples I feel like i actually understand it, which is awesome. much appreciated :)
More Answers (0)
See Also
Tags
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!An Error Occurred
Unable to complete the action because of changes made to the page. Reload the page to see its updated state.
Select a Web Site
Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .
You can also select a web site from the following list
How to Get Best Site Performance
Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.
Americas
- América Latina (Español)
- Canada (English)
- United States (English)
Europe
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)
Asia Pacific
- Australia (English)
- India (English)
- New Zealand (English)
- 中国
- 日本Japanese (日本語)
- 한국Korean (한국어)