grouping/binning by time

11 views (last 30 days)
Ketan Bharucha
Ketan Bharucha on 1 Jun 2023
Have the following table:
Time address
'14:49:56.421875000' 'ccccd9'
'14:49:56.421875000' 'cccccb'
'14:49:56.429687500' 'cccce0'
'14:49:56.421875000' 'ccccd8'
'14:49:56.421875000' 'ccccc5'
'14:49:56.421875000' 'cccccd'
'14:49:56.421875000' 'cccccc'
'14:49:56.421875000' 'ccccde'
'14:49:56.421875000' 'ccccd5'
'14:49:56.421875000' 'ccccd0'
'14:49:56.421875000' 'ccccdd'
'14:49:56.421875000' 'ccccd3'
'14:49:56.421875000' 'ccccd2'
'14:49:56.437500000' 'cccd03'
'14:49:56.429687500' 'ccccf3'
'14:49:56.429687500' 'ccccf6'
'14:49:56.429687500' 'cccce5'
'14:49:56.437500000' 'cccd0d'
'14:49:56.437500000' 'cccd18'
'14:49:56.437500000' 'cccd15'
'14:49:56.437500000' 'cccd11'
'14:49:56.437500000' 'cccd07'
'14:49:56.437500000' 'cccd0f'
'14:49:56.437500000' 'cccd10'
'14:49:56.437500000' 'cccd09'
'14:49:56.437500000' 'cccd0c'
'14:49:56.437500000' 'cccd0e'
'14:49:56.437500000' 'cccd12'
'14:49:56.437500000' 'cccd17'
'14:49:56.437500000' 'cccd0b'
'14:49:56.437500000' 'cccd0a'
'14:49:56.437500000' 'cccd16'
'14:49:56.437500000' 'cccd08'
'14:49:56.437500000' 'cccd13'
How can I bin and return number of 'address' in a given second?
thanks for your help
KB

Accepted Answer

dpb
dpb on 1 Jun 2023
data={
'14:49:56.421875000' 'ccccd9'
'14:49:56.421875000' 'cccccb'
'14:49:56.429687500' 'cccce0'
'14:49:56.421875000' 'ccccd8'
'14:49:56.421875000' 'ccccc5'
'14:49:56.421875000' 'cccccd'
'14:49:56.421875000' 'cccccc'
'14:49:56.421875000' 'ccccde'
'14:49:56.421875000' 'ccccd5'
'14:49:56.421875000' 'ccccd0'
'14:49:56.421875000' 'ccccdd'
'14:49:56.421875000' 'ccccd3'
'14:49:56.421875000' 'ccccd2'
'14:49:56.437500000' 'cccd03'
'14:49:56.429687500' 'ccccf3'
'14:49:56.429687500' 'ccccf6'
'14:49:56.429687500' 'cccce5'
'14:49:56.437500000' 'cccd0d'
'14:49:56.437500000' 'cccd18'
'14:49:56.437500000' 'cccd15'
'14:49:56.437500000' 'cccd11'
'14:49:56.437500000' 'cccd07'
'14:49:56.437500000' 'cccd0f'
'14:49:56.437500000' 'cccd10'
'14:49:56.437500000' 'cccd09'
'14:49:56.437500000' 'cccd0c'
'14:49:56.437500000' 'cccd0e'
'14:49:56.437500000' 'cccd12'
'14:49:56.437500000' 'cccd17'
'14:49:56.437500000' 'cccd0b'
'14:49:56.437500000' 'cccd0a'
'14:49:56.437500000' 'cccd16'
'14:49:56.437500000' 'cccd08'
'14:49:56.437500000' 'cccd13'};
Time=duration(data(:,1),'InputFormat','hh:mm:ss.SSSSSSSSS','Format','hh:mm:ss.SSSSSS');
Address=categorical(data(:,2));
tT=table(Time,Address);
[h,m,s]=hms(tT.Time);
tT=addvars(tT,duration(h,m,s),'NewVariableNames',{'Second'},'After','Time');
head(tT)
Time Second Address _______________ ________ _______ 14:49:56.421875 14:49:56 ccccd9 14:49:56.421875 14:49:56 cccccb 14:49:56.429687 14:49:56 cccce0 14:49:56.421875 14:49:56 ccccd8 14:49:56.421875 14:49:56 ccccc5 14:49:56.421875 14:49:56 cccccd 14:49:56.421875 14:49:56 cccccc 14:49:56.421875 14:49:56 ccccde
groupsummary(tT,{'Second','Address'})
ans = 34×3 table
Second Address GroupCount ________ _______ __________ 14:49:56 ccccc5 1 14:49:56 cccccb 1 14:49:56 cccccc 1 14:49:56 cccccd 1 14:49:56 ccccd0 1 14:49:56 ccccd2 1 14:49:56 ccccd3 1 14:49:56 ccccd5 1 14:49:56 ccccd8 1 14:49:56 ccccd9 1 14:49:56 ccccdd 1 14:49:56 ccccde 1 14:49:56 cccce0 1 14:49:56 cccce5 1 14:49:56 ccccf3 1 14:49:56 ccccf6 1
numel(unique(tT.Address))==height(tT)
ans = logical
1
So your example dataset isn't very interesting as no address is repeated -- of course, it doesn't cover more than one second, either...
  4 Comments
dpb
dpb on 1 Jun 2023
You imported it as a datenum, from the default date added to the time. Multiple choices at this point, you can either fix the import step to read as duration for the above to work as written; I assumed from your initial post you had a string variable.
Alternatively, two other ways to go -- if the bogus date doesn't offend your sensibilites, then use
T2=renamevars(T2,{'Var1','Var2'},{'Date','Address'}); % get us some useful names
T2=addvars(T2,dateshift(T2.Date),'start','second'),'NewVariableNames',{'Second'},'After','Date');
groupsummary(T2,{'Second','Address'})
and you don't need the specific duration; I didn't do this originally because figured didn't want the date around. But, other than visual, it won't really hurt anything and if you set the .Format property to only show the time section, it'll be transparent.
Alternatively, you can convert to a duration by using
T2.Date=timeofday(T2.Date);
and continue on from there; Date is then a duration. It would be simpler this route to do the dateshift first, then timeofday() on it to have the rounded values to group by.
Oh! So many ways to skin the proverbial quadraped...
Ketan Bharucha
Ketan Bharucha on 5 Jun 2023
Thank you. That worked.

Sign in to comment.

More Answers (1)

Siddharth Bhutiya
Siddharth Bhutiya on 6 Jun 2023
If you read this in as a timetable, you can simply use retime to count the number of addresses in a given second. Starting off with dpb's code above.
data={
'14:49:56.421875000' 'ccccd9'
'14:49:56.421875000' 'cccccb'
'14:49:56.429687500' 'cccce0'
'14:49:56.421875000' 'ccccd8'
'14:49:56.421875000' 'ccccc5'
'14:49:56.421875000' 'cccccd'
'14:49:56.421875000' 'cccccc'
'14:49:56.421875000' 'ccccde'
'14:49:56.421875000' 'ccccd5'
'14:49:56.421875000' 'ccccd0'
'14:49:56.421875000' 'ccccdd'
'14:49:56.421875000' 'ccccd3'
'14:49:56.421875000' 'ccccd2'
'14:49:56.437500000' 'cccd03'
'14:49:56.429687500' 'ccccf3'
'14:49:56.429687500' 'ccccf6'
'14:49:56.429687500' 'cccce5'
'14:49:56.437500000' 'cccd0d'
'14:49:56.437500000' 'cccd18'
'14:49:56.437500000' 'cccd15'
'14:49:56.437500000' 'cccd11'
'14:49:56.437500000' 'cccd07'
'14:49:56.437500000' 'cccd0f'
'14:49:56.437500000' 'cccd10'
'14:49:56.437500000' 'cccd09'
'14:49:56.437500000' 'cccd0c'
'14:49:56.437500000' 'cccd0e'
'14:49:56.437500000' 'cccd12'
'14:49:56.437500000' 'cccd17'
'14:49:56.437500000' 'cccd0b'
'14:49:56.437500000' 'cccd0a'
'14:49:56.437500000' 'cccd16'
'14:49:56.437500000' 'cccd08'
'14:49:56.437500000' 'cccd13'};
Time=duration(data(:,1),'InputFormat','hh:mm:ss.SSSSSSSSS','Format','hh:mm:ss.SSSSSS');
Address=categorical(data(:,2));
From this point create a timetable and then call retime on it to resample it to seconds and count the number of entries in each bin
tt = timetable(Time,Address)
tt = 34×1 timetable
Time Address _______________ _______ 14:49:56.421875 ccccd9 14:49:56.421875 cccccb 14:49:56.429687 cccce0 14:49:56.421875 ccccd8 14:49:56.421875 ccccc5 14:49:56.421875 cccccd 14:49:56.421875 cccccc 14:49:56.421875 ccccde 14:49:56.421875 ccccd5 14:49:56.421875 ccccd0 14:49:56.421875 ccccdd 14:49:56.421875 ccccd3 14:49:56.421875 ccccd2 14:49:56.437500 cccd03 14:49:56.429687 ccccf3 14:49:56.429687 ccccf6
retime(tt,"secondly","count")
ans = timetable
Time Address _______________ _______ 14:49:56.000000 34

Tags

Products


Release

R2020b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!