Find unique groups in table with members both before and after date

1 view (last 30 days)
I have a table with strings in cells and a datetime. I need to find the unique groups of entries that occur before and after a particular date- June 1, 2022 in this case.
Here is my example table:
T=table({'Joe','Joe','Joe','Joe','Steve','Steve','Mary','Mary','Susan','Susan'}',...
{'2022-02-03','2021-12-23','2022-08-01','2022-07-15','2021-02-08',...
'2022-04-03','2022-09-03','2022-02-03','2022-08-13','2022-04-23'}',...
{'Car','Van','Car','Van','Car','Car','Truck','Truck','Car','Van'}',...
{'B213','G344','A342','G6767','N1356','A34','M235','A4567','G2345','A23'}',...
'VariableNames',{'Name' 'Date' 'Type' 'Data'}); % my data comes as cells from a database
T.Date=datetime(T.Date,'InputFormat','yyyy-MM-dd'); % Converting to datetime
T = convertvars(T,@iscell,'categorical'); % Convert to categorical - preferred to cells?
T
T = 10×4 table
Name Date Type Data _____ ___________ _____ _____ Joe 03-Feb-2022 Car B213 Joe 23-Dec-2021 Van G344 Joe 01-Aug-2022 Car A342 Joe 15-Jul-2022 Van G6767 Steve 08-Feb-2021 Car N1356 Steve 03-Apr-2022 Car A34 Mary 03-Sep-2022 Truck M235 Mary 03-Feb-2022 Truck A4567 Susan 13-Aug-2022 Car G2345 Susan 23-Apr-2022 Van A23
Now I want to find the groups of matching Name and Type where at least one occurs after 01-Jun-2022 and at least on occurred before that date. I also need to carry the Data field with me. So the solution would be:
T_ans=T;
T_ans([5:6 9:10],:)=[];
sortrows(T_ans,[1 3])
ans = 6×4 table
Name Date Type Data ____ ___________ _____ _____ Joe 03-Feb-2022 Car B213 Joe 01-Aug-2022 Car A342 Joe 23-Dec-2021 Van G344 Joe 15-Jul-2022 Van G6767 Mary 03-Sep-2022 Truck M235 Mary 03-Feb-2022 Truck A4567
With Steve elimanted becsue his matching Type dates were all before 01-Jun-2022, and Susan eliminated because she did not have 2 or more matching Types.
I'm not sure how to proceed- I am thinking of looping through all the unique 'Name' and 'Type' groups and seeing if there is one before and one after 01-Jun-2022, then listing all memebers if true.
I can;t figure out what to do after this though:
[a,b,c]=unique([T.Name,T.Type],'rows');
a
a = 6×2 categorical array
Joe Car Joe Van Mary Truck Steve Car Susan Car Susan Van
Thanks for any help.

Accepted Answer

Chunru
Chunru on 20 Sep 2022
Edited: Chunru on 20 Sep 2022
T=table({'Joe','Joe','Joe','Joe','Steve','Steve','Mary','Mary','Susan','Susan'}',...
{'2022-02-03','2021-12-23','2022-08-01','2022-07-15','2021-02-08',...
'2022-04-03','2022-09-03','2022-02-03','2022-08-13','2022-04-23'}',...
{'Car','Van','Car','Van','Car','Car','Truck','Truck','Car','Van'}',...
{'B213','G344','A342','G6767','N1356','A34','M235','A4567','G2345','A23'}',...
'VariableNames',{'Name' 'Date' 'Type' 'Data'}); % my data comes as cells from a database
T.Date=datetime(T.Date,'InputFormat','yyyy-MM-dd'); % Converting to datetime
T = convertvars(T,@iscell,'categorical'); % Convert to categorical - preferred to cells?
T
T = 10×4 table
Name Date Type Data _____ ___________ _____ _____ Joe 03-Feb-2022 Car B213 Joe 23-Dec-2021 Van G344 Joe 01-Aug-2022 Car A342 Joe 15-Jul-2022 Van G6767 Steve 08-Feb-2021 Car N1356 Steve 03-Apr-2022 Car A34 Mary 03-Sep-2022 Truck M235 Mary 03-Feb-2022 Truck A4567 Susan 13-Aug-2022 Car G2345 Susan 23-Apr-2022 Van A23
uName = unique(T.Name);
uType = unique(T.Type);
idx = [];
for i=1:length(uName)
for j=1:length(uType)
T1 = T(T.Name == uName(i) & T.Type == uType(j), :);
if (sum(T1.Date > datetime("01-Jun-2022")) >= 1) && (sum(T1.Date < datetime("01-Jun-2022")) >=1)
idx =[idx; [i j]];
end
end
end
idx
idx = 3×2
1 1 1 3 2 2
Tout = [];
for i=1:size(idx, 1)
Tout = [Tout;
T((T.Name==uName(idx(i,1))) & (T.Type==uType(idx(i,2))), :)];
end
Tout
Tout = 6×4 table
Name Date Type Data ____ ___________ _____ _____ Joe 03-Feb-2022 Car B213 Joe 01-Aug-2022 Car A342 Joe 23-Dec-2021 Van G344 Joe 15-Jul-2022 Van G6767 Mary 03-Sep-2022 Truck M235 Mary 03-Feb-2022 Truck A4567
  3 Comments
Marcus Glover
Marcus Glover on 20 Sep 2022
Thanks!
In case anyone ever reads this with a similar issue, I did come up with a method using ismember, but this solution is much better.

Sign in to comment.

More Answers (1)

Lola Davidson
Lola Davidson on 3 Jun 2024
The groupfilter function was introduced in R2019b to simplify these types of workflows. Just specify which variables define the groups, which variable to send through a filter, and specify a filter function. Using groupfilter, you can do this in one function call.
myfilter = @(t)(sum(t > datetime("01-Jun-2022")) >= 1) && (sum(t < datetime("01-Jun-2022")) >=1);
T2 = groupfilter(T,["Name" "Type"],myfilter,"Date")

Categories

Find more on Tables in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!