How to conditionally and by groups subtract one row from another in table?
8 views (last 30 days)
Show older comments
Hi
I want to add new rows in the table below. The new rows should be equal to 'Total energy supply' minus 'Total final consumption' for each country in the table.
Normally, I would use varfun() to find the sum of groups. But I can't figure out how to apply varfun() conditionally, or what do to when substracting instead of adding two rows.
Here is some code that reproduces the table:
country = categorical(["USA";"USA";"USA"; "Canada"; "Canada"; "Canada"]);
sector = categorical(["supply"; "consumption"; "other";"supply"; "consumption"; "other"]);
coal = [1;2;3;4;5;6];
oil = [1;2;3;4;5;6];
T = table(country, sector, coal, oil)
Any help is much appreciated.
Thanks.
2 Comments
Accepted Answer
Jon
on 4 May 2022
Is this what you want:
country = categorical(["USA";"USA";"USA"; "Canada"; "Canada"; "Canada"]);
sector = categorical(["supply"; "consumption"; "other";"supply"; "consumption"; "other"]);
coal = [1;2;3;4;5;6];
oil = [1;2;3;4;5;6];
T = table(country, sector, coal, oil)
% Find the available countries
countries = unique(T.country)
% Loop through the available countries computing the net consumption for
% each
endRow = size(T,1); % current ending row number
for k = 1:numel(countries)
% sum the supplies and consumption
idl = T.sector == "supply" & T.country == countries(k);
totalSupply = sum(T{idl,3:end},1);
idl = T.sector == "consumption" & T.country == countries(k);
totalConsumption = sum(T{idl,3:end},1);
% compute net
net = totalSupply -totalConsumption;
% and add row
endRow = endRow + 1;
newRow = T(find(idl),:); % base on any of the current matching rows
newRow.sector = "net";
newRow{1,3:end} = net;
T(endRow,:) = newRow;
end
T =
8×4 table
country sector coal oil
_______ ___________ ____ ___
USA supply 1 1
USA consumption 2 2
USA other 3 3
Canada supply 4 4
Canada consumption 5 5
Canada other 6 6
Canada net -1 -1
USA net -1 -1
2 Comments
Jon
on 4 May 2022
Glad it is working. I don't have any experience using categorical variables. So, maybe there is some other more elegant way to do this using their special functionality, but at least this is one approach.
More Answers (1)
Cris LaPierre
on 4 May 2022
Edited: Cris LaPierre
on 4 May 2022
I think I would calculate the results to a new table using groupsummary, then concatenate the tables. This is a little longer than I was hoping, but figured I'd address the smaller issues. Once it's automated, it doesn't really matter.
country = categorical(["USA";"USA";"USA"; "Canada"; "Canada"; "Canada"]);
sector = categorical(["supply"; "consumption"; "other";"supply"; "consumption"; "other"]);
coal = [1;2;3;4;5;6];
oil = [1;2;3;4;5;6];
T = table(country, sector, coal, oil)
% Change consumption values to negative
newT = T;
ind = newT.sector == "consumption";
newT.coal(ind) = -newT.coal(ind);
newT.oil(ind) = -newT.oil(ind)
% Remove 'other'
ind = newT.sector == "other";
newT(ind,:) = [];
% calculate net supply by country
net = groupsummary(newT,"country",'sum',["coal","oil"])
% format table
net = removevars(net,"GroupCount");
net.sector(:) = categorical("net supply");
net.Properties.VariableNames(2:3) = ["coal","oil"]
% combine tables
T = [T;net];
T = sortrows(T,"country")
2 Comments
Jon
on 4 May 2022
@Cris LaPierre I liked learning from your example how groupsummary could be applied for this kind of problem. I can see that functionality can be quite powerful. Thanks!
See Also
Categories
Find more on Data Type Conversion in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!