How to separate table columns by groups?

If I have a table of data (T1) such as:
Var1 Var2
1 xyz
1 xyy
2 xxx
2 xzx
3 xyy
4 yzy
4 xzz
4 yzz
I would like to output another table (T2):
Var1 Var2
1 xyz,xyy
2 xxx,xzx
3 xyy
4 yzy,xzz,yzz
I have unsuccessfully used unstack, since it creates the values in Var2 as column names. Any suggestions would be much appreciated.

3 Comments

Is Var2 numeric?
No it is not. It is categorical of the form 'xyz'.
Var 1 is the output of a findgroups command, Var2 is categorical. Such as if Var1 was the output of a findgroups command on an ID variable, and Var2 was possessions (i.e person number one has a car and a house, while person number 4 may have a car, a house, and a computer)

Sign in to comment.

 Accepted Answer

If Var2 is a categorical array, then I assume that your xyz,xyy is actually a 1x2 categorical array [xyz, xyy], not a char array. If so:
T1 = table([1;1;2;2;3;4;4;4], categorical({'xyz';'xyy';'xxx';'xzx';'xyy';'yzy';'xzz';'yzz'}))
varfun(@(values) {values'}, T1, 'GroupingVariables', 'Var1')

3 Comments

Thank you! The table outputted is correct, however it says 1x11 categorical instead of displaying the values. Is there any way to fix this?
No, that's the built-in disp of a table, you can't change that. tables are not really designed to hold arrays in columns. It's more designed to have scalar values in each column.
Guillaume, you are correct that it's sometimes difficult to display multi-column variables in a table, but tables absolutely are designed to support them (and in fact it's the reason why "variables" in tables are called "variables", not "columns" in the doc). The display works as you would expect in cases with only a few columns in each variable.
>> table(rand(3,2),rand(3,5))
ans =
3×2 table
Var1 Var2
__________________ _______________________________________________________
0.15761 0.48538 0.42176 0.95949 0.84913 0.75774 0.65548
0.97059 0.80028 0.91574 0.65574 0.93399 0.74313 0.17119
0.95717 0.14189 0.79221 0.035712 0.67874 0.39223 0.70605
Actually, in your solution, the second variable in the result from varfun is in fact a cell array, each cell containing a categorical vector. There's nothing at all wrong with that, it's completely supported by tables, and in fact if the different groups have different numbers of rows in the original table it's pretty much necessary. But you are right that the display is not as informative as it could be in some cases.

Sign in to comment.

More Answers (0)

Categories

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!