How to group data within a column by specific text within that column

I have a dataset of about 260,000 data points. One of the columns, "species_name", contains various species names. How can I group the data by specific species names, and thereby group the data in the other columns (size, for example) by species as well?
  2 Comments
Adam Danz on 7 Feb 2021
Are you just trying to index the table?
load fisheriris
T = table(categorical(species), meas(:,1),meas(:,2),meas(:,3),meas(:,4));
T.Properties.VariableNames{1} = 'Species'
T = 150x5 table
Species    Var2    Var3    Var4    Var5
_______    ____    ____    ____    ____
setosa     5.1     3.5     1.4     0.2
setosa     4.9     3       1.4     0.2
setosa     4.7     3.2     1.3     0.2
setosa     4.6     3.1     1.5     0.2
setosa     5       3.6     1.4     0.2
setosa     5.4     3.9     1.7     0.4
setosa     4.6     3.4     1.4     0.3
setosa     5       3.4     1.5     0.2
setosa     4.4     2.9     1.4     0.2
setosa     4.9     3.1     1.5     0.1
setosa     5.4     3.7     1.5     0.2
setosa     4.8     3.4     1.6     0.2
setosa     4.8     3       1.4     0.1
setosa     4.3     3       1.1     0.1
setosa     5.8     4       1.2     0.2
setosa     5.7     4.4     1.5     0.4
T(T.Species=='virginica',:)
ans = 50x5 table
Species      Var2    Var3    Var4    Var5
_________    ____    ____    ____    ____
virginica    6.3     3.3     6       2.5
virginica    5.8     2.7     5.1     1.9
virginica    7.1     3       5.9     2.1
virginica    6.3     2.9     5.6     1.8
virginica    6.5     3       5.8     2.2
virginica    7.6     3       6.6     2.1
virginica    4.9     2.5     4.5     1.7
virginica    7.3     2.9     6.3     1.8
virginica    6.7     2.5     5.8     1.8
virginica    7.2     3.6     6.1     2.5
virginica    6.5     3.2     5.1     2
virginica    6.4     2.7     5.3     1.9
virginica    6.8     3       5.5     2.1
virginica    5.7     2.5     5       2
virginica    5.8     2.8     5.1     2.4
virginica    6.4     3.2     5.3     2.3
Candice Cooper on 4 Mar 2021
I would accept this answer, but it's a comment. This is essentially what I was trying to do.
ind = find(T.Species=='virginica'), to use your example.
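Applied to the layout described in the question, the same indexing might look roughly like the sketch below. The data values are made up, and the variable names 'species_name' and 'Size' are assumptions based on the question's description rather than the real dataset.

```matlab
% Hypothetical data; the column names are assumptions based on the question
T = table(categorical({'star';'bat';'crab';'star';'bat'}), [3.1; 1.2; 4.5; 2.9; 1.4], ...
    'VariableNames', {'species_name','Size'});

isStar    = T.species_name == 'star';  % logical mask, one element per row
starRows  = T(isStar, :);              % sub-table containing only the 'star' rows
starSizes = T.Size(isStar);            % sizes for those rows as a numeric column
ind       = find(isStar);              % explicit row numbers, if they are needed
```

The logical mask can usually be used directly as an index, so the find call is only needed when the row numbers themselves are wanted.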


Answers (2)

dpb on 6 Feb 2021
A sample dataset always helps, but it would probably be good to convert species to a categorical variable first (although that is not mandatory).
Then use grouping variables -- see
doc findgroups
doc splitapply
if keeping the data in arrays, or look at
doc rowfun
for a table or timetable.
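For the array route, a minimal sketch of findgroups with splitapply could look like this. The arrays species and sz are hypothetical stand-ins for the species and size columns, not the poster's data.

```matlab
% Hypothetical arrays standing in for the species and size columns
species = categorical({'star';'bat';'crab';'star';'bat';'crab'});
sz      = [2; 4; 6; 4; 8; 10];

[G, groupNames] = findgroups(species);   % G holds a group number for each row
groupMeans = splitapply(@mean, sz, G);   % one mean per group, in groupNames order
result = table(groupNames, groupMeans);  % pair each group name with its mean
```

Any function handle that reduces a group to a single row can take the place of @mean here.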
  2 Comments
Candice Cooper on 6 Feb 2021
I tried reading through those and attempting a few things before posting this question, but I can't seem to figure it out. As an example, I have a column 'species_name', and within that column 'star', 'bat', and 'crab' are randomly dispersed. I then have another column, 'size', that corresponds to each of those rows. I'm trying to single out, let's say, 'star' as its own separate column, with the sizes that correspond to those rows in another column.
dpb on 7 Feb 2021
Well, w/o something to work with, it's harder to guess...attach the table or .mat file with the data, or a short text listing of enough to illustrate.
Then, give us a precise definition of the problem to be solved.
Also, show us what you have tried and where you had a problem.
As I've pointed out in several related questions recently, you rarely need to actually separate the data into separate arrays; instead of duplicating data you already have, use grouping variables and process as wanted.



dpb on 7 Feb 2021
Illustration with faked data...
tmp=categorical({'star','bat','crab'}); % the categorical variable categories
t=table(tmp(randi(3,[20,1])).',randn(20,1),'VariableNames',{'Species','Size'}); % make up some data
>> head(t) % show what first little bit looks like...
ans =
8×2 table
Species Size
_______ ________
bat -0.65863
crab -1.2834
crab 0.23872
bat 1.5475
star 0.1869
star -1.8809
crab 0.40569
bat 0.64618
>> summary(t) % summary statistics on the table
Variables:
Species: 20×1 categorical
Values:
bat 6
crab 9
star 5
Size: 20×1 double
Values:
Min -1.8809
Median 0.21281
Max 1.5967
>> rowfun(@mean,t,'GroupingVariables','Species', ...
'InputVariables','Size','OutputVariableNames','GroupMean') % group means
ans =
3×3 table
Species GroupCount GroupMean
_______ __________ _________
bat 6 0.42427
crab 9 0.10477
star 5 -0.46693
>>
Can do whatever wanted...
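As one further illustration of processing by group without splitting the table, groupsummary (available from R2018a) can return several statistics per group in one call. This is only a sketch on a small hypothetical table that reuses the variable names Species and Size; it is not part of the original answer's code.

```matlab
% Hypothetical table using the variable names Species and Size
t = table(categorical({'star';'bat';'crab';'star';'bat';'crab'}), [2; 4; 6; 4; 8; 10], ...
    'VariableNames', {'Species','Size'});

% Mean, minimum and maximum of Size for each species in a single call
stats = groupsummary(t, 'Species', {'mean','min','max'}, 'Size');
```

The result has one row per species, with a GroupCount variable and one variable per statistic (mean_Size, min_Size, max_Size).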
