Filter a table by 2 columns

141 views (last 30 days)
Orongo
Orongo on 19 Feb 2019
Commented: Peter Perkins on 8 Nov 2021 at 21:02
Hi, I'm using 'patients' (load patients) to learn more about tables. I'm trying to filter the table by gender and smoker status by following
load patients
T = table(LastName,Age,Smoker,Gender)
Tsmokermale=(T.Smoker==true| T.Gender=='Male');
but get the error
Undefined operator '==' for input arguments of type 'cell'.
How can I filter the table by male and smoker, and still keep the information of LastName and Age?
  2 Comments
Orongo
Orongo on 27 Feb 2019
T = table(LastName,Age,Smoker,Gender)

Sign in to comment.

Answers (3)

Peter Perkins
Peter Perkins on 11 Mar 2019
Orongo, do this:
T.Gender = categorical(T.Gender);
Then this
Tsmokermale=(T.Smoker==true| T.Gender=='Male');
works as you expect, and what you are really asking about is this
T(Tsmokermale,:)
Long story short: Gender in the mat file is what's called a cell array of char vectors (a.k.a. a "cellstr"). Those don't support testing using ==, thus KSSV's suggestion of strcmp. But if you convert them to categorical, you can use ==. You could also convert to string, but categorical is more appropriate in this case.

KSSV
KSSV on 19 Feb 2019
idx = strcmp(T.Gender,'Male') ;
  1 Comment
Orongo
Orongo on 27 Feb 2019
This works fine for filtering the gender, just what I want. However, I'm missing the information about the row data. So, in my table below with heading {LastName, Age, Smoker, Gender} I get answer 1s and 0s with the command, but I want the last name, age and gender to filter through too.
'Smith' 38 1 'Male'
'Johnson' 43 0 'Male'
'Williams' 38 0 'Female'
'Jones' 40 0 'Female'
'Brown' 49 0 'Female'
'Davis' 46 0 'Female'
'Miller' 33 1 'Female'
'Wilson' 40 0 'Male'
'Moore' 28 0 'Male'
'Taylor' 31 0 'Female'
'Anderson' 45 0 'Female'
'Thomas' 42 0 'Female'
'Jackson' 25 0 'Male'
'White' 39 1 'Male'
'Harris' 36 0 'Female'
'Martin' 48 1 'Male'
'Thompson' 32 1 'Male'
'Garcia' 27 1 'Female'
'Martinez' 37 0 'Male'
'Robinson' 50 0 'Male'
'Clark' 48 0 'Female'
'Rodriguez' 39 0 'Female'
'Lewis' 41 0 'Female'
'Lee' 44 1 'Female'
'Walker' 28 1 'Female'
'Hall' 25 0 'Male'
'Allen' 39 0 'Female'
'Young' 25 0 'Female'
'Hernandez' 36 0 'Male'
'King' 30 1 'Male'
'Wright' 45 1 'Female'
'Lopez' 40 0 'Female'
'Hill' 25 0 'Female'
'Scott' 47 0 'Male'
'Green' 44 0 'Male'
'Adams' 48 0 'Female'
'Baker' 44 1 'Male'
'Gonzalez' 35 0 'Female'
'Nelson' 33 1 'Male'
'Carter' 38 0 'Female'
'Mitchell' 39 1 'Male'
'Perez' 44 0 'Male'
'Roberts' 44 1 'Male'
'Turner' 37 1 'Male'
'Phillips' 45 0 'Male'
'Campbell' 37 0 'Female'
'Parker' 30 0 'Male'
'Evans' 39 0 'Female'
'Edwards' 42 0 'Male'
'Collins' 42 1 'Male'
'Stewart' 49 1 'Male'
'Sanchez' 44 1 'Female'
'Morris' 43 1 'Female'
'Rogers' 47 0 'Female'
'Reed' 50 1 'Male'
'Cook' 38 0 'Female'
'Morgan' 41 0 'Female'
'Bell' 45 1 'Male'
'Murphy' 36 0 'Male'
'Bailey' 38 0 'Female'
'Rivera' 29 0 'Female'
'Cooper' 28 0 'Female'
'Richardson' 30 0 'Female'
'Cox' 28 0 'Female'
'Howard' 29 0 'Female'
'Ward' 36 0 'Male'
'Torres' 45 0 'Female'
'Peterson' 32 0 'Female'
'Gray' 31 0 'Female'
'Ramirez' 48 1 'Female'
'James' 25 0 'Male'
'Watson' 40 1 'Female'
'Brooks' 39 0 'Male'
'Kelly' 41 0 'Female'
'Sanders' 33 1 'Female'
'Price' 31 1 'Male'
'Bennett' 35 0 'Female'
'Wood' 32 0 'Male'
'Barnes' 42 0 'Male'
'Ross' 48 0 'Female'
'Henderson' 34 0 'Male'
'Coleman' 39 0 'Male'
'Jenkins' 28 1 'Male'
'Perry' 29 0 'Female'
'Powell' 32 0 'Female'
'Long' 39 1 'Male'
'Patterson' 37 1 'Female'
'Hughes' 49 1 'Female'
'Flores' 31 1 'Female'
'Washington' 37 0 'Female'
'Butler' 38 1 'Male'
'Simmons' 45 0 'Male'
'Foster' 30 0 'Female'
'Gonzales' 48 0 'Male'
'Bryant' 48 0 'Female'
'Alexander' 25 1 'Male'
'Russell' 44 1 'Male'
'Griffin' 49 0 'Male'
'Diaz' 45 1 'Male'
'Hayes' 48 0 'Male'

Sign in to comment.


VBBV
VBBV on 25 Sep 2021
Edited: VBBV on 25 Sep 2021
load patients
T = table(LastName,Age,Smoker,Gender)
T = 100×4 table
LastName Age Smoker Gender ____________ ___ ______ __________ {'Smith' } 38 true {'Male' } {'Johnson' } 43 false {'Male' } {'Williams'} 38 false {'Female'} {'Jones' } 40 false {'Female'} {'Brown' } 49 false {'Female'} {'Davis' } 46 false {'Female'} {'Miller' } 33 true {'Female'} {'Wilson' } 40 false {'Male' } {'Moore' } 28 false {'Male' } {'Taylor' } 31 false {'Female'} {'Anderson'} 45 false {'Female'} {'Thomas' } 42 false {'Female'} {'Jackson' } 25 false {'Male' } {'White' } 39 true {'Male' } {'Harris' } 36 false {'Female'} {'Martin' } 48 true {'Male' }
Tsmokermale=(T.Smoker==1 & T.Gender=="Male")
Tsmokermale = 100×1 logical array
1 0 0 0 0 0 0 0 0 0
T(Tsmokermale,:)
ans = 21×4 table
LastName Age Smoker Gender ____________ ___ ______ ________ {'Smith' } 38 true {'Male'} {'White' } 39 true {'Male'} {'Martin' } 48 true {'Male'} {'Thompson'} 32 true {'Male'} {'King' } 30 true {'Male'} {'Baker' } 44 true {'Male'} {'Nelson' } 33 true {'Male'} {'Mitchell'} 39 true {'Male'} {'Roberts' } 44 true {'Male'} {'Turner' } 37 true {'Male'} {'Collins' } 42 true {'Male'} {'Stewart' } 49 true {'Male'} {'Reed' } 50 true {'Male'} {'Bell' } 45 true {'Male'} {'Price' } 31 true {'Male'} {'Jenkins' } 28 true {'Male'}
Try something like this
  1 Comment
Peter Perkins
Peter Perkins on 8 Nov 2021 at 21:02
VBBV, you are pointing out that by comparing the cellstr T.Gender to the scalar string value "Male", the latter actually converts the cellstr to a string array, thus suporting ==.
Clever!
I would still argue though, that abandoning cellstrs in the data and replacing T.Gender etc. with their string vector equivalents is a better way to go. The patients dataset is quite old.

Sign in to comment.

Tags

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!