How to assign numbers to categorical values in a dataset?

I'm preparing a dataset for machine learning. The dataset contains a column named "Holiday" with more than a million rows. It is categorical and contains 4 unique values: 0 (as a string), a, b, and c.
I want to map 0 to 0 and map the rest (a, b, and c) to 1. How do I do that? Is there a ready-made function?

Accepted Answer

Adam Danz on 18 May 2020
Edited: Adam Danz on 18 May 2020
If you want to return logical values,
dummyVars = Holiday ~= '0'; % Holiday is categorical
If you want to return numeric values (0 and 1 stored as doubles),
dummyVars = double(Holiday ~= '0'); % Holiday is categorical
Note that every value of Holiday other than '0' will be assigned a value of 1.
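As a quick check of the two lines above, here is a minimal sketch on a five-element stand-in for the column. The sample values and the variable name isHoliday are made up for illustration; only the comparison against '0' comes from the answer.

```matlab
% Tiny stand-in for the real Holiday column (assumed sample values)
Holiday = categorical({'0'; 'a'; 'b'; '0'; 'c'});

% Logical result: true wherever the category is anything other than '0'
isHoliday = Holiday ~= '0';

% Same information as doubles, if the learner expects a numeric column
dummyVars = double(isHoliday);

disp(dummyVars.')
```

The comparison is vectorized, so it should scale to a million-row column without a loop. If the column might contain undefined entries, it is worth checking them with isundefined(Holiday) first, since they are not equal to '0' either.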
Comments
pp on 18 May 2020
Thanks! That did the job. Is it possible to extend this so that we can assign other numbers to a, b and c? Let's say 1, 2 and 3?
Adam Danz on 18 May 2020
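In that case, you can use
[groupIdx, groupNames] = findgroups(Holiday);
or
[groupIdx, groupNames] = grp2idx(Holiday); % requires stats & ML toolbox
In both cases the first output is the group number for each row (starting at 1) and the second output lists the group that each number refers to.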
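Group numbers from findgroups start at 1, so getting exactly 0, 1, 2, 3 takes one more step. The sketch below assumes the categories are ordered '0', a, b, c (the default order when the categorical is built from text), and the names codes, keys, values, and codesExplicit are illustrative only.

```matlab
% Tiny stand-in for the real Holiday column (assumed sample values)
Holiday = categorical({'0'; 'a'; 'b'; '0'; 'c'});

% groupIdx holds 1..4; groupNames lists the category behind each number
[groupIdx, groupNames] = findgroups(Holiday);

% Shift by one so that '0' -> 0, a -> 1, b -> 2, c -> 3
codes = groupIdx - 1;

% Alternative that does not rely on category order: spell out the mapping
keys   = {'0', 'a', 'b', 'c'};
values = [0; 1; 2; 3];
[~, loc] = ismember(cellstr(Holiday), keys);  % loc is 0 for any value not in keys
codesExplicit = values(loc);                  % errors if loc contains 0

disp([codes, codesExplicit])
```

The explicit version is a bit more typing, but it keeps working if a category is missing from the data or the categories are reordered, whereas the subtract-one version depends on '0' being the first group.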

Release

R2019b