CART Algorithm with categorical predictor variables which consist of strings?

Ines on 15 May 2012
I want to build a decision tree from categorical data that consist of string names (in this case designating different chemical reaction types). Can I make MATLAB use these non-numerical data directly as predictor variables? Or do I have to convert the information into something numerical (which is quite tedious in my case)?
Thanks for your help. (If you could attach an example, that would be great! :)
PS: Which function would you specifically recommend?

Answers (1)

Tom Lane on 15 May 2012
If you use ClassificationTree.fit or RegressionTree.fit from the Statistics Toolbox, the input X matrix has to be numeric. However, the grp2idx function may make the conversion less tedious for you. Example:
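load carsmall                      % sample data set shipped with the toolbox
X = [Weight grp2idx(Origin)];      % column 2 holds the Origin strings as integer codes
a = ClassificationTree.fit(X,Cylinders,'CategoricalPredictors',2);   % treat column 2 as categorical
view(a,'mode','graph')             % display the fitted tree graphically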
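
To keep track of which integer code stands for which string, the second output of grp2idx may help. The following is only a minimal sketch and not part of the original answer: the variable names, the reaction-type labels, and the temperature values are made up for illustration, and it assumes the Statistics Toolbox is installed.

```matlab
% Hypothetical data: reaction types as strings plus one numeric predictor.
reactionType = {'oxidation'; 'reduction'; 'hydrolysis'; 'oxidation'; ...
                'hydrolysis'; 'reduction'; 'oxidation'; 'hydrolysis'};
temperature  = [300; 350; 310; 420; 330; 390; 410; 305];   % made-up values

% grp2idx returns one integer code per observation and, as a second
% output, a cell array with the name that belongs to each code.
[typeIdx, typeNames] = grp2idx(reactionType);

% Numeric predictor matrix; column 2 holds the category codes and is the
% column that would be declared as categorical when fitting a tree.
X = [temperature typeIdx];

% Translate codes back into strings, e.g. when reading the tree's splits.
decoded = typeNames(typeIdx);
```

Here decoded reproduces reactionType, so a split on column 2 of X can be read in terms of the original reaction names by indexing into typeNames.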
