CART Algorithm with categorical predictor variables which consist of strings?
I want to build a decision tree from categorical data that consists of string names (in this case designating different chemical reaction types). Can I make MATLAB use these non-numerical data directly as predictor variables? Or do I have to convert the information into something numerical (which in my case is quite tedious)?
Thanks for your help. If you could attach an example, that would be great! :)
PS: which function would you specifically recommend to use?
Tom Lane on 15 May 2012
If you use ClassificationTree.fit or RegressionTree.fit from the Statistics Toolbox, the input X matrix has to be numeric. However, the grp2idx function may make the conversion less tedious for you. Example:
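load carsmall  % assumption: Weight, Origin and Cylinders come from the carsmall sample data
X = [Weight grp2idx(Origin)];  % column 2 holds Origin as integer group indices
a = ClassificationTree.fit(X,Cylinders,'CategoricalPredictors',2);  % 'cat' abbreviates this option; treat column 2 as categorical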
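For string names stored in a cell array, as in the question, the same grp2idx approach might look roughly like the sketch below. The reaction types, temperatures, and yield labels are invented purely for illustration, and the variable names (reactionType, temperature, yieldClass) are hypothetical. The second output of grp2idx keeps the original names, so a numeric code can be mapped back to its string later.

```matlab
% Hypothetical data: all names and values below are made up for illustration.
reactionType = {'oxidation'; 'reduction'; 'oxidation'; ...
                'hydrolysis'; 'reduction'; 'hydrolysis'};
temperature  = [350; 290; 360; 300; 280; 310];
yieldClass   = {'high'; 'low'; 'high'; 'low'; 'low'; 'low'};

% Convert the string names to integer codes.
% names lists the distinct strings; names{k} is the string coded as k.
[idx, names] = grp2idx(reactionType);

% Build the numeric predictor matrix; column 2 is the coded reaction type.
X = [temperature idx];

% Fit the tree, declaring column 2 as categorical so that its codes are
% treated as unordered groups rather than as ordered numbers.
% MinParent is lowered only because this toy data set is tiny.
tree = ClassificationTree.fit(X, yieldClass, ...
    'CategoricalPredictors', 2, ...
    'PredictorNames', {'Temperature', 'ReactionType'}, ...
    'MinParent', 2);

% To predict for a new observation, look up the code of its reaction type.
newCode  = find(strcmp(names, 'reduction'));
newLabel = predict(tree, [295 newCode]);
```

Declaring the column as categorical matters here: without it, the tree would split on the codes as if 'oxidation' < 'reduction' < 'hydrolysis' were a meaningful ordering.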