How to find the optimal tree size when building a decision tree?
I need to categorize a set of data and want to use the CART algorithm for this purpose. The literature stresses the importance of finding the right tree size as a trade-off between classification accuracy and overfitting, so I am wondering whether the tree returned by the MATLAB function classregtree is the max tree or an already optimized version. If MATLAB already optimizes the depth of the tree, what is this optimization based on? Otherwise, I was thinking of using cross-validation for my tree and comparing the resulting max trees for different samples. Is there a recommended value for minparent? My dataset is quite small, so I assume I have to use a value lower than the default in order to generate a reasonable tree.
Many thanks for your help!
Ilya on 21 May 2012
Grow a deep tree by setting minparent to 1, find the optimal pruning level using
and then prune the tree to the desired level using
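The grow, test, and prune steps described above can be sketched as follows. This is a minimal illustration rather than the exact code from the answer: it assumes a MATLAB release in which classregtree and its test and prune methods are still available, and it uses the fisheriris sample data from the Statistics Toolbox as a stand-in for your own predictors X and labels y. The variable names are illustrative only.

```matlab
% Stand-in data (an assumption): replace X and y with your own predictors and labels.
load fisheriris            % meas: 150x4 predictors, species: 150x1 cell array of class labels
X = meas;
y = species;

rng(0);                    % fix the seed so the cross-validation partition is repeatable

% Grow a deep tree: minparent = 1 allows any impure node to be split further.
t = classregtree(X, y, 'method', 'classification', 'minparent', 1);

% Estimate the cost of every pruning level by cross-validation.
% bestlevel is the suggested pruning level; level 0 is the full, unpruned tree.
[cost, secost, ntnodes, bestlevel] = test(t, 'crossvalidate', X, y);

% Prune the deep tree back to the suggested level.
tpruned = prune(t, 'level', bestlevel);

% Levels start at 0, so level k is stored at index k + 1.
fprintf('Terminal nodes: %d (full tree), %d (pruned to level %d)\n', ...
    ntnodes(1), ntnodes(bestlevel + 1), bestlevel);
```

As far as I understand the documentation, test by default picks the smallest subtree whose cross-validated cost is within one standard error of the minimum, so bestlevel can change somewhat from one random partition to another on a small dataset. Plotting cost against ntnodes is a simple way to check how sharply the optimum is defined before settling on a level.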
Also, may I suggest that you either accept answers to your questions or explain why they do not satisfy you.