Classregtree - implementing an additional constraint

Hi everybody,
I am using classregtree on a dataset with approximately 200 entries from 25 data sources. I would like to add a constraint to the tree-growing process so that a new branch or leaf can only be grown if it contains samples from at least 3 different data sources. Is there an easy way to do this?
Thank you for your help,
John

Answers (1)

Ilya on 13 Sep 2011
If you work in release R2009b or later, the function that searches for the optimal split in a tree node is coded in C for speed, and its source code is not published. You could easily force every branch node in a tree to contain at least 3 classes; this part is coded in MATLAB. But you cannot impose the same requirement on a leaf node without modifying the C code or substituting it with your own (which I wouldn't recommend).
I don't know why you prefer a decision tree for your data and why you need at least 3 classes per node. If this is an attempt to improve the accuracy of predictions, there are better ways of accomplishing this.
  1 Comment
John Koestel on 14 Sep 2011
Hi Ilya,
Thank you for the quick answer. It looks as if this is not so easy to implement after all (I would also need the constraint for leaf nodes).
I need the tree as a tool for making rough predictions that can be understood intuitively. At the same time, I am looking for ways to reduce the chance that the tree splits on variables that are merely proxies for the experimental conditions used in the various data sources.
Thanks and cheers,
John
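
For readers who want to experiment with the constraint discussed in this thread, here is a minimal sketch of the idea outside of classregtree. It is not the classregtree algorithm and does not touch its C code; it is a small regression-tree grower in plain Python (chosen only so that the example is self-contained) in which a candidate split is accepted only if both children contain samples from at least `min_sources` distinct data sources. The names `best_split`, `grow`, `sources` and `min_sources` are hypothetical and made up for this illustration, and using the sum of squared errors as the split criterion is an assumption.

```python
def sse(values):
    """Sum of squared deviations from the mean (regression impurity)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(X, y, sources, min_sources=3):
    """Return (feature_index, threshold) of the best admissible split, or None.

    A split is admissible only if BOTH children contain samples from at
    least `min_sources` distinct data sources (hypothetical constraint
    parameter, not a classregtree option).
    """
    best, best_cost = None, sse(y)  # a split must improve on the unsplit node
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [i for i, row in enumerate(X) if row[j] <= thr]
            right = [i for i, row in enumerate(X) if row[j] > thr]
            # The additional constraint: enough distinct sources on each side.
            if len({sources[i] for i in left}) < min_sources:
                continue
            if len({sources[i] for i in right}) < min_sources:
                continue
            cost = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if cost < best_cost:
                best, best_cost = (j, thr), cost
    return best


def grow(X, y, sources, min_sources=3):
    """Recursively grow a tree (nested dicts) under the source constraint."""
    split = best_split(X, y, sources, min_sources)
    if split is None:
        # No admissible split improves the fit: this node becomes a leaf.
        return {"leaf": True,
                "value": sum(y) / len(y),
                "n_sources": len(set(sources))}
    j, thr = split
    left = [i for i, row in enumerate(X) if row[j] <= thr]
    right = [i for i, row in enumerate(X) if row[j] > thr]

    def take(idx):
        return ([X[i] for i in idx], [y[i] for i in idx],
                [sources[i] for i in idx])

    return {"leaf": False, "feature": j, "threshold": thr,
            "left": grow(*take(left), min_sources),
            "right": grow(*take(right), min_sources)}
```

Because every accepted split leaves at least `min_sources` sources in each child, every node of the resulting tree, leaves included, satisfies the constraint as long as the root does. The exhaustive search is slow but should be workable for a dataset of roughly 200 entries.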
