How to select number of trees and leaf size in bagged regression trees

Hello guys,
I am using the function TreeBagger to create a regression model.
How can I evaluate the optimal structure, meaning number of trees and leaf size?
I have seen that the number of trees can be assessed using the oobError of the model, but I am not sure if what I am doing is correct.
I am using the following code:
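% Bag 30 regression trees and keep the out-of-bag (OOB) predictions
model = TreeBagger(30, trainX, trainY, 'Method', 'regression', 'OOBPrediction', 'on', 'MinLeafSize', 600);
% Cumulative OOB mean squared error versus number of grown trees
plot(oobError(model));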
Can anyone please tell me if this is correct, and how to do the same thing for leaf size?
I am new to Regression Trees so any help would be very much appreciated.
Many thanks,
Natalia

Accepted Answer

Amogh Bhole on 19 Jun 2020
Hi,
Whenever you are dealing with machine learning models, there is no fixed rule for choosing the parameters; they depend on the dataset and on the result you are expecting.
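To answer your question: if processing time and memory are not constraints in your case, you can use a large number of trees. In general, the out-of-bag error drops as trees are added and then levels off, so extra trees mostly cost computation and rarely hurt accuracy. Plotting oobError against the number of trees, as you are doing, is a reasonable way to see where the curve flattens.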
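As a rough sketch of that idea, the snippet below grows more trees than are likely needed and looks for the point where the cumulative out-of-bag error stops improving. The synthetic trainX and trainY, the upper bound of 200 trees, and the 1% tolerance are all assumptions made for illustration; substitute your own data and thresholds.

```matlab
% Sketch: out-of-bag (OOB) error as a function of ensemble size.
% Synthetic data stands in for the real trainX / trainY (assumption).
rng(1);                                   % make the run reproducible
trainX = rand(2000, 5);
trainY = 10*trainX(:,1) + sin(2*pi*trainX(:,2)) + 0.5*randn(2000, 1);

nTrees = 200;                             % assumed upper bound on ensemble size
model = TreeBagger(nTrees, trainX, trainY, ...
    'Method', 'regression', ...
    'OOBPrediction', 'on', ...
    'MinLeafSize', 5);

% By default oobError is cumulative: element k is the OOB mean squared
% error of the ensemble made of trees 1..k.
oobMSE = oobError(model);

plot(oobMSE);
xlabel('Number of grown trees');
ylabel('Out-of-bag mean squared error');

% Heuristic (assumption): smallest ensemble whose error is within 1% of
% the error obtained with all trees.
tol = 0.01;
nEnough = find(oobMSE <= (1 + tol) * oobMSE(end), 1, 'first');
fprintf('About %d trees reach within %.0f%% of the final OOB error.\n', ...
    nEnough, 100 * tol);
```

If the curve is still falling at the right edge of the plot, the upper bound was too small and is worth increasing before reading anything off it.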
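When it comes to leaf size (and hence the number of leaf nodes), you don't want your model to overfit. A small minimum leaf size grows deep trees with many leaves (low bias, high variance), while a large one grows shallow trees (high bias, low variance). Use this bias-variance trade-off to choose the leaf size with respect to your dataset.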
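One way to explore that trade-off, sketched below, is to train one ensemble per candidate leaf size and compare the out-of-bag error curves. The candidate values, the 50 trees per ensemble, and the synthetic data are assumptions chosen for illustration, and picking the lowest final OOB error is just one possible selection rule.

```matlab
% Sketch: compare OOB error for several minimum leaf sizes.
% Synthetic data stands in for the real trainX / trainY (assumption).
rng(1);                                   % make the run reproducible
trainX = rand(2000, 5);
trainY = 10*trainX(:,1) + sin(2*pi*trainX(:,2)) + 0.5*randn(2000, 1);

leafSizes = [5 20 50 100 600];            % candidate values (assumption)
nTrees    = 50;                           % trees per ensemble (assumption)
finalMSE  = zeros(size(leafSizes));

figure; hold on
for i = 1:numel(leafSizes)
    b = TreeBagger(nTrees, trainX, trainY, ...
        'Method', 'regression', ...
        'OOBPrediction', 'on', ...
        'MinLeafSize', leafSizes(i));
    err = oobError(b);                    % cumulative OOB mean squared error
    plot(err, 'DisplayName', sprintf('MinLeafSize = %d', leafSizes(i)));
    finalMSE(i) = err(end);               % error with all nTrees trees
end
hold off
xlabel('Number of grown trees');
ylabel('Out-of-bag mean squared error');
legend('show');

% Keep the leaf size with the lowest final OOB error.
[~, iBest] = min(finalMSE);
bestLeaf = leafSizes(iBest);
fprintf('Lowest final OOB MSE at MinLeafSize = %d\n', bestLeaf);
```

The OOB error is used here because it comes for free with bagging; a held-out validation set or cross-validation would serve the same purpose.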
For implementation level information refer to the links:
Related to TreeBagger - Link1
Ways to implement Bagging - Link2


Release

R2019a
