What fraction of input data is used for out-of-bag observations when creating a TREEBAGGER object using Statistics Toolbox 7.1 (R2009a)?
I have created a TREEBAGGER object setting 'oobvarimp' to 'on'. I want to determine what fraction of observations are used as out-of-bag observations.
Accepted Answer
MathWorks Support Team
on 27 Aug 2009
For every tree, the bagger randomly selects N*bagger.FBoot out of N observations for training, sampling with replacement by default. Observations that were not selected for a given tree are that tree's out-of-bag observations. If bagger.FBoot=1 (default), on average about 63% (roughly 2/3) of the distinct input observations are selected for training each tree, and the remaining 37% (roughly 1/3) are out-of-bag. This fraction fluctuates from one tree to another, and the out-of-bag observations for one tree are generally not the same as those for another tree.
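The expected out-of-bag fraction can be estimated with a short calculation. This is a sketch that assumes each of the N*FBoot draws is uniform over the N observations and independent of the others; the symbols f (for bagger.FBoot) and p_oob are introduced here for illustration only.

```latex
% Probability that a given observation is never drawn in N f draws
% with replacement, i.e. that it is out-of-bag for one tree:
\[
  p_{\mathrm{oob}} = \left(1 - \frac{1}{N}\right)^{N f}
  \;\longrightarrow\; e^{-f} \quad \text{as } N \to \infty .
\]
% Approximate values for large N:
\[
  f = 1:\; e^{-1} \approx 0.37, \qquad
  f = 0.8:\; e^{-0.8} \approx 0.45, \qquad
  f = 0.5:\; e^{-0.5} \approx 0.61 .
\]
```

Under these assumptions, lowering FBoot increases the out-of-bag fraction. For a finite data set, the fractions measured on an actual ensemble should be close to these values but will not match them exactly.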
You can use the following code as an example to determine the fraction of out-of-bag observations per tree.
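% Load the sample data; the first column is used as the response.
load imports-85;
Y = X(:,1);
X = X(:,2:end);
ntrees = 50;
% Grow one ensemble for each value of FBoot.
for j = [0.5 0.8 1]
    b = TreeBagger(ntrees,X,Y,'oobvarimp','on','FBoot',j);
    obs = size(b.X,1);   % number of observations
    % b.OOBIndices is an obs-by-ntrees logical matrix; true marks an out-of-bag observation.
    num_oob_per_tree = sum(b.OOBIndices(:))/ntrees;
    fprintf('\n\nFor %d trees and FBoot = %g:\n',ntrees,j)
    frac_oob_observations = num_oob_per_tree/obs   % no semicolon, so the value is displayed
end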