Would kfold loss values vary if cross validation is performed after model training?
I am concerned about possible differences in cross-validated (CV) predictions (kfoldPredict) for bagged regression ensembles (fitrensemble) when CV is performed after a model has been trained. If I understand this correctly, a fitrensemble model trained without CV has access to all available observations in the data set. Thus, its trees will have node split values that differ from those in trees generated by a fitrensemble call with CV on, where each fold's trees see only part of the data. These differences in split values would then lead to different possible outcomes for the trees in the two models.
I guess this boils down to: do crossval and the subsequent kfoldLoss or kfoldPredict functions (really, any of the CV predict functions) account for these differences when supplied with a model that was not cross-validated initially?
If there is an error in my thoughts, please let me know.
I tried to supply an example of my question below.
% No initial CV
Mdl = fitrensemble(looperValues(:,1:cherrios), allratios2, ...
    'Learners',t,'Weights',W1,'Method','Bag','NumLearningCycles',numblearningcyc,'Options',statset('UseParallel',true));
Mdl_CV_After_Training = crossval(Mdl, 'KFold', 10);
Mdl_CV_After_Training_kfold_predictions = kfoldPredict(Mdl_CV_After_Training);
VS
% Yes initial CV
Mdl_Yes_CV = fitrensemble(looperValues(:,1:cherrios), allratios2, 'Learners', t, 'CrossVal', 'on','Weights',W1,'Method','Bag','NumLearningCycles',numblearningcyc,'Options',statset('UseParallel',true));
Mdl_Yes_CV_kfold_predictions = kfoldPredict(Mdl_Yes_CV);
% Would Mdl_CV_After_Training_kfold_predictions == Mdl_Yes_CV_kfold_predictions?
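A smaller, self-contained way to compare the two routes might look like the sketch below. It is only an illustration under stated assumptions: the synthetic X and Y, the 50 learning cycles, and the names cvp, MdlFull, CVAfter, and CVDuring are made up for this example and are not from my real data. Both routes are given the same cvpartition object so that the folds match. The sketch does not assume the two losses come out identical, since bagging draws random bootstrap samples in each route.

```matlab
rng(1);                                        % fix the seed so the sketch is repeatable
X = rand(200, 3);                              % synthetic predictors (hypothetical data)
Y = X(:,1) + 2*X(:,2).^2 + 0.1*randn(200, 1);  % synthetic response (hypothetical data)
t = templateTree();                            % default tree learner
cvp = cvpartition(size(X, 1), 'KFold', 10);    % one partition shared by both routes

% Route 1: train on all of the data, then cross-validate afterwards.
MdlFull = fitrensemble(X, Y, 'Method', 'Bag', 'Learners', t, ...
    'NumLearningCycles', 50);
CVAfter = crossval(MdlFull, 'CVPartition', cvp);

% Route 2: cross-validate inside the fitrensemble call.
CVDuring = fitrensemble(X, Y, 'Method', 'Bag', 'Learners', t, ...
    'NumLearningCycles', 50, 'CVPartition', cvp);

% Both objects store one ensemble per fold in their Trained property.
fprintf('Folds (after / during): %d / %d\n', ...
    numel(CVAfter.Trained), numel(CVDuring.Trained));

% Compare the out-of-fold mean squared errors of the two routes.
lossAfter  = kfoldLoss(CVAfter);
lossDuring = kfoldLoss(CVDuring);
fprintf('kfoldLoss (after / during): %.4f / %.4f\n', lossAfter, lossDuring);
```

If both objects report ten entries in Trained, that would suggest crossval refits an ensemble on each training fold rather than reusing the trees from the full-data model, in which case the predictions from kfoldPredict are out-of-fold in both routes.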