Classify observations using ensemble of classification models
[label,score] = predict(___) also returns a matrix of classification scores (score), indicating
the likelihood that a label comes from a particular class, using any
of the input arguments in the previous syntaxes. For each observation
in X, the predicted class label corresponds to
the maximum score among all classes.
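As a sketch of this syntax (the model and variable names here are illustrative, not taken from this page):

```matlab
% Train a small ensemble on Fisher's iris data (illustrative setup).
load fisheriris
Mdl = fitcensemble(meas,species,'Method','AdaBoostM2');

% Request both labels and scores; score has one row per observation
% and one column per class.
[label,score] = predict(Mdl,meas);

% The predicted label corresponds to the class with the maximum score.
[~,idx] = max(score,[],2);
% Mdl.ClassNames(idx) should match label element by element.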
Predictor data to be classified, specified as a numeric matrix or table.
Each row of X corresponds to one observation, and each column corresponds to one predictor variable.
Specify optional comma-separated pairs of Name,Value arguments. Name is
the argument name and
Value is the corresponding value.
Name must appear inside quotes. You can specify several name and value
pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Indices of weak learners in the ensemble to use for prediction, specified as a vector of positive integers.
A logical matrix of size N-by-T, where N is the number of observations and T is the number of weak learners, indicating which observations to use with each learner.
Vector of predicted class labels, with the same data type as the labels used to train the ensemble.
A matrix with one row per observation and one column per class.
For each observation and each class, the score generated by each tree
is the probability that the observation originates from that class,
computed as the fraction of training observations of that class in the tree leaf.
Load Fisher's iris data set. Determine the sample size.
load fisheriris
N = size(meas,1);
Partition the data into training and test sets. Hold out 10% of the data for testing.
rng(1); % For reproducibility
cvp = cvpartition(N,'Holdout',0.1);
idxTrn = training(cvp); % Training set indices
idxTest = test(cvp);    % Test set indices
Store the training data in a table.
tblTrn = array2table(meas(idxTrn,:));
tblTrn.Y = species(idxTrn);
Train a classification ensemble using AdaBoostM2 and the training set. Specify tree stumps as the weak learners.
t = templateTree('MaxNumSplits',1);
Mdl = fitcensemble(tblTrn,'Y','Method','AdaBoostM2','Learners',t);
Predict labels for the test set. You trained the model by using a table of data, but you can predict labels by using a matrix.
labels = predict(Mdl,meas(idxTest,:));
Construct a confusion matrix for the test set.
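The code for this step is not shown above; a minimal way to construct it, assuming the variables from the previous steps, is to use confusionchart:

```matlab
% Compare the true test-set labels with the predicted labels.
confusionchart(species(idxTest),labels)
```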
Mdl misclassifies one versicolor iris as virginica in the test set.
For ensembles, a classification score represents the confidence of a classification into a class. The higher the score, the higher the confidence.
Different ensemble algorithms have different definitions for their scores. Furthermore, the range of scores depends on ensemble type. For example:
AdaBoostM1 scores range from –∞ to ∞.
Bag scores range from 0 to 1.
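For example, you can inspect the score range of a bagged ensemble directly (a sketch; any predictor matrix and label vector would work in place of the iris data):

```matlab
% Bag scores are class-posterior probabilities, so they lie in [0,1].
load fisheriris
MdlBag = fitcensemble(meas,species,'Method','Bag');
[~,score] = predict(MdlBag,meas);
% min(score(:)) and max(score(:)) stay within the [0,1] range.
```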
This function fully supports tall arrays. You can use models trained on either in-memory or tall data with this function.
For more information, see Tall Arrays.
Usage notes and limitations:
Use saveLearnerForCoder, loadLearnerForCoder, and codegen (MATLAB Coder) to generate code for the
predict function. Save
a trained model by using
saveLearnerForCoder. Define an entry-point function
that loads the saved model by using
loadLearnerForCoder and calls the
predict function. Then use codegen
to generate code for the entry-point function.
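A minimal sketch of this workflow (the file and function names are illustrative):

```matlab
% Beforehand, at the command line:
%   saveLearnerForCoder(Mdl,'EnsembleModel');   % save the trained model
% Afterward, generate code for this entry-point function, e.g.:
%   codegen predictLabels -args {coder.typeof(0,[Inf 4],[1 0])}

function label = predictLabels(X) %#codegen
% Load the saved ensemble and predict labels for predictor data X.
Mdl = loadLearnerForCoder('EnsembleModel');
label = predict(Mdl,X);
end
```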
You can also generate single-precision C/C++ code for
predict. For single-precision code generation, specify the
name-value pair argument
'DataType','single' as an additional input to the loadLearnerForCoder function.
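In an entry-point function, the single-precision variant would look like this (a sketch; the saved model name is illustrative):

```matlab
function label = predictLabelsSingle(X) %#codegen
% Load the saved ensemble for single-precision prediction.
Mdl = loadLearnerForCoder('EnsembleModel','DataType','single');
label = predict(Mdl,X);
end
```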
You can also generate fixed-point C/C++ code for
predict. Fixed-point code generation requires an additional step that
defines the fixed-point data types of the variables required for prediction. Create a
fixed-point data type structure by using the data type function
generateLearnerDataTypeFcn, and use the structure as an input argument of
loadLearnerForCoder in an entry-point function. Generating fixed-point
C/C++ code requires MATLAB®
Coder™ and Fixed-Point Designer™.
Generating fixed-point code for
predict includes propagating data types for individual learners and, therefore, can be time-consuming.
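A sketch of the additional fixed-point step (the saved model name, and the name of the generated data type function, are assumptions for illustration):

```matlab
% Generate a function that defines fixed-point data types from
% representative predictor data X for the saved model.
generateLearnerDataTypeFcn('EnsembleModel',X)

% The generated function (here assumed to be named EnsembleModel_datatype)
% returns a data type structure; pass it to loadLearnerForCoder inside
% the entry-point function:
%   T = EnsembleModel_datatype(X);
%   Mdl = loadLearnerForCoder('EnsembleModel','DataType',T);
```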
This table contains
notes about the arguments of
predict. Arguments not included in this
table are fully supported.
| Argument | Notes and Limitations |
| --- | --- |
| Ensemble model object | For the usage notes and limitations of the model object, see Code Generation of the model object. |
| Name-value pair arguments | Names in name-value pair arguments must be compile-time constants. For example, to allow user-defined indices of up to 5 weak learners in the generated code, pass the name as a coder.Constant and specify a variable-size index vector in the -args value of codegen. |
For fixed-point code generation, the
For more information, see Introduction to Code Generation.