Receiver operating characteristic (ROC) curve or other performance curve for classifier output
[X,Y] = perfcurve(labels,scores,posclass)
[X,Y,T] = perfcurve(labels,scores,posclass)
[X,Y,T,AUC] = perfcurve(labels,scores,posclass)
[X,Y,T,AUC,OPTROCPT] = perfcurve(labels,scores,posclass)
[X,Y,T,AUC,OPTROCPT,SUBY] = perfcurve(labels,scores,posclass)
[X,Y,T,AUC,OPTROCPT,SUBY,SUBYNAMES] = perfcurve(labels,scores,posclass)
[___] = perfcurve(labels,scores,posclass,Name,Value)
[___] = perfcurve(labels,scores,posclass,Name,Value) returns the coordinates of a ROC curve and any other output argument from the previous syntaxes, with additional options specified by one or more Name,Value pair arguments.
For example, you can provide a list of negative classes, change the X or Y criterion, compute pointwise confidence bounds using cross validation or bootstrap, specify the misclassification cost, or compute the confidence bounds in parallel.
Load the sample data.
load fisheriris
Use only the first two features as predictor variables. Define a binary classification problem by using only the measurements that correspond to the species versicolor and virginica.
pred = meas(51:end,1:2);
Define the binary response variable.
resp = (1:100)'>50; % Versicolor = 0, virginica = 1
Fit a logistic regression model.
mdl = fitglm(pred,resp,'Distribution','binomial','Link','logit');
Compute the ROC curve. Use the probability estimates from the logistic regression model as scores.
scores = mdl.Fitted.Probability;
[X,Y,T,AUC] = perfcurve(species(51:end,:),scores,'virginica');
perfcurve stores the threshold values in the array T.
Display the area under the curve.
AUC
AUC = 0.7918
The area under the curve is 0.7918. The maximum AUC is 1, which corresponds to a perfect classifier. Larger AUC values indicate better classifier performance.
Plot the ROC curve.
plot(X,Y)
xlabel('False positive rate')
ylabel('True positive rate')
title('ROC for Classification by Logistic Regression')
Load the sample data.
load ionosphere
X is a 351-by-34 real-valued matrix of predictors. Y is a character array of class labels: 'b' for bad radar returns and 'g' for good radar returns.
Reformat the response to fit a logistic regression. Use the predictor variables 3 through 34.
resp = strcmp(Y,'b'); % resp = 1 if Y = 'b', or 0 if Y = 'g'
pred = X(:,3:34);
Fit a logistic regression model to estimate the posterior probabilities for a radar return to be a bad one.
mdl = fitglm(pred,resp,'Distribution','binomial','Link','logit');
score_log = mdl.Fitted.Probability; % Probability estimates
Compute the standard ROC curve using the probabilities for scores.
[Xlog,Ylog,Tlog,AUClog] = perfcurve(resp,score_log,'true');
Train an SVM classifier on the same sample data. Standardize the data.
mdlSVM = fitcsvm(pred,resp,'Standardize',true);
Compute the posterior probabilities (scores).
mdlSVM = fitPosterior(mdlSVM);
[~,score_svm] = resubPredict(mdlSVM);
The second column of score_svm contains the posterior probabilities of bad radar returns.
Compute the standard ROC curve using the scores from the SVM model.
[Xsvm,Ysvm,Tsvm,AUCsvm] = perfcurve(resp,score_svm(:,mdlSVM.ClassNames),'true');
Fit a naive Bayes classifier on the same sample data.
mdlNB = fitcnb(pred,resp);
Compute the posterior probabilities (scores).
[~,score_nb] = resubPredict(mdlNB);
Compute the standard ROC curve using the scores from the naive Bayes classification.
[Xnb,Ynb,Tnb,AUCnb] = perfcurve(resp,score_nb(:,mdlNB.ClassNames),'true');
Plot the ROC curves on the same graph.
plot(Xlog,Ylog)
hold on
plot(Xsvm,Ysvm)
plot(Xnb,Ynb)
legend('Logistic Regression','Support Vector Machines','Naive Bayes','Location','Best')
xlabel('False positive rate'); ylabel('True positive rate');
title('ROC Curves for Logistic Regression, SVM, and Naive Bayes Classification')
hold off
Although SVM produces better ROC values for higher thresholds, logistic regression is usually better at distinguishing the bad radar returns from the good ones. The ROC curve for naive Bayes is generally lower than the other two ROC curves, which indicates worse in-sample performance than the other two classifier methods.
Compare the area under the curve for all three classifiers.
AUClog
AUCsvm
AUCnb
AUClog = 0.9659
AUCsvm = 0.9488
AUCnb = 0.9393
Logistic regression has the highest AUC measure for classification and naive Bayes has the lowest. This result suggests that logistic regression has better in-sample average performance for this sample data.
Generate a random set of points within the unit circle.
rng(1); % For reproducibility
n = 100; % Number of points per quadrant
r1 = sqrt(rand(2*n,1)); % Random radii
t1 = [pi/2*rand(n,1); (pi/2*rand(n,1)+pi)]; % Random angles for Q1 and Q3
X1 = [r1.*cos(t1) r1.*sin(t1)]; % Polar-to-Cartesian conversion
r2 = sqrt(rand(2*n,1));
t2 = [pi/2*rand(n,1)+pi/2; (pi/2*rand(n,1)-pi/2)]; % Random angles for Q2 and Q4
X2 = [r2.*cos(t2) r2.*sin(t2)];
Define the predictor variables. Label points in the first and third quadrants as belonging to the positive class, and those in the second and fourth quadrants in the negative class.
pred = [X1; X2];
resp = ones(4*n,1);
resp(2*n + 1:end) = -1; % Labels: 1 for Q1 and Q3 (positive), -1 for Q2 and Q4 (negative)
Create the function mysigmoid.m, which accepts two matrices in the feature space as inputs, and transforms them into a Gram matrix using the sigmoid kernel.
function G = mysigmoid(U,V)
% Sigmoid kernel function with slope gamma and intercept c
gamma = 1;
c = 1;
G = tanh(gamma*U*V' + c);
end
Train an SVM classifier using the sigmoid kernel function. It is good practice to standardize the data.
SVMModel1 = fitcsvm(pred,resp,'KernelFunction','mysigmoid',...
    'Standardize',true);
SVMModel1 = fitPosterior(SVMModel1);
[~,scores1] = resubPredict(SVMModel1);
Set gamma = 0.5; within mysigmoid.m.
Then, train an SVM classifier using the adjusted sigmoid kernel.
SVMModel2 = fitcsvm(pred,resp,'KernelFunction','mysigmoid',...
    'Standardize',true);
SVMModel2 = fitPosterior(SVMModel2);
[~,scores2] = resubPredict(SVMModel2);
Compute the ROC curves and the area under the curve (AUC) for both models.
[x1,y1,~,auc1] = perfcurve(resp,scores1(:,2),1);
[x2,y2,~,auc2] = perfcurve(resp,scores2(:,2),1);
Plot the ROC curves.
plot(x1,y1)
hold on
plot(x2,y2)
hold off
legend('gamma = 1','gamma = 0.5','Location','SE');
xlabel('False positive rate'); ylabel('True positive rate');
title('ROC for classification by SVM');
The kernel function with the gamma parameter set to 0.5 gives better in-sample results.
Compare the AUC measures.
auc1
auc2
auc1 = 0.9518
auc2 = 0.9985
The area under the curve for gamma set to 0.5 is higher than that for gamma set to 1. This also confirms that a gamma parameter value of 0.5 produces better results. For visual comparison of the classification performance with these two gamma parameter values, see Train SVM Classifiers Using a Custom Kernel.
Load the sample data.
load fisheriris
The column vector, species, consists of iris flowers of three different species: setosa, versicolor, virginica. The double matrix meas consists of four types of measurements on the flowers: sepal length, sepal width, petal length, and petal width. All measures are in centimeters.
Train a classification tree using the sepal length and width as the predictor variables. It is a good practice to specify the class names.
Model = fitctree(meas(:,1:2),species,...
    'ClassNames',{'setosa','versicolor','virginica'});
Predict the class labels and scores for the species based on the tree Model.
[~,score] = resubPredict(Model);
The scores are the posterior probabilities that an observation (a row in the data matrix) belongs to a class. The columns of score correspond to the classes specified by 'ClassNames'. So, the first column corresponds to setosa, the second corresponds to versicolor, and the third column corresponds to virginica.
Compute the ROC curve for the predictions that an observation belongs to versicolor, given the true class labels species. Also compute the optimal operating point and y values for negative subclasses. Return the names of the negative classes.
[X,Y,T,~,OPTROCPT,suby,subnames] = perfcurve(species,...
    score(:,2),'versicolor');
X, by default, is the false positive rate (fallout or 1 – specificity) and Y, by default, is the true positive rate (recall or sensitivity). The positive class label is versicolor. Because a negative class is not defined, perfcurve assumes that the observations that do not belong to the positive class are in one class, and treats that class as the negative class.
OPTROCPT
suby
subnames
OPTROCPT =
    0.1000    0.8000
suby =
         0         0
    0.1800    0.1800
    0.4800    0.4800
    0.5800    0.5800
    0.6200    0.6200
    0.8000    0.8000
    0.8800    0.8800
    0.9200    0.9200
    0.9600    0.9600
    0.9800    0.9800
    1.0000    1.0000
    1.0000    1.0000
subnames =
    'setosa'    'virginica'
Plot the ROC curve and the optimal operating point on the ROC curve.
plot(X,Y)
hold on
plot(OPTROCPT(1),OPTROCPT(2),'ro')
xlabel('False positive rate')
ylabel('True positive rate')
title('ROC Curve for Classification by Classification Trees')
hold off
Find the threshold that corresponds to the optimal operating point.
T((X==OPTROCPT(1))&(Y==OPTROCPT(2)))
ans = 0.6429
Specify virginica as the negative class and compute and plot the ROC curve for versicolor.
[X,Y,~,~,OPTROCPT] = perfcurve(species,score(:,2),...
    'versicolor','negClass','virginica');
OPTROCPT
plot(X,Y)
hold on
plot(OPTROCPT(1),OPTROCPT(2),'ro')
xlabel('False positive rate')
ylabel('True positive rate')
title('ROC Curve for Classification by Classification Trees')
hold off
OPTROCPT =
    0.1800    0.8000
Load the sample data.
load fisheriris
The column vector species consists of iris flowers of three different species: setosa, versicolor, virginica. The double matrix meas consists of four types of measurements on the flowers: sepal length, sepal width, petal length, and petal width. All measures are in centimeters.
Use only the first two features as predictor variables. Define a binary problem by using only the measurements that correspond to the versicolor and virginica species.
pred = meas(51:end,1:2);
Define the binary response variable.
resp = (1:100)'>50; % Versicolor = 0, virginica = 1
Fit a logistic regression model.
mdl = fitglm(pred,resp,'Distribution','binomial','Link','logit');
Compute the pointwise confidence intervals on the true positive rate (TPR) by vertical averaging (VA) and sampling using bootstrap.
[X,Y,T] = perfcurve(species(51:end,:),mdl.Fitted.Probability,...
    'virginica','NBoot',1000,'XVals',[0:0.05:1]);
'NBoot',1000 sets the number of bootstrap replicas to 1000. 'XVals',[0:0.05:1] prompts perfcurve to return X, Y, and T values at the specified X values (false positive rate), and to average the Y values (true positive rate) at those X values using vertical averaging. If you do not specify XVals, then perfcurve computes the confidence bounds using threshold averaging by default.
Plot the pointwise confidence intervals.
errorbar(X,Y(:,1),Y(:,1)-Y(:,2),Y(:,3)-Y(:,1));
xlim([-0.02,1.02]); ylim([-0.02,1.02]);
xlabel('False positive rate')
ylabel('True positive rate')
title('ROC Curve with Pointwise Confidence Bounds')
legend('PCBwVA','Location','Best')
It might not always be possible to control the false positive rate (FPR, the X value in this example). So you might want to compute the pointwise confidence intervals on true positive rates (TPR) by threshold averaging.
[X1,Y1,T1] = perfcurve(species(51:end,:),mdl.Fitted.Probability,...
    'virginica','NBoot',1000);
If you set 'TVals' to 'All', or if you do not specify 'TVals' or 'XVals', then perfcurve returns X, Y, and T values for all scores and computes pointwise confidence bounds for X and Y using threshold averaging.
Plot the confidence bounds.
figure()
errorbar(X1(:,1),Y1(:,1),Y1(:,1)-Y1(:,2),Y1(:,3)-Y1(:,1));
xlim([-0.02,1.02]); ylim([-0.02,1.02]);
xlabel('False positive rate')
ylabel('True positive rate')
title('ROC Curve with Pointwise Confidence Bounds')
legend('PCBwTA','Location','Best')
Specify the threshold values to fix and compute the ROC curve. Then plot the curve.
[X1,Y1,T1] = perfcurve(species(51:end,:),mdl.Fitted.Probability,...
    'virginica','NBoot',1000,'TVals',0:0.05:1);
figure()
errorbar(X1(:,1),Y1(:,1),Y1(:,1)-Y1(:,2),Y1(:,3)-Y1(:,1));
xlim([-0.02,1.02]); ylim([-0.02,1.02]);
xlabel('False positive rate')
ylabel('True positive rate')
title('ROC Curve with Pointwise Confidence Bounds')
legend('PCBwTA','Location','Best')
labels — True class labels
numeric vector | logical vector | character matrix | cell array of strings | categorical array
True class labels, specified as a numeric vector, a logical vector, a character matrix, a cell array of strings, or a categorical array. For more information, see Grouping Variables.
Example: {'hi','mid','hi','low',...,'mid'}
Example: ['H','M','H','L',...,'M']
Data Types: single | double | logical | char | cell
scores — Scores returned by a classifier
vector of floating points
Scores returned by a classifier for some sample data, specified as a vector of floating points. scores must have the same number of elements as labels.
Data Types: single | double
posclass — Positive class label
numeric value | logical value | character array | cell array of strings | categorical value
Positive class label, specified as a numeric value, a logical value, a character array, a cell array of strings, or a categorical value. The positive class must be a member of the input labels. The value of posclass that you can specify depends on the value of labels.
labels value | posclass value
Numeric vector | Numeric scalar
Logical vector | Logical scalar
Character matrix | Character string
Cell array of strings | Character string or cell containing character string
Categorical vector | Categorical scalar
For example, in a cancer diagnosis problem, if a malignant tumor is the positive class, then specify posclass as 'malignant'.
Data Types: single | double | logical | char | cell
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: 'NegClass','versicolor','XCrit','fn','NBoot',1000,'BootType','per' specifies the species versicolor as the negative class, the criterion for the X-coordinate as false negative, and the number of bootstrap samples as 1000. It also specifies that the pointwise confidence bounds are computed using the percentile method.
'NegClass' — List of negative classes
'all' (default) | numeric array | categorical array
List of negative classes, specified as the comma-separated pair consisting of 'NegClass', and a numeric array or a categorical array. By default, perfcurve sets NegClass to 'all' and considers all nonpositive classes found in the input array of labels to be negative.
If NegClass is a subset of the classes found in the input array of labels, then perfcurve discards the instances with labels that do not belong to either positive or negative classes.
Example: 'NegClass',{'versicolor','setosa'}
Data Types: single | double
'XCrit' — Criterion to compute for X
'fpr' (default) | 'fnr' | 'tnr' | 'ppv' | 'ecost' | ...
Criterion to compute for X, specified as the comma-separated pair consisting of 'XCrit' and one of the following.
Criterion | Description
tp | Number of true positive instances.
fn | Number of false negative instances.
fp | Number of false positive instances.
tn | Number of true negative instances.
tp+fp | Sum of true positive and false positive instances.
rpp | Rate of positive predictions. rpp = (tp+fp)/(tp+fn+fp+tn)
rnp | Rate of negative predictions. rnp = (tn+fn)/(tp+fn+fp+tn)
accu | Accuracy. accu = (tp+tn)/(tp+fn+fp+tn)
tpr, or sens, or reca | True positive rate, or sensitivity, or recall. tpr = sens = reca = tp/(tp+fn)
fnr, or miss | False negative rate, or miss. fnr = miss = fn/(tp+fn)
fpr, or fall | False positive rate, or fallout, or 1 – specificity. fpr = fall = fp/(tn+fp)
tnr, or spec | True negative rate, or specificity. tnr = spec = tn/(tn+fp)
ppv, or prec | Positive predictive value, or precision. ppv = prec = tp/(tp+fp)
npv | Negative predictive value. npv = tn/(tn+fn)
ecost | Expected cost. ecost = (tp*Cost(PP)+fn*Cost(NP)+fp*Cost(PN)+tn*Cost(NN))/(tp+fn+fp+tn)
Custom criterion | A custom-defined function with the input arguments (C,scale,cost), where C is a 2-by-2 confusion matrix, scale is a 2-by-1 array of class scales, and cost is a 2-by-2 misclassification cost matrix.
Caution
Some of these criteria return 
Example: 'XCrit','ecost'
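As a rough illustration of the formulas in the criterion table, the following base-MATLAB sketch evaluates a few of them for a single set of confusion counts. The counts and the variable names (total, cost, and so on) are made up for this illustration and do not come from perfcurve; the cost matrix uses the documented layout [Cost(PP),Cost(NP);Cost(PN),Cost(NN)] with the default values.

```matlab
% Hypothetical confusion counts for one threshold (illustration only)
tp = 40;  fn = 10;  fp = 5;  tn = 45;
total = tp + fn + fp + tn;

tpr  = tp/(tp+fn);          % true positive rate (sens, reca)
fpr  = fp/(tn+fp);          % false positive rate (fall)
ppv  = tp/(tp+fp);          % positive predictive value (prec)
npv  = tn/(tn+fn);          % negative predictive value
accu = (tp+tn)/total;       % accuracy

% Cost matrix laid out as [Cost(PP) Cost(NP); Cost(PN) Cost(NN)]
cost  = [0 0.5; 0.5 0];
ecost = (tp*cost(1,1) + fn*cost(1,2) + fp*cost(2,1) + tn*cost(2,2))/total;
```

Note that ppv and npv divide by tp+fp and tn+fn, respectively, so they are undefined (0/0) whenever the corresponding prediction count is zero.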
'YCrit' — Criterion to compute for Y
'tpr' (default) | same criteria options for X
'XVals' — Values for the X criterion
'all' (default) | numeric array
Values for the X criterion, specified as the comma-separated pair consisting of 'XVals' and a numeric array.
If you specify XVals, then perfcurve computes X and Y and the pointwise confidence bounds for Y (when applicable) only for the specified XVals.
If you do not specify XVals, then perfcurve computes X and Y values for all scores by default.
Note:
You cannot set 
Example: 'XVals',[0:0.05:1]
Data Types: single | double
'TVals' — Thresholds for the positive class score
'all' (default) | numeric array
Thresholds for the positive class score, specified as the comma-separated pair consisting of 'TVals' and either 'all' or a numeric array.
If TVals is set to 'all' or not specified, and XVals is not specified, then perfcurve returns X, Y, and T values for all scores and computes pointwise confidence bounds for X and Y using threshold averaging.
If TVals is set to a numeric array, then perfcurve returns X, Y, and T values for the specified thresholds and computes pointwise confidence bounds for X and Y at these thresholds using threshold averaging.
Note:
You cannot set 
Example: 'TVals',[0:0.05:1]
Data Types: single | double
'UseNearest' — Indicator to use the nearest values in the data
'on' (default) | 'off'
Indicator to use the nearest values in the data instead of the specified numeric XVals or TVals, specified as the comma-separated pair consisting of 'UseNearest' and either 'on' or 'off'.
If you specify numeric XVals and set UseNearest to 'on', then perfcurve returns the nearest unique X values found in the data, and it returns the corresponding values of Y and T.
If you specify numeric XVals and set UseNearest to 'off', then perfcurve returns the sorted XVals.
If you compute confidence bounds by cross validation or bootstrap, then this parameter is always 'off'.
Example: 'UseNearest','off'
'ProcessNaN' — perfcurve method for processing NaN scores
'ignore' (default) | 'addtofalse'
perfcurve method for processing NaN scores, specified as the comma-separated pair consisting of 'ProcessNaN' and 'ignore' or 'addtofalse'.
If ProcessNaN is 'ignore', then perfcurve removes observations with NaN scores from the data.
If ProcessNaN is 'addtofalse', then perfcurve adds instances with NaN scores to false classification counts in the respective class. That is, perfcurve always counts instances from the positive class as false negative (FN), and it always counts instances from the negative class as false positive (FP).
Example: 'ProcessNaN','addtofalse'
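To make the difference between the two settings concrete, here is a small base-MATLAB sketch that counts false negatives and false positives by hand at a single threshold. The labels, scores, threshold, and variable names are hypothetical; the sketch only mirrors the counting rule described above and is not perfcurve's own code.

```matlab
% Made-up data: 1 = positive class, 0 = negative class
labels = [1 1 0 0 1 0]';
scores = [0.9 NaN 0.2 NaN 0.4 0.7]';
thr    = 0.5;                            % hypothetical threshold

isNaNScore = isnan(scores);
keep       = ~isNaNScore;

% 'ignore': observations with NaN scores are dropped before counting
tpIgnore = sum(scores(keep) >= thr & labels(keep) == 1);
fnIgnore = sum(scores(keep) <  thr & labels(keep) == 1);
fpIgnore = sum(scores(keep) >= thr & labels(keep) == 0);

% 'addtofalse': NaN-score positives count as FN, NaN-score negatives as FP
fnAdd = fnIgnore + sum(isNaNScore & labels == 1);
fpAdd = fpIgnore + sum(isNaNScore & labels == 0);
```

With 'addtofalse', the NaN-score observations stay in the denominators P and N, so the resulting rates are more pessimistic than with 'ignore'.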
'Prior' — Prior probabilities for positive and negative classes
'empirical' (default) | 'uniform' | array with two elements
Prior probabilities for positive and negative classes, specified as the comma-separated pair consisting of 'Prior' and 'empirical', 'uniform', or an array with two elements.
If Prior is 'empirical', then perfcurve derives prior probabilities from class frequencies.
If Prior is 'uniform', then perfcurve sets all prior probabilities to be equal.
Example: 'Prior',[0.3,0.7]
Data Types: single | double | char
'Cost' — Misclassification costs
[0 0.5;0.5 0] (default) | 2-by-2 matrix
Misclassification costs, specified as the comma-separated pair consisting of 'Cost' and a 2-by-2 matrix, containing [Cost(PP),Cost(NP);Cost(PN),Cost(NN)].
Cost(NP) is the cost of misclassifying a positive class as a negative class. Cost(PN) is the cost of misclassifying a negative class as a positive class. Usually, Cost(PP) = 0 and Cost(NN) = 0, but perfcurve allows you to specify nonzero costs for correct classification as well.
Example: 'Cost',[0 0.7;0.3 0]
Data Types: single | double
'Alpha' — Significance level
0.05 (default) | scalar value in the range 0 through 1
Significance level for the confidence bounds, specified as the comma-separated pair consisting of 'Alpha' and a scalar value in the range 0 through 1. perfcurve computes 100*(1 – α) percent pointwise confidence bounds for X, Y, T, and AUC, where α is the value of 'Alpha'.
Example: 'Alpha',0.01 specifies 99% confidence bounds
Data Types: single | double
'Weights' — Observation weights
(default) | vector of nonnegative scalar values | cell array of vectors of nonnegative scalar values
Observation weights, specified as the comma-separated pair consisting of 'Weights' and a vector of nonnegative scalar values. This vector must have as many elements as scores or labels do.
If scores and labels are in cell arrays and you need to supply Weights, the weights must be in a cell array as well. In this case, every element in Weights must be a numeric vector with as many elements as the corresponding element in scores. For example, numel(weights{1}) == numel(scores{1}).
When perfcurve computes the X, Y and T or confidence bounds using cross-validation, it uses these observation weights instead of observation counts.
When perfcurve computes confidence bounds using bootstrap, it samples N out of N observations with replacement, using these weights as multinomial sampling probabilities.
Data Types: single | double | cell
'NBoot' — Number of bootstrap replicas
0 (default) | positive integer
Number of bootstrap replicas for computation of confidence bounds, specified as the comma-separated pair consisting of 'NBoot' and a positive integer. The default value 0 means the confidence bounds are not computed.
If labels and scores are cell arrays, this parameter must be 0 because perfcurve can use either cross-validation or bootstrap to compute confidence bounds, but not both at once.
Example: 'NBoot',500
Data Types: single | double
'BootType' — Confidence interval type for bootci
'bca' (default) | 'norm' | 'per' | 'cper' | 'stud'
Confidence interval type for bootci to use to compute confidence bounds, specified as the comma-separated pair consisting of 'BootType' and one of the following:
'bca' — Bias corrected and accelerated percentile method
'norm' or 'normal' — Normal approximated interval with bootstrapped bias and standard error
'per' or 'percentile' — Percentile method
'cper' or 'corrected percentile' — Bias corrected percentile method
'stud' or 'student' — Studentized confidence interval
Example: 'BootType','cper'
'BootArg' — Optional input arguments for bootci
[ ] (default)
Optional input arguments for bootci to compute confidence bounds, specified as the comma-separated pair consisting of 'BootArg' and one of the inputs or name-value pair arguments that bootci accepts.
Example: 'BootArg',{'stderr',stderr} specifies the standard error of the bootstrap statistics
'Options' — Options for controlling the computation of confidence intervals
[] (default) | structure array returned by statset
Options for controlling the computation of confidence intervals, specified as the comma-separated pair consisting of 'Options' and a structure array returned by statset. These options require Parallel Computing Toolbox™. perfcurve uses this argument for computing pointwise confidence bounds only. To compute these bounds, you must pass cell arrays for labels and scores or set NBoot to a positive integer.
This table summarizes the available options.
Option  Description 

'UseParallel' 

'UseSubstreams' 

'Streams'  A RandStream object,
or a cell array of such objects. If you specify Streams ,
use a single object, except when:
In that case, use a cell array of the same size
as the parallel pool. If a parallel pool is not open, then 
If 'UseParallel' is true and 'UseSubstreams' is false, then the length of 'Streams' must equal the number of workers used by perfcurve. If a parallel pool is already open, then the length of 'Streams' is the size of the parallel pool. If a parallel pool is not already open, then MATLAB® might open a pool for you, depending on your installation and preferences. To ensure more predictable results, use parpool and explicitly create a parallel pool before invoking perfcurve and setting 'Options',statset('UseParallel',true).
Example: 'Options',statset('UseParallel',true)
Data Types: struct
X — x-coordinates for the performance curve
vector, fpr (default) | m-by-3 matrix
x-coordinates for the performance curve, returned as a vector or an m-by-3 matrix. By default, X values are the false positive rate, FPR (fallout or 1 – specificity). To change X, use the XCrit name-value pair argument.
If perfcurve does not compute the pointwise confidence bounds, or if it computes them using vertical averaging, then X is a vector.
If perfcurve computes the confidence bounds using threshold averaging, then X is an m-by-3 matrix, where m is the number of fixed threshold values. The first column of X contains the mean value. The second and third columns contain the lower bound and the upper bound, respectively, of the pointwise confidence bounds.
Y — y-coordinates for the performance curve
vector, tpr (default) | m-by-3 matrix
y-coordinates for the performance curve, returned as a vector or an m-by-3 matrix. By default, Y values are the true positive rate, TPR (recall or sensitivity). To change Y, use the YCrit name-value pair argument.
If perfcurve does not compute the pointwise confidence bounds, then Y is a vector.
If perfcurve computes the confidence bounds, then Y is an m-by-3 matrix, where m is the number of fixed X values or thresholds (T values). The first column of Y contains the mean value. The second and third columns contain the lower bound and the upper bound, respectively, of the pointwise confidence bounds.
T — Thresholds on classifier scores
vector | m-by-3 matrix
Thresholds on classifier scores for the computed values of X and Y, returned as a vector or m-by-3 matrix.
If perfcurve does not compute the pointwise confidence bounds, or computes them using threshold averaging, then T is a vector.
If perfcurve computes the confidence bounds using vertical averaging, T is an m-by-3 matrix, where m is the number of fixed X values. The first column of T contains the mean value. The second and third columns contain the lower bound and the upper bound, respectively, of the pointwise confidence bounds.
For each threshold, TP is the count of true positive observations with scores greater than or equal to this threshold, and FP is the count of false positive observations with scores greater than or equal to this threshold. perfcurve defines negative counts, TN and FN, in a similar way. The function then sorts the thresholds in the descending order that corresponds to the ascending order of positive counts.
For the m distinct thresholds found in the array of scores, perfcurve returns the X, Y and T arrays with m + 1 rows. perfcurve sets elements T(2:m+1) to the distinct thresholds, and T(1) replicates T(2). By convention, T(1) represents the highest 'reject all' threshold, and perfcurve computes the corresponding values of X and Y for TP = 0 and FP = 0. The T(end) value is the lowest 'accept all' threshold for which TN = 0 and FN = 0.
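The threshold convention above can be sketched in a few lines of base MATLAB. This is a minimal illustration under assumed data, not perfcurve's implementation: labels, scores, and the helper variable names (thr, predictedPos, and so on) are hypothetical, and the sketch ignores weights, priors, and NaN handling.

```matlab
% Made-up data: 1 = positive class, 0 = negative class
labels = [1 1 0 1 0 0 1 0]';
scores = [0.9 0.8 0.7 0.6 0.55 0.4 0.3 0.1]';

thr = sort(unique(scores),'descend');   % m distinct thresholds, highest first
m   = numel(thr);
T   = [thr(1); thr];                    % T(1) replicates T(2): the 'reject all' row

TP = zeros(m+1,1);                      % row 1 keeps TP = 0 and FP = 0
FP = zeros(m+1,1);
for k = 1:m
    predictedPos = scores >= thr(k);    % scores greater than or equal to the threshold
    TP(k+1) = sum(predictedPos & labels == 1);
    FP(k+1) = sum(predictedPos & labels == 0);
end

P = sum(labels == 1);                   % P = TP + FN
N = sum(labels == 0);                   % N = TN + FP
X = FP / N;                             % false positive rate
Y = TP / P;                             % true positive rate
```

The first row of X and Y is the 'reject all' point (0,0), and the last row, at the lowest score, is the 'accept all' point (1,1).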
AUC — Area under the curve
scalar value | 3-by-1 vector
Area under the curve (AUC) for the computed values of X and Y, returned as a scalar value or a 3-by-1 vector.
If perfcurve does not compute the pointwise confidence bounds, AUC is a scalar value.
If perfcurve computes the confidence bounds using vertical averaging, AUC is a 3-by-1 vector. The first element of AUC contains the mean value. The second and third elements contain the lower bound and the upper bound, respectively, of the confidence bound.
For a perfect classifier, AUC = 1. For a classifier that randomly assigns observations to classes, AUC = 0.5.
If you set XVals to 'all' (default), then perfcurve computes AUC using the returned X and Y values.
If XVals is a numeric array, then perfcurve computes AUC using X and Y values from all distinct scores in the interval, which are specified by the smallest and largest elements of XVals. More precisely, perfcurve finds X values for all distinct thresholds as if XVals were set to 'all', and then uses a subset of these (with corresponding Y values) between min(XVals) and max(XVals) to compute AUC.
perfcurve uses trapezoidal approximation to estimate the area. If the first or last value of X or Y is NaN, then perfcurve removes it to allow calculation of AUC. This takes care of criteria that produce NaNs for the special 'reject all' or 'accept all' thresholds, for example, positive predictive value (PPV) or negative predictive value (NPV).
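The trapezoidal estimate with NaN trimming can be sketched as follows in base MATLAB. The curve points are made up for illustration (a precision-like Y that is undefined at the first point), and the trimming logic is an assumption about how the description above could be realized, not perfcurve's actual code.

```matlab
% Hypothetical curve points; Y(1) is NaN, as can happen for PPV at 'reject all'
X = [0; 0.25; 0.5; 0.75; 1];
Y = [NaN; 1; 1; 0.75; 0.8];

% Drop a NaN at the first or the last point only
if isnan(X(1)) || isnan(Y(1))
    X(1) = [];  Y(1) = [];
end
if isnan(X(end)) || isnan(Y(end))
    X(end) = [];  Y(end) = [];
end

AUC = trapz(X,Y);   % trapezoidal approximation of the area under Y(X)
```

Because the first point is removed, the area here covers only the interval from X = 0.25 to X = 1.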
OPTROCPT — Optimal operating point of the ROC curve
1-by-2 array
Optimal operating point of the ROC curve, returned as a 1-by-2 array with false positive rate (FPR) and true positive rate (TPR) values for the optimal ROC operating point.
perfcurve computes OPTROCPT for the standard ROC curve only, and sets it to NaNs otherwise. To obtain the optimal operating point for the ROC curve, perfcurve first finds the slope, S, using
$$S=\frac{\text{Cost}(PN)-\text{Cost}(NN)}{\text{Cost}(NP)-\text{Cost}(PP)}\cdot\frac{N}{P}$$
Cost(NP) is the cost of misclassifying a positive class as a negative class. Cost(PN) is the cost of misclassifying a negative class as a positive class.
P = TP + FN and N = TN + FP. They are the total instance counts in the positive and negative class, respectively.
perfcurve then finds the optimal operating point by moving the straight line with slope S from the upper left corner of the ROC plot (FPR = 0, TPR = 1) down and to the right, until it intersects the ROC curve.
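One way to read this geometric description is that the first ROC point the line touches is the point with the largest intercept Y – S*X. The base-MATLAB sketch below uses that reading; it is an interpretation under assumed data, not necessarily how perfcurve computes OPTROCPT. The ROC points, class counts, and variable names (optPt, idx) are hypothetical.

```matlab
% Hypothetical ROC points (FPR in X, TPR in Y) and class counts
X = [0; 0; 0.1; 0.2; 0.5; 1];
Y = [0; 0.4; 0.8; 0.85; 0.95; 1];
P = 50;                                  % positive instances, P = TP + FN
N = 50;                                  % negative instances, N = TN + FP

% Cost matrix laid out as [Cost(PP) Cost(NP); Cost(PN) Cost(NN)]
cost = [0 0.5; 0.5 0];

% Slope of the iso-cost line
S = (cost(2,1) - cost(2,2)) / (cost(1,2) - cost(1,1)) * N / P;

% A line of slope S lowered from (0,1) first meets the curve at the
% point that maximizes the intercept Y - S*X
[~,idx] = max(Y - S*X);
optPt = [X(idx) Y(idx)];
```

Raising Cost(PN) relative to Cost(NP), or N relative to P, steepens the line and moves the selected point toward lower false positive rates.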
SUBY — Values for negative subclasses
array
Values for negative subclasses, returned as an array.
If you specify only one negative class, then SUBY is identical to Y.
If you specify k negative classes, then SUBY is a matrix of size m-by-k, where m is the number of returned values for X and Y, and k is the number of negative classes. perfcurve computes Y values by summing counts over all negative classes.
SUBY gives values of the Y criterion for each negative class separately. For each negative class, perfcurve places a new column in SUBY and fills it with Y values for true negative (TN) and false positive (FP) counted just for this class.
SUBYNAMES — Negative class names
cell array
Negative class names, returned as a cell array.
If you provide an input array of negative class names, NegClass, then perfcurve copies the names into SUBYNAMES.
If you do not provide NegClass, then perfcurve extracts SUBYNAMES from the input labels. The order of SUBYNAMES is the same as the order of columns in SUBY. That is, SUBY(:,1) is for negative class SUBYNAMES{1}, SUBY(:,2) is for negative class SUBYNAMES{2}, and so on.
If you supply cell arrays for labels and scores, or if you set NBoot to a positive integer, then perfcurve returns pointwise confidence bounds for X, Y, T, and AUC. You cannot supply cell arrays for labels and scores and set NBoot to a positive integer at the same time.
perfcurve resamples data to compute confidence bounds using either cross validation or bootstrap.
Cross-validation — If you supply cell arrays for labels and scores, then perfcurve uses cross-validation and treats elements in the cell arrays as cross-validation folds. labels can be a cell array of numeric vectors, logical vectors, character matrices, cell arrays of strings, or categorical vectors. All elements in labels must have the same type. scores can be a cell array of numeric vectors. The cell arrays for labels and scores must have the same number of elements. The number of labels in cell j of labels must be equal to the number of scores in cell j of scores for any j in the range from 1 to the number of elements in scores.
Bootstrap — If you set NBoot to a positive integer n, perfcurve generates n bootstrap replicas to compute pointwise confidence bounds. If you use XCrit or YCrit to set the criterion for X or Y to an anonymous function, perfcurve can compute confidence bounds only using bootstrap.
perfcurve estimates the confidence bounds using one of two methods:
Vertical averaging (VA) — perfcurve estimates confidence bounds on Y and T at fixed values of X. That is, perfcurve takes samples of the ROC curves for fixed X values, averages the corresponding Y and T values, and computes the standard errors. You can use the XVals name-value pair argument to fix the X values for computing confidence bounds. If you do not specify XVals, then perfcurve computes the confidence bounds at all X values.
Threshold averaging (TA) — perfcurve takes samples of the ROC curves at fixed thresholds T for the positive class score, averages the corresponding X and Y values, and estimates the confidence bounds. You can use the TVals name-value pair argument to use this method for computing confidence bounds. If you set TVals to 'all' or do not specify TVals or XVals, then perfcurve returns X, Y, and T values for all scores and computes pointwise confidence bounds for Y and X using threshold averaging.
When you compute the confidence bounds, Y is an m-by-3 array, where m is the number of fixed X values or thresholds (T values). The first column of Y contains the mean value. The second and third columns contain the lower bound and the upper bound, respectively, of the pointwise confidence bounds. AUC is a row vector with three elements, following the same convention. If perfcurve computes the confidence bounds using VA, then T is an m-by-3 matrix, and X is a column vector. If perfcurve uses TA, then X is an m-by-3 matrix and T is a column vector.
perfcurve returns pointwise confidence bounds. It does not return a simultaneous confidence band for the entire curve.