Tobit
Description
Create and analyze a Tobit model object to calculate the exposure at default (EAD) using this workflow:
Use fitEADModel to create a Tobit model object.
Use predict to predict the EAD.
Use modelDiscrimination to return AUROC and ROC data. You can plot the results using modelDiscriminationPlot.
Use modelCalibration to return the R-squared, RMSE, correlation, and sample mean error of predicted and observed EAD data. You can plot the results using modelCalibrationPlot.
Creation
Description
TobitEADModel = fitEADModel(___,Name=Value) specifies options using one or more name-value arguments in addition to the input arguments in the previous syntax. The optional name-value arguments set the model object properties. For example, eadModel = fitEADModel(EADData,ModelType,PredictorVars={'UtilizationRate','Age','Marriage'},ConversionMeasure="ccf",DrawnVar='Drawn',LimitVar='Limit',ResponseVar='EAD') creates an eadModel object using a Tobit model type.
Input Arguments
data — Data for exposure at default
table
Data for exposure at default, specified as a table.
Data Types: table
ModelType — Model type
string with value "Tobit" | character vector with value 'Tobit'
Model type, specified as a string with the value "Tobit" or a character vector with the value 'Tobit'.
Data Types: char | string
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Example: eadModel = fitEADModel(EADData,ModelType,PredictorVars={'UtilizationRate','Age','Marriage'},ConversionMeasure="ccf",DrawnVar='Drawn',LimitVar='Limit',ResponseVar='EAD')
ModelID — User-defined model ID
"Tobit" (default) | string | character vector
User-defined model ID, specified as ModelID and a string or character vector. The software uses the ModelID text to format outputs, so the text is expected to be short.
Data Types: string | char
Description — User-defined description for model
"" (default) | string | character vector
User-defined description for model, specified as Description and a string or character vector.
Data Types: string | char
PredictorVars — Predictor variables
all columns of data except for ResponseVar (default) | string array | cell array of character vectors
Predictor variables, specified as PredictorVars and a string array or cell array of character vectors. PredictorVars indicates which columns in the data input contain the predictor information. By default, PredictorVars is set to all the columns in the data input except for ResponseVar.
Data Types: string | cell
ResponseVar — Response variable
last column of data (default) | string | character vector
Response variable, specified as ResponseVar and a string or character vector. The response variable contains the EAD data and must be a numeric variable. By default, ResponseVar is set to the last column.
Data Types: string | char
LimitVar — Limit variable
string | character vector
Limit variable, specified as LimitVar and a string or character vector. LimitVar indicates which column in data contains the limit amount. The limit amount value in the data must be a positive numeric value. The limit depends on the loan. For a credit card, the limit is the credit limit; for a mortgage, the limit is the initial loan amount. In general, the limit is the maximum amount that can be borrowed.
Note
LimitVar is required when ConversionMeasure is 'ccf' or 'lcf'. For more information on CCF and LCF, see Conversion Measure Options.
Data Types: string | char
DrawnVar — Drawn variable
string | character vector
Drawn variable, specified as DrawnVar and a string or character vector. The drawn amount is the balance on the account at the time of observation, prior to default, whereas EAD is the balance at the time of default. DrawnVar indicates which column in data contains the drawn amount. The drawn variable value in the data can be a positive or negative numeric value.
Note
DrawnVar is required when ConversionMeasure is 'ccf'.
If the ConversionMeasure is 'lcf', DrawnVar is not required. In this case, DrawnVar is set to "".
For more information on CCF, see Conversion Measure Options.
Data Types: string | char
ConversionMeasure — Conversion measure for EAD response values
"ccf" (default) | character vector with value of 'ccf' or 'lcf' | string with value of "ccf" or "lcf"
Response transform, specified as ConversionMeasure and a character vector or string.
"ccf" — Credit conversion factor (CCF) is the portion of the undrawn amount that will be converted into credit. The undrawn amount is the limit minus the drawn amount. The EAD thus becomes the drawn amount plus the CCF times the undrawn amount (EAD = Drawn + CCF*(Limit - Drawn)).
Note
A Tobit model with "ccf" can be unstable.
"lcf" — Limit conversion factor (LCF) is a fraction of the limit representing the total exposure. The EAD is then defined as the LCF times the limit (EAD = LCF*Limit).
For more information on CCF and LCF, see Conversion Measure Options.
Data Types: string | char
CensoringSide — Censoring side
"both" (default) | character vector with value of 'left', 'right', or 'both' | string with value of "left", "right", or "both"
Censoring side, specified as CensoringSide and a character vector or string. CensoringSide indicates whether the desired Tobit model is left-censored, right-censored, or censored on both sides.
Data Types: string | char
LeftLimit — Left-censoring limit
0 (default) | numeric between 0 and 1
Left-censoring limit, specified as LeftLimit and a scalar numeric between 0 and 1.
Data Types: double
RightLimit — Right-censoring limit
1 (default) | numeric between 0 and 1
Right-censoring limit, specified as RightLimit and a scalar numeric between 0 and 1.
Data Types: double
SolverOptions — optimoptions object
object
Options for fitting, specified as SolverOptions and an optimoptions object that is created using optimoptions from Optimization Toolbox™. The defaults for the optimoptions object are:
"Display" — "none"
"Algorithm" — "sqp"
"MaxFunctionEvaluations" — 500 × Number of model coefficients
"MaxIterations" —
The number of Tobit model coefficients is determined at run time; it depends on the number of predictors and the number of categories in the categorical predictors.
Note
When using optimoptions with a Tobit model, specify the SolverName as fmincon.
Data Types: object
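As a minimal sketch of the SolverOptions argument described above, the following creates an fmincon options object and passes it to fitEADModel. It assumes Optimization Toolbox and Risk Management Toolbox are installed and reuses the EADData table and the fitEADModel call from the example on this page; the specific option values ('iter' display, 200 iterations) are illustrative choices, not recommended settings.

```matlab
% Build an options object for the fmincon solver, as the note above requires.
% The option values here are illustrative assumptions only.
options = optimoptions('fmincon', ...
    'Display','iter', ...        % show solver progress instead of the default "none"
    'Algorithm','sqp', ...       % same algorithm as the documented default
    'MaxIterations',200);

% Fit the Tobit EAD model with the custom solver options.
load EADData.mat
eadModel = fitEADModel(EADData,"Tobit", ...
    PredictorVars={'UtilizationRate','Age','Marriage'}, ...
    ConversionMeasure="lcf",DrawnVar="Drawn",LimitVar="Limit",ResponseVar="EAD", ...
    SolverOptions=options);
disp(eadModel)
```

Switching Display to 'iter' is mainly useful for diagnosing convergence problems, for example with the "ccf" conversion measure, which the ConversionMeasure argument notes can be unstable.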
Properties
ModelID — User-defined model ID
"Tobit" (default) | string
User-defined model ID, returned as a string.
Data Types: string
Description — User-defined description
"" (default) | string
User-defined description, returned as a string.
Data Types: string
UnderlyingModel — Underlying statistical model
compact linear model
This property is read-only.
Underlying statistical model, returned as a compact linear model object. The compact version of the underlying regression model is an instance of the classreg.regr.CompactLinearModel class. For more information, see fitlm and CompactLinearModel.
Data Types: CompactLinearModel
PredictorVars — Predictor variables
all columns of data except for the ResponseVar (default) | string array
Predictor variables, returned as a string array.
Data Types: string
ResponseVar — Response variable
last column of data (default) | string
Response variable, returned as a string.
Data Types: string
LimitVar — Limit variable
string
Limit variable, returned as a string.
Data Types: string
DrawnVar — Drawn variable
string
Drawn variable, returned as a string.
Data Types: string
ConversionMeasure — Conversion measure for EAD response values
"ccf" (default) | string with value of "ccf" or "lcf"
Response transform, returned as a string.
Data Types: string
CensoringSide — Censoring side
"both" (default) | string with value of "left", "right", or "both"
This property is read-only.
Censoring side, returned as a string.
Data Types: string
LeftLimit — Left-censoring limit
0 (default) | numeric between 0 and 1
This property is read-only.
Left-censoring limit, returned as a scalar numeric between 0 and 1.
Data Types: double
RightLimit — Right-censoring limit
1 (default) | numeric between 0 and 1
This property is read-only.
Right-censoring limit, returned as a scalar numeric between 0 and 1.
Data Types: double
Object Functions
predict | Predict exposure at default |
modelDiscrimination | Compute AUROC and ROC data |
modelDiscriminationPlot | Plot ROC curve |
modelCalibration | Compute R-squared, RMSE, correlation, and sample mean error of predicted and observed EADs |
modelCalibrationPlot | Scatter plot of predicted and observed EADs |
Examples
Create Tobit EAD Model
This example shows how to use fitEADModel to create a Tobit model for exposure at default (EAD).
Load EAD Data
Load the EAD data.
load EADData.mat
head(EADData)
    UtilizationRate    Age     Marriage        Limit         Drawn          EAD
    _______________    ___    ___________    __________    __________    __________
        0.24359        25     not married         44776         10907         44740
        0.96946        44     not married    2.1405e+05    2.0751e+05         40678
              0        40     married        1.6581e+05             0    1.6567e+05
        0.53242        38     not married    1.7375e+05         92506        1593.5
         0.2583        30     not married         26258        6782.5        54.175
        0.17039        54     married        1.7357e+05         29575        576.69
        0.18586        27     not married         19590          3641        998.49
        0.85372        42     not married    2.0712e+05    1.7682e+05    1.6454e+05
% Partition the data into training and test sets, holding out 40% for testing.
rng('default');
NumObs = height(EADData);
c = cvpartition(NumObs,'HoldOut',0.4);
TrainingInd = training(c);
TestInd = test(c);
Select Model Type
Select a model type for Tobit or Regression.
ModelType = "Tobit";
Select Conversion Measure
Select a conversion measure for the EAD response values.
ConversionMeasure = "LCF";
Create Tobit EAD Model
Use fitEADModel to create a Tobit model using the EADData.
eadModel = fitEADModel(EADData,ModelType,PredictorVars={'UtilizationRate','Age','Marriage'}, ...
    ConversionMeasure=ConversionMeasure,DrawnVar="Drawn",LimitVar="Limit",ResponseVar="EAD");
disp(eadModel);
  Tobit with properties:
        CensoringSide: "both"
            LeftLimit: 0
           RightLimit: 1
              ModelID: "Tobit"
          Description: ""
      UnderlyingModel: [1x1 risk.internal.credit.TobitModel]
        PredictorVars: ["UtilizationRate"    "Age"    "Marriage"]
          ResponseVar: "EAD"
             LimitVar: "Limit"
             DrawnVar: "Drawn"
    ConversionMeasure: "lcf"
Display the underlying model. The underlying model's response variable is the transformation of the EAD response data. Use the 'LimitVar' and 'DrawnVar' name-value arguments to modify the transformation.
disp(eadModel.UnderlyingModel);
Tobit regression model:
     EAD_lcf = max(0,min(Y*,1))
     Y* ~ 1 + UtilizationRate + Age + Marriage
Estimated coefficients:
                             Estimate         SE         tStat       pValue
                            __________    __________    ________    ________
    (Intercept)                0.22735      0.026172      8.6868           0
    UtilizationRate            0.47364      0.016412      28.859           0
    Age                     -0.0013929    0.00063689     -2.1871    0.028789
    Marriage_not married    -0.0068879      0.012314    -0.55936     0.57594
    (Sigma)                    0.36419       0.00388      93.864           0
Number of observations: 4378
Number of left-censored observations: 0
Number of uncensored observations: 4377
Number of right-censored observations: 1
Log-likelihood: -1791.06
Predict EAD
EAD prediction operates on the underlying compact statistical model and then transforms the predicted values back to the EAD scale. You can specify the predict function with different options for the 'ModelLevel' name-value argument.
predictedEAD = predict(eadModel,EADData(TestInd,:),ModelLevel="ead");
predictedConversion = predict(eadModel,EADData(TestInd,:),ModelLevel="ConversionMeasure");
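To make the prediction step concrete, the following sketch recomputes by hand an expected LCF and the corresponding EAD for the first row shown by head(EADData), following the expected-value description in the More About section. It uses only base MATLAB; the coefficient values are copied from the displayed model output, the helper names (stdNormCdf, stdNormPdf) are hypothetical, and the result is an illustration that is not guaranteed to match the output of predict digit for digit.

```matlab
% Standard normal CDF and PDF built from base MATLAB functions (hypothetical helpers).
stdNormCdf = @(z) 0.5 * erfc(-z ./ sqrt(2));
stdNormPdf = @(z) exp(-0.5 * z.^2) ./ sqrt(2*pi);

% Coefficients copied from the displayed Tobit regression model.
b0 = 0.22735; bUtil = 0.47364; bAge = -0.0013929; bNotMarried = -0.0068879;
sigma = 0.36419;
L = 0; R = 1;                      % default left and right censoring limits

% Predictor values from the first row of EADData.
util = 0.24359; age = 25; notMarried = 1; limit = 44776;

% Linear predictor of the latent variable Y*.
mu = b0 + bUtil*util + bAge*age + bNotMarried*notMarried;

% Standardized censoring limits.
aL = (L - mu) / sigma;
aR = (R - mu) / sigma;

% Unconditional expected value of the two-sided censored response.
expectedLCF = stdNormCdf(aL)*L ...
    + mu*(stdNormCdf(aR) - stdNormCdf(aL)) ...
    + sigma*(stdNormPdf(aL) - stdNormPdf(aR)) ...
    + (1 - stdNormCdf(aR))*R;

% Transform back to the EAD scale using EAD = LCF*Limit.
expectedEAD = expectedLCF * limit;
fprintf('Expected LCF: %.4f, expected EAD: %.2f\n', expectedLCF, expectedEAD);
```

Because of censoring, the expected LCF differs from the raw linear predictor mu: probability mass below 0 and above 1 is pulled back to the limits.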
Validate EAD Model
For model validation, use modelDiscrimination, modelDiscriminationPlot, modelCalibration, and modelCalibrationPlot.
Use modelDiscrimination and then modelDiscriminationPlot to plot the ROC curve.
ModelLevel = "ConversionMeasure";
[DiscMeasure1,DiscData1] = modelDiscrimination(eadModel,EADData(TestInd,:),ModelLevel=ModelLevel);
modelDiscriminationPlot(eadModel,EADData(TestInd,:),ModelLevel=ModelLevel,SegmentBy="Marriage");
Use modelCalibration and then modelCalibrationPlot to show a scatter plot of the predictions.
YData = "Observed";
[CalMeasure1,CalData1] = modelCalibration(eadModel,EADData(TestInd,:),ModelLevel=ModelLevel);
modelCalibrationPlot(eadModel,EADData(TestInd,:),ModelLevel=ModelLevel,YData=YData);
Plot histograms of the observed and predicted values.
figure;
histogram(CalData1.Observed);
hold on;
histogram(CalData1.("Predicted_" + ModelType));
legend('Observed','Predicted');
More About
Exposure at Default Tobit Models
The exposure at default (EAD) Tobit models fit a Tobit model to EAD data. Tobit models are "censored" regression models. Tobit models assume that the response variable can be observed only within certain limits, and no value outside the limits can be observed. Using ModelLevel, you can set the Tobit model level to the EAD, CCF, or LCF conversion measures. The EAD model level does not have any range, the CCF conversion measure has a range of -Inf to 1, and the LCF conversion measure has a range of 0 to 1. A distribution of response values where there is a high frequency of observations at the limits is consistent with the model assumptions.
The Tobit model combines the following two formulas:
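In standard Tobit notation, and consistent with the symbol definitions that follow, the two formulas can be written as below. This is a reconstruction; the predictor symbols $X_1,\dots,X_p$ are an assumed notation.

```latex
\[
\begin{aligned}
Y   &= \min\{\max\{L,\,Y^*\},\,R\} \\
Y^* &= \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p + \sigma\epsilon
\end{aligned}
\]
```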
where
Y is the observed response variable, the observed EAD data for an EAD model.
L is the left limit, the lower bound for the response values, typically 0 for EAD models.
R is the right limit, the upper bound for the response values, typically 1 for EAD models.
Y* is a latent, unobserved variable.
βj is the coefficient of the jth predictor (or the intercept for j = 0).
σ is the standard deviation of the error term.
ϵ is the error term, assumed to follow a standard normal distribution.
The first formula above is written using min and max operators and is equivalent to
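Written out case by case (a standard restatement of the min/max form, given here as a reconstruction):

```latex
\[
Y =
\begin{cases}
L   & \text{if } Y^* \le L \\
Y^* & \text{if } L < Y^* < R \\
R   & \text{if } Y^* \ge R
\end{cases}
\]
```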
The standard deviation of the error is explicitly indicated in the formulas. Unlike traditional regression least-squares estimation, where the standard deviation of the error can be inferred from the residuals, for Tobit models the estimation is via maximum likelihood and the standard deviation needs to be handled explicitly during the estimation. If there are p predictor variables, the Tobit model estimates p+2 coefficients, namely, one coefficient for each predictor, plus an intercept, plus a standard deviation.
Three censoring side options are supported in the Tobit EAD models with the CensoringSide name-value argument:
'both' — This option is the default option, with censoring on both sides. The estimation uses left and right limits.
'left' — The left-censored version of the model has no right limit (or R = ∞). The relationship between Y and Y* is Y = max{L,Y*}.
'right' — The right-censored version of the model has no left limit (or L = -∞). The relationship between Y and Y* is Y = min{Y*,R}.
The parameters of the Tobit model are estimated using maximum likelihood. For observation i = 1,...,n, the likelihood function is
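A standard form of the per-observation likelihood for a Tobit model censored on both sides is sketched below, using the $\Phi$ and $\varphi$ notation defined next. The symbol $\mathcal{L}_i$ for the likelihood of observation $i$ and the shorthand $X_i\beta = \beta_0 + \beta_1 X_{i1} + \cdots + \beta_p X_{ip}$ are assumed notation.

```latex
\[
\mathcal{L}_i(\beta,\sigma) =
\begin{cases}
\Phi(L;\,X_i\beta,\,\sigma)        & \text{if } Y_i \le L \\
\varphi(Y_i;\,X_i\beta,\,\sigma)   & \text{if } L < Y_i < R \\
1 - \Phi(R;\,X_i\beta,\,\sigma)    & \text{if } Y_i \ge R
\end{cases}
\]
```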
where
Φ(x;m,s) is the cumulative normal distribution with mean m and standard deviation s.
φ(x;m,s) is the normal density function with mean m and standard deviation s.
This likelihood function is for models censored on both sides. For left-censored models, the right limit has no effect, and the likelihood function has two cases only (R = ∞); likewise for right-censored models (L = -∞).
The log-likelihood function is the sum of the logarithm of the likelihood functions for individual observations
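With $\mathcal{L}_i(\beta,\sigma)$ denoting the likelihood of observation $i$ (a symbol chosen here for illustration), the sum can be written as:

```latex
\[
\log \mathcal{L}(\beta,\sigma) = \sum_{i=1}^{n} \log \mathcal{L}_i(\beta,\sigma)
\]
```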
The parameters are estimated by maximizing the log-likelihood function. The only constraint is that the σ parameter must be positive.
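The following base-MATLAB sketch evaluates a two-sided Tobit log-likelihood of the kind described above for one candidate parameter vector. The toy data, the candidate coefficients, and the helper names are all made up for illustration; this is not the toolbox's internal implementation, which maximizes this kind of objective with fmincon subject to σ > 0.

```matlab
% Standard normal CDF and PDF from base MATLAB functions (hypothetical helpers).
stdNormCdf = @(z) 0.5 * erfc(-z ./ sqrt(2));
stdNormPdf = @(z) exp(-0.5 * z.^2) ./ sqrt(2*pi);

% Toy data (made up): an intercept column plus one predictor.
X = [ones(5,1), [0.1; 0.4; 0.6; 0.9; 1.2]];
y = [0; 0.35; 0.5; 1; 1];      % responses, already censored to [0,1]
L = 0;  R = 1;                 % left and right censoring limits

% Candidate parameters at which to evaluate the log-likelihood.
beta  = [0.1; 0.8];            % intercept and slope
sigma = 0.3;                   % error standard deviation, must be positive

mu = X * beta;                 % linear predictor of the latent variable

% Classify each observation as left-censored, uncensored, or right-censored.
isLeft  = y <= L;
isRight = y >= R;
isMid   = ~isLeft & ~isRight;

% Per-observation log-likelihood contributions, one case per censoring status.
logLik = zeros(size(y));
logLik(isLeft)  = log(stdNormCdf((L - mu(isLeft)) ./ sigma));
logLik(isMid)   = log(stdNormPdf((y(isMid) - mu(isMid)) ./ sigma) ./ sigma);
logLik(isRight) = log(stdNormCdf(-(R - mu(isRight)) ./ sigma));  % 1 - Phi(z) = Phi(-z)

totalLogLik = sum(logLik);
disp(totalLogLik)
```

With one predictor, this objective has p + 2 = 3 free parameters (intercept, slope, and sigma), matching the coefficient count stated earlier.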
To predict an EAD value, Tobit EAD models return the unconditional expected value of the response, given the predictor values
The expression for the expected value can be separated into the cases
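By the law of total expectation over the three censoring cases, this decomposition can be written as follows (a reconstruction under the definitions given earlier):

```latex
\[
E[Y \mid X] = P(Y = L \mid X)\,L
  + E[Y \mid X,\; L < Y < R]\;P(L < Y < R \mid X)
  + P(Y = R \mid X)\,R
\]
```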
Using the previous expression and the properties of the (truncated) normal distribution, it follows that
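A reconstruction of the resulting closed form is given below. The symbols $a_L$ and $a_R$ for the standardized limits and the shorthand $X\beta = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p$ are assumed notation, and $\Phi(a)$ and $\varphi(a)$ abbreviate the standard normal versions $\Phi(a;0,1)$ and $\varphi(a;0,1)$.

```latex
\[
E[Y \mid X] = \Phi(a_L)\,L
  + \bigl(\Phi(a_R) - \Phi(a_L)\bigr)
    \left( X\beta + \sigma\,\frac{\varphi(a_L) - \varphi(a_R)}{\Phi(a_R) - \Phi(a_L)} \right)
  + \bigl(1 - \Phi(a_R)\bigr)\,R,
\qquad
a_L = \frac{L - X\beta}{\sigma},\quad
a_R = \frac{R - X\beta}{\sigma}
\]
```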
This expression applies to the models censored on both sides. For models censored on one side only, the corresponding expressions can be derived from here. For example, for left-censored models, let the R limit in the expression above go to infinity, and the resulting expression is
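For the left-censored case, letting $R \to \infty$ drives $\Phi(a_R) \to 1$ and $\varphi(a_R) \to 0$. With the assumed notation $a_L = (L - X\beta)/\sigma$ and standard normal $\Phi$ and $\varphi$, the limit can be reconstructed as:

```latex
\[
E[Y \mid X] = \Phi(a_L)\,L
  + \bigl(1 - \Phi(a_L)\bigr)
    \left( X\beta + \sigma\,\frac{\varphi(a_L)}{1 - \Phi(a_L)} \right)
\]
```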
Similarly, for right-censored models, the L limit is decreased to minus infinity to get
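Here $\Phi(a_L) \to 0$ and $\varphi(a_L) \to 0$ as $L \to -\infty$, and the term $\Phi(a_L)\,L$ also vanishes. With the assumed notation $a_R = (R - X\beta)/\sigma$ and standard normal $\Phi$ and $\varphi$, the limit can be reconstructed as:

```latex
\[
E[Y \mid X] = \Phi(a_R)
    \left( X\beta - \sigma\,\frac{\varphi(a_R)}{\Phi(a_R)} \right)
  + \bigl(1 - \Phi(a_R)\bigr)\,R
\]
```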
Conversion Measure Options
You can relate the EAD to a scaling variable and derive conversion measures like credit conversion factor (CCF) and limit conversion factor (LCF) using the 'ccf' or 'lcf' options for the ConversionMeasure name-value argument.
The following table summarizes the supported transformations using the 'ccf' or 'lcf' options for the ConversionMeasure name-value argument:
Measure | EAD Formula | Lower Bound | Upper Bound | Inverse Transformation |
---|---|---|---|---|
CCF | EAD = Drawn + CCF × (Limit - Drawn) | -Inf | 1 | CCF = 1 - e^(-CCFt) |
LCF | EAD = LCF × Limit | 0 | 1 | LCF = e^(LCFt) ∕ (1 + e^(LCFt)) |
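As a small illustration of the EAD formulas in the table, the following base-MATLAB sketch maps conversion measures to EAD amounts and recovers the measures again. The limit and drawn values are taken from the first two rows of the example data; the CCF and LCF values are hypothetical.

```matlab
% Limit and drawn amounts (first two rows of the example data).
limit = [44776; 214050];
drawn = [10907; 207510];

% Hypothetical conversion measures, chosen only for illustration.
ccf = [0.5; 0.2];
lcf = [0.6; 0.9];

% EAD implied by each conversion measure.
eadFromCcf = drawn + ccf .* (limit - drawn);   % EAD = Drawn + CCF*(Limit - Drawn)
eadFromLcf = lcf .* limit;                     % EAD = LCF*Limit

% Recover each conversion measure from its EAD.
ccfBack = (eadFromCcf - drawn) ./ (limit - drawn);
lcfBack = eadFromLcf ./ limit;

disp(table(limit, drawn, eadFromCcf, eadFromLcf))
```

The CCF formula divides by the undrawn amount (Limit - Drawn) when inverted, so accounts that are drawn close to their limit produce very large or undefined CCF values; this is one plausible reason a Tobit model with "ccf" can be unstable.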
Version History
Introduced in R2021b
R2023a: modelAccuracy object function is renamed to modelCalibration function
The modelAccuracy object function is renamed to modelCalibration. The use of modelAccuracy is discouraged; use modelCalibration instead.
R2023a: modelAccuracyPlot object function is renamed to modelCalibrationPlot function
The modelAccuracyPlot object function is renamed to modelCalibrationPlot. The use of modelAccuracyPlot is discouraged; use modelCalibrationPlot instead.