Compare Results for Regression and Tobit EAD Models
This example shows how to use fitEADModel to create a Regression model and a Tobit model for exposure at default (EAD) and then compare the results.
Load EAD Data
Load the EAD data.
load EADData.mat
head(EADData)
    UtilizationRate    Age     Marriage        Limit         Drawn          EAD
    _______________    ___    ___________    __________    __________    __________
        0.24359        25     not married         44776         10907         44740
        0.96946        44     not married    2.1405e+05    2.0751e+05         40678
              0        40     married        1.6581e+05             0    1.6567e+05
        0.53242        38     not married    1.7375e+05         92506        1593.5
         0.2583        30     not married         26258        6782.5        54.175
        0.17039        54     married        1.7357e+05         29575        576.69
        0.18586        27     not married         19590          3641        998.49
        0.85372        42     not married    2.0712e+05    1.7682e+05    1.6454e+05
rng('default');
NumObs = height(EADData);
c = cvpartition(NumObs,'HoldOut',0.4);
TrainingInd = training(c);
TestInd = test(c);
Select Model Type
Select a Regression and a Tobit model type.
ModelTypeR = "Regression";
ModelTypeT = "Tobit";
Select Conversion Measure
Select the conversion measure for the EAD response values.
ConversionMeasure = "LCF";
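As a rough illustration of what this conversion measure represents, the following sketch computes the limit conversion factor by hand for the first three rows displayed by head(EADData). It assumes that LCF is the ratio of EAD to the credit limit. The variable names limit, ead, and lcf are illustrative and are not part of the toolbox.

```matlab
% Limit and EAD values copied from the first three rows of head(EADData).
% The displayed values are rounded, so the results are approximate.
limit = [44776; 2.1405e+05; 1.6581e+05];
ead   = [44740; 40678; 1.6567e+05];

% Assumed definition: limit conversion factor = EAD / Limit.
lcf = ead ./ limit;
disp(lcf);
```

The first value is about 0.9992, which agrees with the first Observed value in the calibration tables later in this example.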
Create Regression EAD Model
Use fitEADModel to create a Regression model using EADData.
eadModelRegression = fitEADModel(EADData,ModelTypeR,'PredictorVars',{'UtilizationRate','Age','Marriage'}, ...
    'ConversionMeasure',ConversionMeasure,'DrawnVar','Drawn','LimitVar','Limit','ResponseVar','EAD');
disp(eadModelRegression);
  Regression with properties:

    ConversionTransform: "logit"
      BoundaryTolerance: 1.0000e-07
                ModelID: "Regression"
            Description: ""
        UnderlyingModel: [1x1 classreg.regr.CompactLinearModel]
          PredictorVars: ["UtilizationRate"    "Age"    "Marriage"]
            ResponseVar: "EAD"
               LimitVar: "Limit"
               DrawnVar: "Drawn"
      ConversionMeasure: "lcf"
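The ConversionTransform and BoundaryTolerance properties in this display can be pictured with a small sketch. It assumes that the tolerance clamps the conversion values into [tol, 1 - tol] before the logit is applied, which keeps the transform finite at 0 and 1. The toolbox's internal handling may differ, and all variable names here are illustrative.

```matlab
% Illustrative LCF values, including the boundary values 0 and 1.
lcf = [0; 0.25; 0.5; 0.99919; 1];

% Assumption: the tolerance clamps values into [tol, 1 - tol] so that the
% logit stays finite. The toolbox's internal handling may differ.
tol = 1e-7;
lcfClamped = min(max(lcf, tol), 1 - tol);

% Logit transform and its inverse.
lcfLogit = log(lcfClamped ./ (1 - lcfClamped));
lcfBack  = 1 ./ (1 + exp(-lcfLogit));

% Columns: original, clamped, logit, back-transformed.
disp([lcf lcfClamped lcfLogit lcfBack]);
```

Without the clamping step, the logit of 0 and 1 would be -Inf and Inf, and a linear regression on the transformed response could not be fitted.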
Display the underlying model. The underlying Regression model's response variable is the logit transformation of the LCF-converted EAD response data (EAD_lcf_logit in the display). Use the 'BoundaryTolerance', 'LimitVar', and 'DrawnVar' name-value arguments to modify the transformation.
disp(eadModelRegression.UnderlyingModel);
Compact linear regression model:
    EAD_lcf_logit ~ 1 + UtilizationRate + Age + Marriage

Estimated Coefficients:
                            Estimate        SE         tStat       pValue
                            _________    _________    _______    __________
    (Intercept)               -2.4745      0.29892    -8.2781    1.6448e-16
    UtilizationRate            6.0045      0.19901     30.172    7.703e-182
    Age                     -0.020095    0.0073019     -2.752     0.0059471
    Marriage_not married     -0.03509      0.13935    -0.2518        0.8012

Number of observations: 4378, Error degrees of freedom: 4374
Root Mean Squared Error: 4.48
R-squared: 0.173, Adjusted R-Squared: 0.173
F-statistic vs. constant model: 305, p-value = 5.7e-180
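To see how the estimated coefficients relate to a prediction on the LCF scale, the following sketch evaluates the linear predictor by hand and applies the inverse logit. It assumes that the conversion-measure prediction is the inverse logit of the linear predictor. The borrower values come from the first row of head(EADData), and the variable names are illustrative.

```matlab
% Coefficients copied from the displayed compact linear model (rounded).
b0 = -2.4745;
bUtil = 6.0045;
bAge = -0.020095;
bNotMarried = -0.03509;

% Illustrative borrower: first row of head(EADData).
utilizationRate = 0.24359;
age = 25;
notMarried = 1;   % dummy variable for Marriage_not married

% Linear predictor on the logit scale.
eta = b0 + bUtil*utilizationRate + bAge*age + bNotMarried*notMarried;

% Assumed back-transformation: inverse logit gives the predicted LCF.
lcfHat = 1 / (1 + exp(-eta));
fprintf('Linear predictor: %.4f, predicted LCF: %.4f\n', eta, lcfHat);
```

The result is about 0.175, close to the first Predicted_Regression value (0.17519) in the calibration table later in this example. Small differences are expected because the coefficients are copied from rounded output.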
Create Tobit EAD Model
Use fitEADModel to create a Tobit model using EADData.
eadModelTobit = fitEADModel(EADData,ModelTypeT,'PredictorVars',{'UtilizationRate','Age','Marriage'}, ...
    'ConversionMeasure',ConversionMeasure,'DrawnVar','Drawn','LimitVar','Limit','ResponseVar','EAD', ...
    'CensoringSide',"right",'LeftLimit',0.4,'RightLimit',0.5);
disp(eadModelTobit);
  Tobit with properties:

        CensoringSide: "right"
            LeftLimit: 0.4000
           RightLimit: 0.5000
              ModelID: "Tobit"
          Description: ""
      UnderlyingModel: [1x1 risk.internal.credit.TobitModel]
        PredictorVars: ["UtilizationRate"    "Age"    "Marriage"]
          ResponseVar: "EAD"
             LimitVar: "Limit"
             DrawnVar: "Drawn"
    ConversionMeasure: "lcf"
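Display the underlying model. The underlying Tobit model's response variable is the LCF conversion of the EAD response data, right-censored at the RightLimit value of 0.5 (EAD_lcf = min(Y*,0.5) in the display). Use the 'LimitVar' and 'DrawnVar' name-value arguments to modify the conversion, the 'CensoringSide', 'RightLimit', and 'LeftLimit' name-value arguments to modify the censoring, and the 'SolverOptions' name-value argument to configure the solver used for the fit.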
disp(eadModelTobit.UnderlyingModel);
Tobit regression model, right-censored:
     EAD_lcf = min(Y*,0.5)
     Y* ~ 1 + UtilizationRate + Age + Marriage

Estimated coefficients:
                             Estimate         SE         tStat      pValue
                            __________    _________    ________    _________
    (Intercept)                0.18088     0.021541      8.3972            0
    UtilizationRate            0.42381     0.014164      29.921            0
    Age                     -0.0014564    0.0005244     -2.7772    0.0055057
    Marriage_not married    -0.0040192     0.012014    -0.33454      0.73799
    (Sigma)                    0.27917    0.0043096      64.779            0

Number of observations: 4378
Number of left-censored observations: 0
Number of uncensored observations: 2802
Number of right-censored observations: 1576
Log-likelihood: -1756.98
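The relationship EAD_lcf = min(Y*,0.5) in this display can be explored with a by-hand calculation. The sketch below uses the standard expected value of a right-censored normal variable, E[min(Y*,R)] = mu*Phi(z) - sigma*phi(z) + R*(1 - Phi(z)) with z = (R - mu)/sigma. Whether the toolbox computes its predictions exactly this way is an assumption. The borrower values come from the first row of head(EADData), and the variable names are illustrative.

```matlab
% Coefficients copied from the displayed Tobit model (rounded).
b0 = 0.18088;
bUtil = 0.42381;
bAge = -0.0014564;
bNotMarried = -0.0040192;
sigma = 0.27917;
rightLimit = 0.5;

% Illustrative borrower: first row of head(EADData).
utilizationRate = 0.24359;
age = 25;
notMarried = 1;   % dummy variable for Marriage_not married

% Mean of the latent variable Y*.
mu = b0 + bUtil*utilizationRate + bAge*age + bNotMarried*notMarried;

% Standard normal pdf and cdf using base MATLAB functions only.
z = (rightLimit - mu) / sigma;
normalPdf = exp(-z^2/2) / sqrt(2*pi);
normalCdf = 0.5 * erfc(-z / sqrt(2));

% Expected value of the right-censored response min(Y*, rightLimit).
expectedLcf = mu*normalCdf - sigma*normalPdf + rightLimit*(1 - normalCdf);
fprintf('Latent mean: %.4f, expected censored LCF: %.4f\n', mu, expectedLcf);
```

The result is about 0.217, close to the first Predicted_Tobit value (0.21657) in the calibration table later in this example. Because the response is capped at the right limit, the expected value can never exceed 0.5.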
Predict EAD for Regression Model
EAD prediction operates on the underlying compact statistical model and then transforms the predicted values back to the EAD scale. You can call the predict function with different options for the 'ModelLevel' name-value argument.
predictedEADRegression = predict(eadModelRegression,EADData(TestInd,:),'ModelLevel','ead');
predictedConversionRegression = predict(eadModelRegression,EADData(TestInd,:),'ModelLevel','ConversionMeasure');
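The two model levels used here differ only in scale. As a minimal sketch, assuming that the 'ead' level inverts the LCF conversion by multiplying the predicted LCF by the credit limit, a single prediction can be converted by hand. The inputs are the first Predicted_Regression value reported later in this example and the Limit from the first row of head(EADData), which appears to be the matching observation.

```matlab
% Predicted LCF for one observation (first Predicted_Regression value
% shown in this example) and the Limit of the first row of head(EADData).
predictedLcf = 0.17519;
limit = 44776;

% Assumed inverse of LCF = EAD / Limit.
predictedEad = predictedLcf * limit;
fprintf('Predicted EAD: %.1f\n', predictedEad);
```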
Predict EAD for Tobit Model
EAD prediction operates on the underlying compact statistical model and then transforms the predicted values back to the EAD scale. You can call the predict function with different options for the 'ModelLevel' name-value argument.
predictedEADTobit = predict(eadModelTobit,EADData(TestInd,:),'ModelLevel','ead');
predictedConversionTobit = predict(eadModelTobit,EADData(TestInd,:),'ModelLevel','ConversionMeasure');
Validate EAD Regression Model
For model validation of the Regression model, use modelDiscrimination, modelDiscriminationPlot, modelCalibration, and modelCalibrationPlot.
Use modelDiscrimination and then modelDiscriminationPlot to plot the ROC curve.
ModelLevel = "ConversionMeasure";
[DiscMeasureRegression, DiscDataRegression] = modelDiscrimination(eadModelRegression,EADData(TestInd,:),'ShowDetails',true,'ModelLevel',ModelLevel)
DiscMeasureRegression=1×3 table
AUROC Segment SegmentCount
_______ __________ ____________
Regression 0.70898 "all_data" 1751
DiscDataRegression=1534×3 table
X Y T
__________ _________ _______
0 0 0.95722
0 0.0027778 0.95722
0 0.0041667 0.9566
0 0.0055556 0.95639
0 0.0083333 0.95576
0.00096993 0.0097222 0.95555
0.00096993 0.016667 0.9549
0.0019399 0.016667 0.95474
0.0019399 0.018056 0.95468
0.0038797 0.018056 0.95403
0.0048497 0.019444 0.95381
0.0058196 0.019444 0.95314
0.0067895 0.020833 0.95291
0.0067895 0.022222 0.95233
0.0087294 0.026389 0.95224
0.0087294 0.031944 0.952
⋮
modelDiscriminationPlot(eadModelRegression,EADData(TestInd, :),'ModelLevel',ModelLevel,'SegmentBy','Marriage');
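An AUROC for a continuous response requires a binary label. The sketch below shows one way such a measure can be computed by hand, assuming that the observed values are labeled 1 when they are at or above their mean and 0 otherwise. That discretization rule is an assumption here, not a statement of what modelDiscrimination does internally. The data are the first eight observed and predicted LCF pairs reported for the Regression model in this example, so the result is not comparable to the full test-set AUROC.

```matlab
% First eight observed and predicted LCF values reported for the
% Regression model in this example.
observed  = [0.99919; 0.0020632; 0.03741; 0.75518; 0.00076139; 0.9998; 0.0056134; 0.048451];
predicted = [0.17519; 0.17343; 0.7527; 0.89867; 0.042389; 0.95153; 0.1338; 0.043424];

% Assumed discretization: label is 1 when observed >= mean(observed).
label = observed >= mean(observed);
pos = predicted(label);
neg = predicted(~label);

% AUROC as the fraction of (positive, negative) pairs that are ranked
% correctly, counting ties as one half.
[P, N] = ndgrid(pos, neg);
auroc = (sum(P(:) > N(:)) + 0.5*sum(P(:) == N(:))) / numel(P);
fprintf('AUROC on the eight-row sample: %.4f\n', auroc);
```

This pairwise form is equivalent to the area under the ROC curve and is convenient for small samples. For large data sets a rank-based computation is more efficient.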
Use modelCalibration and then modelCalibrationPlot to show a scatter plot of the predictions.
YData = "Observed";
[CalMeasureRegression,CalDataRegression] = modelCalibration(eadModelRegression,EADData(TestInd,:),'ModelLevel',ModelLevel)
CalMeasureRegression=1×4 table
RSquared RMSE Correlation SampleMeanError
________ _______ ___________ _______________
Regression 0.16148 0.41023 0.40184 -0.025994
CalDataRegression=1751×3 table
Observed Predicted_Regression Residuals_Regression
__________ ____________________ ____________________
0.99919 0.17519 0.824
0.0020632 0.17343 -0.17137
0.03741 0.7527 -0.71529
0.75518 0.89867 -0.14349
0.00076139 0.042389 -0.041628
0.9998 0.95153 0.048274
0.0056134 0.1338 -0.12819
0.048451 0.043424 0.0050276
0.01448 0.059339 -0.044858
0.95329 0.67009 0.2832
0.97847 0.939 0.03947
0.71895 0.80122 -0.082271
0.79096 0.3791 0.41186
0.042816 0.52542 -0.4826
0.97169 0.2119 0.75979
0.99182 0.62543 0.36639
⋮
modelCalibrationPlot(eadModelRegression, EADData(TestInd,:), 'ModelLevel', ModelLevel, 'YData', YData);
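The calibration measures can be approximated by hand from the observed and predicted columns. The sketch below assumes that residuals are observed minus predicted, that RSquared is the squared correlation between observed and predicted values, and that SampleMeanError is the mean residual. These assumptions are consistent with the reported tables (for example, 0.40184 squared is about 0.16148), but they are not taken from the function's source. Only the first eight rows are used, so the numbers differ from the full test-set measures.

```matlab
% First eight observed and predicted LCF values from CalDataRegression.
observed  = [0.99919; 0.0020632; 0.03741; 0.75518; 0.00076139; 0.9998; 0.0056134; 0.048451];
predicted = [0.17519; 0.17343; 0.7527; 0.89867; 0.042389; 0.95153; 0.1338; 0.043424];

% Assumed definitions of the calibration measures.
residuals = observed - predicted;
rmse = sqrt(mean(residuals.^2));
c = corrcoef(observed, predicted);
correlation = c(1,2);
rSquared = correlation^2;
sampleMeanError = mean(residuals);

fprintf('RSquared: %.4f, RMSE: %.4f, Correlation: %.4f, SampleMeanError: %.4f\n', ...
    rSquared, rmse, correlation, sampleMeanError);
```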
Validate EAD Tobit Model
For model validation of the Tobit model, use modelDiscrimination, modelDiscriminationPlot, modelCalibration, and modelCalibrationPlot.
Use modelDiscrimination and then modelDiscriminationPlot to plot the ROC curve.
ModelLevel = "ConversionMeasure";
[DiscMeasureTobit,DiscDataTobit] = modelDiscrimination(eadModelTobit,EADData(TestInd,:),'ShowDetails',true,'ModelLevel',ModelLevel)
DiscMeasureTobit=1×3 table
AUROC Segment SegmentCount
_______ __________ ____________
Tobit 0.70909 "all_data" 1751
DiscDataTobit=1534×3 table
X Y T
__________ _________ _______
0 0 0.42178
0 0.0027778 0.42178
0 0.0041667 0.4212
0 0.0055556 0.42076
0.00096993 0.0069444 0.42062
0.00096993 0.0097222 0.42018
0.00096993 0.011111 0.42004
0.00096993 0.018056 0.4196
0.0019399 0.018056 0.4195
0.0029098 0.019444 0.41945
0.0048497 0.019444 0.41901
0.0058196 0.020833 0.41887
0.0058196 0.022222 0.41854
0.0067895 0.022222 0.41842
0.0067895 0.023611 0.41827
0.0067895 0.029167 0.41827
⋮
modelDiscriminationPlot(eadModelTobit,EADData(TestInd, :),'ModelLevel',ModelLevel,'SegmentBy','Marriage');
Use modelCalibration and then modelCalibrationPlot to show a scatter plot of the predictions.
YData = "Observed";
[CalMeasureTobit,CalDataTobit] = modelCalibration(eadModelTobit,EADData(TestInd,:),'ModelLevel',ModelLevel)
CalMeasureTobit=1×4 table
RSquared RMSE Correlation SampleMeanError
________ _______ ___________ _______________
Tobit 0.15929 0.39572 0.39911 0.13366
CalDataTobit=1751×3 table
Observed Predicted_Tobit Residuals_Tobit
__________ _______________ _______________
0.99919 0.21657 0.78261
0.0020632 0.21571 -0.21365
0.03741 0.35115 -0.31374
0.75518 0.39272 0.36245
0.00076139 0.12184 -0.12107
0.9998 0.41744 0.58237
0.0056134 0.19913 -0.19351
0.048451 0.12215 -0.073701
0.01448 0.14323 -0.12875
0.95329 0.33415 0.61914
0.97847 0.41069 0.56778
0.71895 0.3627 0.35624
0.79096 0.27467 0.51629
0.042816 0.30579 -0.26297
0.97169 0.23025 0.74144
0.99182 0.32461 0.66721
⋮
modelCalibrationPlot(eadModelTobit,EADData(TestInd,:),'ModelLevel',ModelLevel,'YData',YData);
Plot Histograms of Observed with Respect to Predicted EAD
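Plot histograms of the observed and predicted LCF values for the Regression model. The values are on the conversion measure scale because the calibration data was computed with ModelLevel set to "ConversionMeasure".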
figure;
histogram(CalDataRegression.Observed);
hold on;
histogram(CalDataRegression.('Predicted_' + ModelTypeR));
legend('Observed','Predicted');
Plot histograms of the observed and predicted LCF values for the Tobit model.
figure;
histogram(CalDataTobit.Observed);
hold on;
histogram(CalDataTobit.('Predicted_' + ModelTypeT));
legend('Observed','Predicted');
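For both the Tobit and Regression models, the Age and UtilizationRate predictors are statistically significant (p-values below 0.01), while the Marriage predictor is not (p-values of 0.73799 and 0.8012, respectively). On the test data, the two models discriminate almost identically (AUROC of 0.70909 for Tobit and 0.70898 for Regression) and have similar R-squared values (0.15929 and 0.16148). They differ more in calibration: the Tobit predictions cannot exceed the right censoring limit of 0.5, which is consistent with its sample mean error of 0.13366 compared with -0.025994 for the Regression model.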
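The validation measures reported in this example can be collected side by side. The following sketch copies the values from the DiscMeasure and CalMeasure tables shown earlier and prints the difference for each measure. The variable names are illustrative.

```matlab
% Values copied from the DiscMeasure and CalMeasure tables in this example.
metricNames = {'AUROC','RSquared','RMSE','SampleMeanError'};
regressionMetrics = [0.70898 0.16148 0.41023 -0.025994];
tobitMetrics      = [0.70909 0.15929 0.39572  0.13366];

% Difference for each measure (Tobit minus Regression).
difference = tobitMetrics - regressionMetrics;

for k = 1:numel(metricNames)
    fprintf('%-16s Regression: %9.5f  Tobit: %9.5f  Difference: %9.5f\n', ...
        metricNames{k}, regressionMetrics(k), tobitMetrics(k), difference(k));
end
```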
See Also
Regression | Tobit | fitEADModel | predict | modelDiscrimination | modelDiscriminationPlot | modelCalibration | modelCalibrationPlot