Compare Results for Regression and Tobit EAD Models
This example shows how to use fitEADModel to create a Regression model and a Tobit model for exposure at default (EAD) and then compare the results.
Load EAD Data
Load the EAD data.
load EADData.mat
head(EADData)
    UtilizationRate    Age     Marriage         Limit         Drawn          EAD
    _______________    ___    ___________    __________    __________    __________
        0.24359        25     not married         44776         10907         44740
        0.96946        44     not married    2.1405e+05    2.0751e+05         40678
              0        40     married        1.6581e+05             0    1.6567e+05
        0.53242        38     not married    1.7375e+05         92506        1593.5
         0.2583        30     not married         26258        6782.5        54.175
        0.17039        54     married        1.7357e+05         29575        576.69
        0.18586        27     not married         19590          3641        998.49
        0.85372        42     not married    2.0712e+05    1.7682e+05    1.6454e+05
Create a holdout partition that reserves 40% of the observations for testing.
rng('default');
NumObs = height(EADData);
c = cvpartition(NumObs,'HoldOut',0.4);
TrainingInd = training(c);
TestInd = test(c);
Select Model Type
Select a Regression and a Tobit model type.
ModelTypeR = "Regression";
ModelTypeT = "Tobit";
Select Conversion Measure
Select the conversion measure for the EAD response values.
ConversionMeasure = "LCF";
Create Regression EAD Model
Use fitEADModel to create a Regression model using the EADData.
eadModelRegression = fitEADModel(EADData,ModelTypeR,'PredictorVars',{'UtilizationRate','Age','Marriage'}, ...
    'ConversionMeasure',ConversionMeasure,'DrawnVar','Drawn','LimitVar','Limit','ResponseVar','EAD');
disp(eadModelRegression);
Regression with properties:
ConversionTransform: "logit"
BoundaryTolerance: 1.0000e-07
ModelID: "Regression"
Description: ""
UnderlyingModel: [1×1 classreg.regr.CompactLinearModel]
PredictorVars: ["UtilizationRate" "Age" "Marriage"]
ResponseVar: "EAD"
LimitVar: "Limit"
DrawnVar: "Drawn"
ConversionMeasure: "lcf"
Display the underlying model. The underlying Regression model's response variable is the logit transformation of the LCF-converted EAD response data, which is why the response appears as EAD_lcf_logit. Use the 'BoundaryTolerance', 'LimitVar', and 'DrawnVar' name-value arguments to modify the transformation.
disp(eadModelRegression.UnderlyingModel);
Compact linear regression model:
EAD_lcf_logit ~ 1 + UtilizationRate + Age + Marriage
Estimated Coefficients:
Estimate SE tStat pValue
_________ _________ _______ __________
(Intercept) -2.4745 0.29892 -8.2781 1.6448e-16
UtilizationRate 6.0045 0.19901 30.172 7.703e-182
Age -0.020095 0.0073019 -2.752 0.0059471
Marriage_not married -0.03509 0.13935 -0.2518 0.8012
Number of observations: 4378, Error degrees of freedom: 4374
Root Mean Squared Error: 4.48
R-squared: 0.173, Adjusted R-Squared: 0.173
F-statistic vs. constant model: 305, p-value = 5.7e-180
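The response name EAD_lcf_logit can be read as two steps: convert EAD to a limit conversion factor (LCF = EAD / Limit), then apply the logit. The following is a minimal sketch of that reading, not the toolbox's internal code. It assumes that BoundaryTolerance moves LCF values into [tol, 1 - tol] before the logit is applied. The variable names are illustrative, and the third row is a hypothetical boundary case.

```matlab
% Illustrative values: the first two rows come from the head(EADData) output;
% the third row is hypothetical and sits exactly on the boundary (EAD equals Limit).
Limit = [44776; 26258; 10000];
EAD   = [44740; 54.175; 10000];
tol   = 1e-7;                                     % BoundaryTolerance shown in the model display

lcf        = EAD ./ Limit;                        % conversion measure: LCF = EAD / Limit
lcfBounded = min(max(lcf, tol), 1 - tol);         % assumed: keep values strictly inside (0,1)
lcfLogit   = log(lcfBounded ./ (1 - lcfBounded)); % logit transform used as the response
lcfBack    = 1 ./ (1 + exp(-lcfLogit));           % inverse logit returns the bounded LCF

disp(table(lcf, lcfBounded, lcfLogit, lcfBack))
```

Without the bounding step, the hypothetical third row would give a logit of Inf, which a linear regression cannot use as a response value.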
Create Tobit EAD Model
Use fitEADModel to create a Tobit model using the EADData.
eadModelTobit = fitEADModel(EADData,ModelTypeT,'PredictorVars',{'UtilizationRate','Age','Marriage'}, ...
    'ConversionMeasure',ConversionMeasure,'DrawnVar','Drawn','LimitVar','Limit','ResponseVar','EAD', ...
    'CensoringSide',"right",'LeftLimit',0.4,'RightLimit',0.5);
disp(eadModelTobit);
Tobit with properties:
CensoringSide: "right"
LeftLimit: 0.4000
RightLimit: 0.5000
ModelID: "Tobit"
Description: ""
UnderlyingModel: [1×1 risk.internal.credit.TobitModel]
PredictorVars: ["UtilizationRate" "Age" "Marriage"]
ResponseVar: "EAD"
LimitVar: "Limit"
DrawnVar: "Drawn"
ConversionMeasure: "lcf"
Display the underlying model. The underlying Tobit model's response variable is the LCF conversion of the EAD response data, censored according to the censoring settings (EAD_lcf in the display). Use the 'LimitVar', 'DrawnVar', 'CensoringSide', 'RightLimit', 'LeftLimit', and 'SolverOptions' name-value arguments to modify the model.
disp(eadModelTobit.UnderlyingModel);
Tobit regression model, right-censored:
EAD_lcf = min(Y*,0.5)
Y* ~ 1 + UtilizationRate + Age + Marriage
Estimated coefficients:
Estimate SE tStat pValue
__________ __________ ________ _________
(Intercept) 0.18088 0.021561 8.3892 0
UtilizationRate 0.42381 0.014146 29.961 0
Age -0.0014564 0.00052501 -2.774 0.0055599
Marriage_not married -0.0040192 0.012058 -0.33333 0.7389
(Sigma) 0.27917 0.0043151 64.696 0
Number of observations: 4378
Number of left-censored observations: 0
Number of uncensored observations: 2802
Number of right-censored observations: 1576
Log-likelihood: -1756.98
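The display line EAD_lcf = min(Y*,0.5) describes right censoring: the latent value Y* is observed only up to the right limit. The following is a minimal sketch of what that means for the response data, using LCF values that appear in the output tables of this example. Counting a value exactly equal to the limit as censored is an assumption of the sketch, and the variable names are illustrative.

```matlab
% Right censoring at RightLimit = 0.5, as in EAD_lcf = min(Y*,0.5).
rightLimit  = 0.5;
lcfObserved = [0.99919; 0.0020632; 0.03741; 0.75518];  % sample LCF values

lcfCensored     = min(lcfObserved, rightLimit);   % values above the limit are capped at 0.5
isRightCensored = lcfObserved >= rightLimit;      % assumed: boundary values count as censored

fprintf('Right-censored: %d of %d observations\n', ...
    nnz(isRightCensored), numel(lcfObserved));
```

With 'CensoringSide' set to "right", the model display reports no left-censored observations, so the 'LeftLimit' value of 0.4 does not affect this fit.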
Predict EAD for Regression Model
EAD prediction operates on the underlying compact statistical model and then transforms the predicted values back to the EAD scale. You can call predict with different values of the 'ModelLevel' name-value argument to choose the scale of the returned predictions.
predictedEADRegression = predict(eadModelRegression,EADData(TestInd,:),'ModelLevel','ead');
predictedConversionRegression = predict(eadModelRegression,EADData(TestInd,:),'ModelLevel','ConversionMeasure');
Predict EAD for Tobit Model
EAD prediction operates on the underlying compact statistical model and then transforms the predicted values back to the EAD scale. You can call predict with different values of the 'ModelLevel' name-value argument to choose the scale of the returned predictions.
predictedEADTobit = predict(eadModelTobit,EADData(TestInd,:),'ModelLevel','ead');
predictedConversionTobit = predict(eadModelTobit,EADData(TestInd,:),'ModelLevel','ConversionMeasure');
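The two 'ModelLevel' values used above differ only in scale. As a rough sketch, assuming that for the LCF conversion measure the EAD-level prediction is the predicted LCF multiplied by the credit limit (the inverse of LCF = EAD / Limit), the relationship looks like this. The limits are hypothetical and the variable names are illustrative.

```matlab
% Relating conversion-level predictions to EAD-level predictions for LCF.
lcfPredicted = [0.17519; 0.7527];     % sample predicted LCF values
Limit        = [50000; 120000];       % hypothetical credit limits

eadPredicted = lcfPredicted .* Limit; % assumed back-conversion: EAD = LCF * Limit
disp(table(Limit, lcfPredicted, eadPredicted))
```

Other conversion measures depend on the drawn amount as well, so this simple product applies only to the LCF case.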
Validate EAD Regression Model
For model validation of the Regression model, use modelDiscrimination, modelDiscriminationPlot, modelCalibration, and modelCalibrationPlot.
Use modelDiscrimination and then modelDiscriminationPlot to plot the ROC curve.
ModelLevel = "ConversionMeasure";
[DiscMeasureRegression,DiscDataRegression] = modelDiscrimination(eadModelRegression,EADData(TestInd,:),'ShowDetails',true,'ModelLevel',ModelLevel)
DiscMeasureRegression=1×3 table
AUROC Segment SegmentCount
_______ __________ ____________
Regression 0.70898 "all_data" 1751
DiscDataRegression=1534×3 table
X Y T
__________ _________ _______
0 0 0.95722
0 0.0027778 0.95722
0 0.0041667 0.9566
0 0.0055556 0.95639
0 0.0083333 0.95576
0.00096993 0.0097222 0.95555
0.00096993 0.016667 0.9549
0.0019399 0.016667 0.95474
0.0019399 0.018056 0.95468
0.0038797 0.018056 0.95403
0.0048497 0.019444 0.95381
0.0058196 0.019444 0.95314
0.0067895 0.020833 0.95291
0.0067895 0.022222 0.95233
0.0087294 0.026389 0.95224
0.0087294 0.031944 0.952
⋮
modelDiscriminationPlot(eadModelRegression,EADData(TestInd, :),'ModelLevel',ModelLevel,'SegmentBy','Marriage');

Use modelCalibration and then modelCalibrationPlot to show a scatter plot of the predictions.
YData = "Observed";
[CalMeasureRegression,CalDataRegression] = modelCalibration(eadModelRegression,EADData(TestInd,:),'ModelLevel',ModelLevel)
CalMeasureRegression=1×4 table
RSquared RMSE Correlation SampleMeanError
________ _______ ___________ _______________
Regression 0.16148 0.41023 0.40184 -0.025994
CalDataRegression=1751×3 table
Observed Predicted_Regression Residuals_Regression
__________ ____________________ ____________________
0.99919 0.17519 0.824
0.0020632 0.17343 -0.17137
0.03741 0.7527 -0.71529
0.75518 0.89867 -0.14349
0.00076139 0.042389 -0.041628
0.9998 0.95153 0.048274
0.0056134 0.1338 -0.12819
0.048451 0.043424 0.0050276
0.01448 0.059339 -0.044858
0.95329 0.67009 0.2832
0.97847 0.939 0.03947
0.71895 0.80122 -0.082271
0.79096 0.3791 0.41186
0.042816 0.52542 -0.4826
0.97169 0.2119 0.75979
0.99182 0.62543 0.36639
⋮
modelCalibrationPlot(eadModelRegression, EADData(TestInd,:), 'ModelLevel', ModelLevel, 'YData', YData);
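The calibration measures can be approximated by hand from the observed and predicted columns. The following is a minimal sketch that uses only the first four rows of CalDataRegression, so its results will not match the full-sample values in CalMeasureRegression. It assumes that residuals are observed minus predicted (consistent with the rows shown) and that R-squared is the squared correlation (consistent with 0.40184^2 ≈ 0.16148). The variable names are illustrative.

```matlab
% First four rows of the Observed and Predicted_Regression columns.
observed  = [0.99919; 0.0020632; 0.03741; 0.75518];
predicted = [0.17519; 0.17343;   0.7527;  0.89867];

residuals = observed - predicted;        % matches Residuals_Regression
rmse      = sqrt(mean(residuals.^2));    % root mean squared error
R         = corrcoef(observed, predicted);
corrOP    = R(1,2);                      % correlation of observed and predicted
rsq       = corrOP^2;                    % assumed definition of RSquared
sampleMeanError = mean(residuals);       % mean of observed minus predicted

disp(table(rsq, rmse, corrOP, sampleMeanError))
```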

Validate EAD Tobit Model
For model validation of the Tobit model, use modelDiscrimination, modelDiscriminationPlot, modelCalibration, and modelCalibrationPlot.
Use modelDiscrimination and then modelDiscriminationPlot to plot the ROC curve.
ModelLevel = "ConversionMeasure";
[DiscMeasureTobit,DiscDataTobit] = modelDiscrimination(eadModelTobit,EADData(TestInd,:),'ShowDetails',true,'ModelLevel',ModelLevel)
DiscMeasureTobit=1×3 table
AUROC Segment SegmentCount
_______ __________ ____________
Tobit 0.70909 "all_data" 1751
DiscDataTobit=1534×3 table
X Y T
__________ _________ _______
0 0 0.42178
0 0.0027778 0.42178
0 0.0041667 0.4212
0 0.0055556 0.42076
0.00096993 0.0069444 0.42062
0.00096993 0.0097222 0.42018
0.00096993 0.011111 0.42004
0.00096993 0.018056 0.4196
0.0019399 0.018056 0.4195
0.0029098 0.019444 0.41945
0.0048497 0.019444 0.41901
0.0058196 0.020833 0.41887
0.0058196 0.022222 0.41854
0.0067895 0.022222 0.41842
0.0067895 0.023611 0.41827
0.0067895 0.029167 0.41827
⋮
modelDiscriminationPlot(eadModelTobit,EADData(TestInd, :),'ModelLevel',ModelLevel,'SegmentBy','Marriage');

Use modelCalibration and then modelCalibrationPlot to show a scatter plot of the predictions.
YData = "Observed";
[CalMeasureTobit,CalDataTobit] = modelCalibration(eadModelTobit,EADData(TestInd,:),'ModelLevel',ModelLevel)
CalMeasureTobit=1×4 table
RSquared RMSE Correlation SampleMeanError
________ _______ ___________ _______________
Tobit 0.15929 0.39572 0.39911 0.13366
CalDataTobit=1751×3 table
Observed Predicted_Tobit Residuals_Tobit
__________ _______________ _______________
0.99919 0.21657 0.78261
0.0020632 0.21571 -0.21365
0.03741 0.35115 -0.31374
0.75518 0.39272 0.36245
0.00076139 0.12184 -0.12107
0.9998 0.41744 0.58237
0.0056134 0.19913 -0.19351
0.048451 0.12215 -0.073701
0.01448 0.14323 -0.12875
0.95329 0.33415 0.61914
0.97847 0.41069 0.56778
0.71895 0.3627 0.35624
0.79096 0.27467 0.51629
0.042816 0.30579 -0.26297
0.97169 0.23025 0.74144
0.99182 0.32461 0.66721
⋮
modelCalibrationPlot(eadModelTobit,EADData(TestInd,:),'ModelLevel',ModelLevel,'YData',YData);

Plot Histograms of Observed with Respect to Predicted EAD
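Plot overlaid histograms of the observed and predicted values for the Regression model. Because the calibration data was computed with ModelLevel set to "ConversionMeasure", the values are on the LCF scale.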
figure; histogram(CalDataRegression.Observed); hold on; histogram(CalDataRegression.(('Predicted_' + ModelTypeR))); legend('Observed','Predicted');

Plot overlaid histograms of the observed and predicted values for the Tobit model, again on the LCF scale.
figure; histogram(CalDataTobit.Observed); hold on; histogram(CalDataTobit.(('Predicted_' + ModelTypeT))); legend('Observed','Predicted');

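For both the Tobit and Regression models, the Age and UtilizationRate predictors are statistically significant (p-values below 0.01), while the Marriage predictor is not (p-values of 0.8012 for the Regression model and 0.7389 for the Tobit model). On the test data, the two models have nearly identical AUROC values (0.70898 for Regression and 0.70909 for Tobit) and slightly different R-squared values (0.16148 and 0.15929).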
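To view the test-set measures side by side, you can collect them in one table. The following sketch hard-codes the values reported in the output tables of this example so that it runs on its own. The table and variable names are illustrative.

```matlab
% Values copied from the DiscMeasure and CalMeasure outputs in this example.
AUROC           = [0.70898; 0.70909];
RSquared        = [0.16148; 0.15929];
RMSE            = [0.41023; 0.39572];
SampleMeanError = [-0.025994; 0.13366];

comparison = table(AUROC, RSquared, RMSE, SampleMeanError, ...
    'RowNames', {'Regression'; 'Tobit'});
disp(comparison)
```

The largest difference is in the sample mean error, where the Tobit value of 0.13366 is well above the Regression value of -0.025994.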
See Also
Regression | Tobit | fitEADModel | predict | modelDiscrimination | modelDiscriminationPlot | modelCalibration | modelCalibrationPlot





