Train a regression model and create a lime object that uses a linear simple model. When you create a lime object, if you do not specify a query point and the number of important predictors, then the software generates samples of a synthetic data set but does not fit a simple model. Use the object function fit to fit a simple model for a query point. Then display the coefficients of the fitted linear simple model by using the object function plot.
Load the carbig data set, which contains measurements of cars made in the 1970s and early 1980s.
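The loading step can be sketched as follows. This assumes the carbig sample data set that ships with Statistics and Machine Learning Toolbox is on the MATLAB path.

```matlab
% Load the carbig sample data into the workspace. The file defines
% variables such as Acceleration, Cylinders, Displacement, Horsepower,
% Model_Year, Weight, and MPG.
load carbig
```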
Create a table containing the predictor variables Acceleration, Cylinders, and so on, as well as the response variable MPG.
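A possible way to build the table is shown here. The exact set and order of predictors is an assumption inferred from the six column names that appear in the query point display later in this example.

```matlab
% Assumes the carbig variables are already in the workspace (load carbig).
% Combine six predictors and the response MPG into one table.
tbl = table(Acceleration,Cylinders,Displacement,Horsepower,Model_Year,Weight,MPG);
```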
Removing missing values in a training set can help reduce memory consumption and speed up training for the fitrkernel function. Remove missing values in tbl.
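One way to do this, assuming tbl is the table created in the previous step, is the rmmissing function, which drops every row that contains a missing entry.

```matlab
% Remove all rows of tbl that contain at least one missing value.
tbl = rmmissing(tbl);
```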
Create a table of predictor variables by removing the response variable from tbl.
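A minimal sketch of this step, assuming tbl still contains the MPG column, uses removevars. The name tblX matches the variable that the text refers to later.

```matlab
% Keep only the predictor columns by dropping the response variable MPG.
tblX = removevars(tbl,'MPG');
```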
Train a blackbox model of MPG by using the fitrkernel function.
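The training call might look like the following. The categorical predictor indices [2 5] (Cylinders and Model_Year) are taken from the CategoricalPredictors value in the lime object display in this example. Seeding the random number generator is an added assumption for reproducibility, because fitrkernel uses random feature expansion.

```matlab
% Assumes tblX (predictors) and tbl.MPG (response) exist in the workspace.
rng('default') % Assumed seed, for reproducibility only

% Train a Gaussian kernel regression model; treat columns 2 and 5
% (Cylinders and Model_Year) as categorical predictors.
mdl = fitrkernel(tblX,tbl.MPG,'CategoricalPredictors',[2 5]);
```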
Create a lime object. Specify a predictor data set because mdl does not contain predictor data.
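A sketch of the constructor call, assuming mdl and tblX from the previous steps. Omitting the semicolon displays the object properties in the Command Window.

```matlab
% Create a lime object from the blackbox model and the predictor data.
% No query point is given, so only the synthetic data set is generated.
results = lime(mdl,tblX)
```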
results = 
  lime with properties:

             BlackboxModel: [1x1 RegressionKernel]
              DataLocality: 'global'
     CategoricalPredictors: [2 5]
                      Type: 'regression'
                         X: [392x6 table]
                QueryPoint: []
    NumImportantPredictors: []
          NumSyntheticData: 5000
             SyntheticData: [5000x6 table]
                    Fitted: [5000x1 double]
               SimpleModel: []
       ImportantPredictors: []
            BlackboxFitted: []
         SimpleModelFitted: []
results contains the generated synthetic data set. The SimpleModel property is empty ([]).
Fit a linear simple model for the first observation in tblX. Specify the number of important predictors to find as 3.
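These two steps can be sketched as follows, assuming results and tblX from the earlier steps. The first statement omits the semicolon so that the query point is displayed.

```matlab
% Use the first observation as the query point (displayed on purpose).
queryPoint = tblX(1,:)

% Fit a linear simple model around the query point, keeping the
% 3 most important predictors.
results = fit(results,queryPoint,3);
```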
queryPoint=1×6 table
    Acceleration    Cylinders    Displacement    Horsepower    Model_Year    Weight
    ____________    _________    ____________    __________    __________    ______

         12             8            307            130            70         3504
Plot the lime object results by using the object function plot. To display an existing underscore in any predictor name, change the TickLabelInterpreter value of the axes to 'none'.
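A possible implementation, assuming results already contains a fitted simple model. The plot function of a lime object returns a figure handle, whose current axes hold the bar graph.

```matlab
% Plot the LIME results and keep the figure handle.
f = plot(results);

% Show underscores literally (for example, Model_Year) instead of
% interpreting them as TeX subscripts.
f.CurrentAxes.TickLabelInterpreter = 'none';
```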
The plot displays two predictions for the query point, which correspond to the BlackboxFitted property and the SimpleModelFitted property of results.
The horizontal bar graph shows the coefficient values of the simple model, sorted by their absolute values. LIME finds Horsepower, Model_Year, and Cylinders as important predictors for the query point.
Model_Year and Cylinders are categorical predictors that have multiple categories. For a linear simple model, the software creates one fewer dummy variable than the number of categories for each categorical predictor. The bar graph displays only the most important dummy variable. You can check the coefficients of the other dummy variables by using the SimpleModel property of results. Display the sorted coefficient values, including all categorical dummy variables.
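One way to produce such a listing is sketched below. It assumes that results.SimpleModel is a linear regression model that exposes the Beta and ExpandedPredictorNames properties. The column labels are chosen here for illustration.

```matlab
% Sort the simple model coefficients by absolute value, largest first.
[~,I] = sort(abs(results.SimpleModel.Beta),'descend');

% List each expanded predictor name (including dummy variables)
% next to its coefficient.
table(results.SimpleModel.ExpandedPredictorNames(I)',results.SimpleModel.Beta(I), ...
    'VariableNames',{'Expanded Predictor Name','Coefficient'})
```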
ans=17×2 table
     Expanded Predictor Name      Coefficient
    __________________________    ___________

    {'Horsepower'            }    -3.4485e-05
    {'Model_Year (74 vs. 70)'}    -6.1279e-07
    {'Model_Year (80 vs. 70)'}     -4.015e-07
    {'Model_Year (81 vs. 70)'}     3.4176e-07
    {'Model_Year (82 vs. 70)'}    -2.2483e-07
    {'Cylinders (6 vs. 8)'   }    -1.9024e-07
    {'Model_Year (76 vs. 70)'}     1.8136e-07
    {'Cylinders (5 vs. 8)'   }     1.7461e-07
    {'Model_Year (71 vs. 70)'}      1.558e-07
    {'Model_Year (75 vs. 70)'}     1.5456e-07
    {'Model_Year (77 vs. 70)'}      1.521e-07
    {'Model_Year (78 vs. 70)'}     1.4272e-07
    {'Model_Year (72 vs. 70)'}     6.7001e-08
    {'Model_Year (73 vs. 70)'}     4.7214e-08
    {'Cylinders (4 vs. 8)'   }     4.5118e-08
    {'Model_Year (79 vs. 70)'}    -2.2598e-08
      ⋮