kfoldLoss
Description
L = kfoldLoss(CVMdl) returns the loss (quantile loss) obtained by the cross-validated quantile regression model CVMdl. For every fold, kfoldLoss computes the loss for validation-fold observations using a model trained on training-fold observations. CVMdl.X and CVMdl.Y contain both sets of observations.
L = kfoldLoss(CVMdl,Name=Value) specifies additional options using one or more name-value arguments. For example, you can specify the quantiles for which to return loss values.
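For reference, the quantile loss (also called pinball loss) is commonly written as shown here. This is the standard textbook form, given as an assumption rather than as the exact expression the software evaluates. The symbols are chosen for this sketch: τ is the quantile, y_j and ŷ_j are the observed and predicted responses, and w_j are observation weights assumed to be normalized to sum to 1. See the model-specific loss reference pages for the precise weighting.

```latex
\rho_\tau(u) = \max\bigl(\tau u,\ (\tau - 1)\,u\bigr),
\qquad
L_\tau = \sum_{j=1}^{n} w_j \,\rho_\tau\bigl(y_j - \hat{y}_j\bigr),
\qquad
\sum_{j=1}^{n} w_j = 1
```

With τ = 0.5, the loss reduces to half the weighted mean absolute error, so under-predictions and over-predictions are penalized equally.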
Examples
Compute the quantile loss for a quantile neural network regression model, first partitioned using holdout validation and then partitioned using 5-fold cross-validation. Compare the two losses.
Load the carbig data set, which contains measurements of cars made in the 1970s and early 1980s. Create a table containing the predictor variables Acceleration, Cylinders, Displacement, and so on, as well as the response variable MPG. View the first eight observations.
load carbig
cars = table(Acceleration,Cylinders,Displacement, ...
    Horsepower,Model_Year,Origin,Weight,MPG);
head(cars)
    Acceleration    Cylinders    Displacement    Horsepower    Model_Year    Origin    Weight    MPG
    ____________    _________    ____________    __________    __________    ______    ______    ___
        12              8            307            130            70         USA       3504     18
        11.5            8            350            165            70         USA       3693     15
        11              8            318            150            70         USA       3436     18
        12              8            304            150            70         USA       3433     16
        10.5            8            302            140            70         USA       3449     17
        10              8            429            198            70         USA       4341     15
         9              8            454            220            70         USA       4354     14
         8.5            8            440            215            70         USA       4312     14
Remove rows of cars where the table has missing values.
cars = rmmissing(cars);
Categorize the cars based on whether they were made in the USA.
cars.Origin = categorical(cellstr(cars.Origin));
cars.Origin = mergecats(cars.Origin,["France","Japan", ...
    "Germany","Sweden","Italy","England"],"NotUSA");
Partition the data using cvpartition. First, create a partition for holdout validation, using approximately 80% of the observations for the training data and 20% for the test data. Then, create a partition for 5-fold cross-validation.
rng(0,"twister") % For reproducibility
holdoutPartition = cvpartition(height(cars),Holdout=0.20);
kfoldPartition = cvpartition(height(cars),KFold=5);
Train a quantile neural network regression model using the cars data. Specify MPG as the response variable, and standardize the numeric predictors. Use the default 0.5 quantile (median).
Mdl = fitrqnet(cars,"MPG",Standardize=true);
Create the partitioned quantile regression models using crossval.
holdoutMdl = crossval(Mdl,CVPartition=holdoutPartition)
holdoutMdl = 
  RegressionPartitionedQuantileModel
      CrossValidatedModel: 'QuantileNeuralNetwork'
           PredictorNames: {'Acceleration'  'Cylinders'  'Displacement'  'Horsepower'  'Model_Year'  'Origin'  'Weight'}
    CategoricalPredictors: 6
             ResponseName: 'MPG'
          NumObservations: 392
                    KFold: 1
                Partition: [1×1 cvpartition]
        ResponseTransform: 'none'
                Quantiles: 0.5000
kfoldMdl = crossval(Mdl,CVPartition=kfoldPartition)
kfoldMdl = 
  RegressionPartitionedQuantileModel
      CrossValidatedModel: 'QuantileNeuralNetwork'
           PredictorNames: {'Acceleration'  'Cylinders'  'Displacement'  'Horsepower'  'Model_Year'  'Origin'  'Weight'}
    CategoricalPredictors: 6
             ResponseName: 'MPG'
          NumObservations: 392
                    KFold: 5
                Partition: [1×1 cvpartition]
        ResponseTransform: 'none'
                Quantiles: 0.5000
Compute the quantile loss for holdoutMdl and kfoldMdl by using the kfoldLoss object function.
holdoutL = kfoldLoss(holdoutMdl)
holdoutL = 0.9488
kfoldL = kfoldLoss(kfoldMdl)
kfoldL = 0.9628
holdoutL is the quantile loss computed using one holdout set, while kfoldL is an average quantile loss computed using five holdout sets. Because a k-fold estimate averages over several validation sets instead of relying on a single split, cross-validation metrics tend to be better indicators of a model's performance on unseen data.
Before computing the loss for a cross-validated quantile regression model, specify the prediction for observations with missing predictor values.
Load the carbig data set, which contains measurements of cars made in the 1970s and early 1980s. Create a matrix X containing the predictor variables Acceleration, Displacement, Horsepower, and Weight. Store the response variable MPG in the variable Y.
load carbig
X = [Acceleration,Displacement,Horsepower,Weight];
Y = MPG;
Train a cross-validated quantile linear regression model. Specify to use the 0.25, 0.50, and 0.75 quantiles (that is, the lower quartile, median, and upper quartile). To improve the model fit, change the beta tolerance to 1e-6 instead of the default value 1e-4, and use a ridge (L2) regularization term of 1. Specify 10-fold cross-validation by setting CrossVal="on".
rng(0,"twister") % For reproducibility
CVMdl = fitrqlinear(X,Y,Quantiles=[0.25,0.50,0.75], ...
    BetaTolerance=1e-6,Lambda=1,CrossVal="on")
CVMdl = 
  RegressionPartitionedQuantileModel
    CrossValidatedModel: 'QuantileLinear'
         PredictorNames: {'x1'  'x2'  'x3'  'x4'}
           ResponseName: 'Y'
        NumObservations: 398
                  KFold: 10
              Partition: [1×1 cvpartition]
      ResponseTransform: 'none'
              Quantiles: [0.2500 0.5000 0.7500]
CVMdl is a RegressionPartitionedQuantileModel object.
Compute the quantile loss for each fold and quantile. Use a NaN prediction for test set observations with missing predictor values.
L = kfoldLoss(CVMdl,Mode="individual",PredictionForMissingValue=NaN)
L = 10×3
1.5388 1.6703 1.3547
NaN NaN NaN
1.9140 2.1864 2.0922
NaN NaN NaN
1.4339 2.2040 1.7293
1.5513 1.9968 1.8037
NaN NaN NaN
1.3979 2.0011 2.0695
NaN NaN NaN
1.8021 2.2161 1.5746
The rows of L correspond to folds, and the columns correspond to quantiles. The NaN values in L indicate that the data set includes observations with missing predictor values. For example, at least one of the observations in the second test set has a missing predictor value. You can find the predictor values for the observations in the second test set by using the following code.
test2Indices = test(CVMdl.Partition,2);
test2Observations = CVMdl.X(test2Indices,:)
Instead of using a NaN prediction for test set observations with missing predictor values, remove the observations from the computation.
newL = kfoldLoss(CVMdl,Mode="individual", ...
    PredictionForMissingValue="omitted")
newL = 10×3
1.5388 1.6703 1.3547
1.6612 2.1528 1.4820
1.9140 2.1864 2.0922
2.1431 2.6693 2.0767
1.4339 2.2040 1.7293
1.5513 1.9968 1.8037
1.2971 1.8850 1.8236
1.3979 2.0011 2.0695
1.6716 2.0485 1.5921
1.8021 2.2161 1.5746
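To check which folds are affected, you can count the validation-fold observations with missing predictor values. The following is a minimal sketch that retrains the model from this example. The variable names hasMissing and nMissingPerFold are introduced here for illustration only.

```matlab
load carbig
X = [Acceleration,Displacement,Horsepower,Weight];
Y = MPG;
rng(0,"twister") % For reproducibility
CVMdl = fitrqlinear(X,Y,Quantiles=[0.25,0.50,0.75], ...
    BetaTolerance=1e-6,Lambda=1,CrossVal="on");

% Flag observations that have at least one missing predictor value
hasMissing = any(isnan(CVMdl.X),2);

% Count the flagged observations in each validation fold
nMissingPerFold = zeros(CVMdl.KFold,1);
for k = 1:CVMdl.KFold
    testIdx = test(CVMdl.Partition,k);
    nMissingPerFold(k) = sum(hasMissing(testIdx));
end

% Per-fold loss with NaN predictions for the flagged observations
L = kfoldLoss(CVMdl,Mode="individual",PredictionForMissingValue=NaN);

% Folds with a nonzero count should be the folds with NaN rows in L
foldsWithMissing = find(nMissingPerFold > 0)
foldsWithNaNLoss = find(any(isnan(L),2))
```

Under the behavior described in this example, the two index vectors list the same folds.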
Input Arguments
CVMdl: Cross-validated quantile regression model, specified as a RegressionPartitionedQuantileModel object.
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Example: kfoldLoss(CVMdl,Quantiles=[0.25 0.5 0.75]) specifies to return the quantile loss for the 0.25, 0.5, and 0.75 quantiles.
Quantiles: Quantiles for which to compute the loss, specified as a vector of values in CVMdl.Quantiles. The software returns loss values only for the quantiles specified in Quantiles.
Example: Quantiles=[0.4 0.6]
Data Types: single | double | char | string
Folds: Fold indices to use, specified as a positive integer vector. The elements of Folds must be within the range from 1 to CVMdl.KFold. The software uses only the folds specified in Folds.
Example: Folds=[1 4 10]
Data Types: single | double
LossFun: Loss function, specified as "quantile" or a function handle.
- "quantile": Quantile loss.
- Function handle: To specify a custom loss function, use a function handle. The function must have this form:
lossval = lossfun(Y,YFit,W,q)
  - The output argument lossval is a numeric scalar.
  - You specify the function name (lossfun).
  - Y is a length-n numeric vector of observed responses.
  - YFit is a length-n numeric vector of corresponding predicted responses.
  - W is an n-by-1 numeric vector of observation weights.
  - q is a numeric scalar in the range [0,1] corresponding to a quantile.
Example: LossFun=@lossfun
Data Types: char | string | function_handle
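As an illustration of the lossfun signature, the following sketch defines a weighted pinball (quantile) loss as an anonymous function and evaluates it on a few hand-made values. The name pinballLoss, the sample data, and the normalization by sum(W) are assumptions made for this example, not the software's built-in implementation.

```matlab
% Hypothetical custom loss with the form lossval = lossfun(Y,YFit,W,q).
% Under-predictions (Y > YFit) are scaled by q, over-predictions by (1-q).
% Dividing by sum(W) is a choice made for this sketch.
pinballLoss = @(Y,YFit,W,q) ...
    sum(W .* max(q.*(Y - YFit),(q - 1).*(Y - YFit))) ./ sum(W);

% Illustrative data (not taken from carbig)
Y = [1; 2; 3; 4];      % observed responses
YFit = [2; 2; 2; 2];   % predicted responses
W = ones(4,1);         % equal observation weights

lossMedian = pinballLoss(Y,YFit,W,0.5)   % 0.5 quantile
lossUpper = pinballLoss(Y,YFit,W,0.9)    % 0.9 quantile
```

To use a handle like this with a cross-validated model, pass it as the LossFun value, for example kfoldLoss(CVMdl,LossFun=pinballLoss).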
Mode: Aggregation level for the output, specified as "average" or "individual".
Value | Description |
---|---|
"average" | The output is a 1-by-q vector of loss values, averaged over the folds specified by the Folds name-value argument. q is the number of quantiles specified by the Quantiles name-value argument. |
"individual" | The output is a k-by-q matrix of loss values, where k is the number of folds specified by the Folds name-value argument and q is the number of quantiles specified by the Quantiles name-value argument. |
Example: Mode="individual"
Data Types: char | string
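The following sketch shows how Folds, Quantiles, and Mode together determine the size of the output. It reuses the carbig predictors from the examples on this page. The variable names Lind and Lavg are introduced here for illustration.

```matlab
load carbig
X = [Acceleration,Displacement,Horsepower,Weight];
Y = MPG;
rng(0,"twister") % For reproducibility
CVMdl = fitrqlinear(X,Y,Quantiles=[0.25,0.50,0.75],CrossVal="on");

% Per-fold loss for folds 1, 3, and 5 and for two of the three quantiles
Lind = kfoldLoss(CVMdl,Folds=[1 3 5],Quantiles=[0.25 0.75], ...
    Mode="individual");

% Loss averaged over the same folds and quantiles
Lavg = kfoldLoss(CVMdl,Folds=[1 3 5],Quantiles=[0.25 0.75], ...
    Mode="average");

size(Lind) % k-by-q, with k = 3 folds and q = 2 quantiles
size(Lavg) % 1-by-q, with q = 2 quantiles
```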
PredictionForMissingValue: Predicted response value to use for observations with missing predictor values, specified as "quantile", "omitted", a numeric scalar, or a numeric vector.
Value | Description |
---|---|
"quantile" | kfoldLoss uses the specified quantile of the observed response values in the training-fold data as the predicted response value for observations with missing predictor values. |
"omitted" | kfoldLoss excludes observations with missing predictor values from the loss computation. |
Numeric scalar or vector | kfoldLoss uses this value as the predicted response value for observations with missing predictor values. |
If an observation is missing an observed response value or an observation weight, then kfoldLoss does not use the observation in the loss computation.
Example: PredictionForMissingValue="omitted"
Data Types: single | double | char | string
Output Arguments
L: Loss, returned as a numeric row vector or numeric matrix. The loss is the LossFun loss between the validation-fold observations and the predictions made with a quantile regression model trained on the training-fold observations.
- If Mode is "average", then L is the average loss over the folds. That is, L is a 1-by-q vector of loss values, averaged over the folds specified by the Folds name-value argument. q is the number of quantiles specified by the Quantiles name-value argument.
- If Mode is "individual", then L is a k-by-q matrix of loss values, where k is the number of folds specified by the Folds name-value argument and q is the number of quantiles specified by the Quantiles name-value argument.
Algorithms
kfoldLoss computes losses according to the loss object function of the trained compact models in CVMdl (CVMdl.Trained). For more information, see the model-specific loss function reference pages in the following table.
Model Type | loss Function |
---|---|
Quantile linear regression model | loss |
Quantile neural network model for regression | loss |
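As a rough illustration of this relationship, the following sketch evaluates the loss function of one trained compact model on its own validation fold and displays it next to the kfoldLoss result for that fold. It assumes that CVMdl.Trained is a cell array indexed by fold. The two results are expected to be close, but exact agreement depends on how each function handles observation weights and missing values, so the sketch only displays them side by side.

```matlab
load carbig
X = [Acceleration,Displacement,Horsepower,Weight];
Y = MPG;
rng(0,"twister") % For reproducibility
CVMdl = fitrqlinear(X,Y,Quantiles=[0.25,0.50,0.75],CrossVal="on");

k = 1;                               % fold to inspect
testIdx = test(CVMdl.Partition,k);   % validation-fold observations
foldMdl = CVMdl.Trained{k};          % assumed: compact model for fold k

% Loss of the fold-k model on its validation observations
manualL = loss(foldMdl,CVMdl.X(testIdx,:),CVMdl.Y(testIdx));

% Loss reported by kfoldLoss for the same fold
foldL = kfoldLoss(CVMdl,Folds=k,Mode="individual");

disp(manualL)
disp(foldL)
```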
Version History
Introduced in R2025a