
kfoldfun

Cross-validate function for quantile regression

Since R2025a

    Description

    vals = kfoldfun(CVMdl,fun) cross-validates the function fun by applying it to the data stored in the cross-validated model CVMdl. You must pass fun as a function handle.


    Examples


    Create a cross-validated quantile regression model. Compute the cross-validation quantile loss. Then, compute the quantile loss using training set quantiles instead of predictions.

    Simulate 1000 observations from the model y=1+0.05x+sin(x)/x+ϵ where:

    • x is a 1000-by-1 vector of evenly spaced values between –10 and 10.

    • ϵ is a 1000-by-1 vector of random normal errors with mean 0 and standard deviation 0.2.

    rng("default"); % For reproducibility
    n = 1000;
    x = linspace(-10,10,n)';
    y = 1 + 0.05*x + sin(x)./x + 0.2*randn(n,1);

    Create a 5-fold cross-validated quantile neural network regression model. Use the 0.05, 0.5, and 0.95 quantiles.

    CVMdl = fitrqnet(x,y,Quantiles=[0.05 0.5 0.95],KFold=5)
    CVMdl = 
      RegressionPartitionedQuantileModel
        CrossValidatedModel: 'QuantileNeuralNetwork'
             PredictorNames: {'x1'}
               ResponseName: 'Y'
            NumObservations: 1000
                      KFold: 5
                  Partition: [1×1 cvpartition]
          ResponseTransform: 'none'
                  Quantiles: [0.0500 0.5000 0.9500]
    
    
    
    

    CVMdl is a RegressionPartitionedQuantileModel object that contains five trained CompactRegressionQuantileNeuralNetwork model objects (CVMdl.Trained).

    Compute the cross-validation quantile loss.

    L = kfoldLoss(CVMdl)
    L = 1×3
    
        0.0230    0.0876    0.0229
    
    

    Each value in L corresponds to one quantile. For example, the first value L(1) is the quantile loss for the 0.05 quantile, averaged across the five folds.

    Find the quantile loss when you use training set quantiles instead of test set predictions to compute residuals.

    First, create the customQuantileLoss function. The function takes in a compact quantile regression model, training data, and test data, and returns the custom quantile loss. The residuals are defined as the difference between the test set responses and the training set quantiles, instead of the difference between the test set responses and the predicted test set responses.

    function loss = customQuantileLoss(CMP,Xtrain,Ytrain,Wtrain, ...
        Xtest,Ytest,Wtest)
    
        residuals = Ytest - quantile(Ytrain,CMP.Quantiles);
        loss = residuals.*(CMP.Quantiles - (residuals<0));
        loss = sum(Wtest.*loss)/sum(Wtest);
    
    end
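
    The per-observation term in customQuantileLoss is the quantile (pinball) loss: a residual r contributes r*τ when r is nonnegative and r*(τ - 1) when r is negative, computed separately for each quantile τ. As a rough illustration, the sketch below runs the same arithmetic on two test responses. The values of Ytest, Wtest, and q are made up for this illustration and do not come from the example data.

```matlab
% Hypothetical inputs: two test responses with equal weights, and three
% made-up "training set quantiles" q for tau = 0.05, 0.5, and 0.95.
Ytest = [0.8; 1.1];
Wtest = [1; 1];
tau = [0.05 0.5 0.95];
q = [0.5 1.0 1.5];

% Implicit expansion gives a 2-by-3 residual matrix (one column per quantile).
residuals = Ytest - q;

% Nonnegative residuals are scaled by tau, negative ones by (tau - 1).
loss = residuals.*(tau - (residuals<0));

% Weighted average over observations: one loss value per quantile (1-by-3).
loss = sum(Wtest.*loss)/sum(Wtest)
```

    With these made-up inputs, the result is approximately [0.0225 0.0750 0.0275]. The output is a row vector with one element per quantile, which is the same shape that customQuantileLoss returns for each fold.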

    To replicate the quantile loss used to compute L, you can use the following residual definition instead.

    residuals = Ytest - predict(CMP,Xtest,Quantiles=CMP.Quantiles);
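
    For reference, a complete function that uses this residual definition might look like the following sketch. The function name replicatedQuantileLoss is hypothetical; the body reuses the loss computation from customQuantileLoss and replaces the unused training inputs with ~ placeholders.

```matlab
function loss = replicatedQuantileLoss(CMP,~,~,~,Xtest,Ytest,Wtest)
    % Residuals relative to the fold model's predicted quantiles
    % (n-by-q matrix, one column per quantile in CMP.Quantiles).
    residuals = Ytest - predict(CMP,Xtest,Quantiles=CMP.Quantiles);

    % Quantile loss for each observation and quantile.
    loss = residuals.*(CMP.Quantiles - (residuals<0));

    % Weighted average over the test observations (1-by-q row vector).
    loss = sum(Wtest.*loss)/sum(Wtest);
end
```

    Assuming CVMdl from this example is in the workspace, mean(kfoldfun(CVMdl,@replicatedQuantileLoss)) should return values close to L.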
    

    After creating the customQuantileLoss function, pass the function to kfoldfun, along with the cross-validated model CVMdl. Average the results over the five folds.

    customL = mean(kfoldfun(CVMdl,@customQuantileLoss))
    customL = 1×3
    
        0.0436    0.2131    0.0484
    
    

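    Each customL loss value is greater than the corresponding L loss value. This result is expected, because the training set quantiles are constants that ignore the predictor x, whereas the model predictions vary with x.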

    Input Arguments


    CVMdl: Cross-validated quantile regression model, specified as a RegressionPartitionedQuantileModel object.

    fun: Cross-validated function, specified as a function handle. fun has the syntax:

    testvals = fun(CMP,Xtrain,Ytrain,Wtrain,Xtest,Ytest,Wtest)

    • CMP is a compact model stored in one element of the CVMdl.Trained property.

    • Xtrain is the training matrix of predictor values.

    • Ytrain is the training array of response values.

    • Wtrain contains the training weights for the observations.

    • Xtest and Ytest are the test data, with associated weights Wtest.

    • The returned value testvals must have the same size across all folds.
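
    As an illustration of this signature, the following sketch defines a function that returns a scalar for each fold: the weighted fraction of test responses that fall between the smallest and largest predicted quantiles. The function name intervalCoverage is hypothetical, and the sketch assumes that predict accepts the Quantiles name-value argument, as in the example on this page.

```matlab
function coverage = intervalCoverage(CMP,~,~,~,Xtest,Ytest,Wtest)
    % Predicted quantiles for the test fold (n-by-q matrix).
    preds = predict(CMP,Xtest,Quantiles=CMP.Quantiles);

    % Use the row-wise extremes so that the interval is well defined
    % even if the predicted quantiles cross.
    lowerBound = min(preds,[],2);
    upperBound = max(preds,[],2);

    % Weighted fraction of test responses inside the interval (scalar).
    inside = (Ytest >= lowerBound) & (Ytest <= upperBound);
    coverage = sum(Wtest.*inside)/sum(Wtest);
end
```

    Because the function returns a scalar for every fold, kfoldfun(CVMdl,@intervalCoverage) returns one coverage value per fold. For a model trained with the 0.05 and 0.95 quantiles, you might compare these values to the nominal level of 0.90.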

    Data Types: function_handle

    Output Arguments


    vals: Cross-validation results, returned as a numeric matrix. vals contains the arrays of testvals output returned by fun, concatenated vertically over all folds. For example, if the testvals output from every fold is a numeric vector of length q, then kfoldfun returns a CVMdl.KFold-by-q numeric matrix with one row per fold.
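
    To see this layout, consider the following minimal sketch. It uses a smaller, simplified data set than the example on this page, and a hypothetical anonymous function foldSizes that returns a 1-by-2 row (the training and test set sizes) for each fold.

```matlab
rng("default"); % For reproducibility
n = 200;
x = linspace(-10,10,n)';
y = 1 + 0.05*x + 0.2*randn(n,1); % Simplified response, for illustration only

% 5-fold cross-validated model (default quantile)
CVMdl = fitrqnet(x,y,KFold=5);

% Each call returns a 1-by-2 row: [training set size, test set size].
foldSizes = @(CMP,Xtrain,Ytrain,Wtrain,Xtest,Ytest,Wtest) ...
    [numel(Ytrain) numel(Ytest)];

vals = kfoldfun(CVMdl,foldSizes);
size(vals)   % CVMdl.KFold rows, one column per element of testvals
mean(vals,1) % Average each column over the folds
```

    Averaging along the first dimension, as in mean(vals,1), summarizes each element of testvals across folds. The example on this page does the same when it computes customL.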

    Data Types: double

    Version History

    Introduced in R2025a