loss

Regression loss for Gaussian kernel regression model

Description

L = loss(Mdl,X,Y) returns the mean squared error (MSE) for the Gaussian kernel regression model Mdl using the predictor data in X and the corresponding responses in Y.

L = loss(Mdl,Tbl,ResponseVarName) returns the MSE for the model Mdl using the predictor data in Tbl and the true responses in Tbl.ResponseVarName.

L = loss(Mdl,Tbl,Y) returns the MSE for the model Mdl using the predictor data in table Tbl and the true responses in Y.

L = loss(___,Name,Value) specifies options using one or more name-value pair arguments in addition to any of the input argument combinations in previous syntaxes. For example, you can specify a regression loss function and observation weights. Then, loss returns the weighted regression loss using the specified loss function.
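As a minimal sketch of the syntaxes above, the following code trains a model on synthetic data and calls loss with and without name-value arguments. The data, the weight vector w, and the variable names Lmse and Lw are assumptions made for illustration only; the code requires Statistics and Machine Learning Toolbox.

```matlab
rng(1)                                   % For reproducibility
n = 200;
X = randn(n,3);                          % Synthetic predictors (illustrative assumption)
Y = X(:,1) - 2*X(:,2) + 0.1*randn(n,1);  % Synthetic responses (illustrative assumption)
w = rand(n,1);                           % Arbitrary positive observation weights

Mdl = fitrkernel(X,Y);                   % Default Gaussian kernel regression model

Lmse = loss(Mdl,X,Y);                    % Default loss: unweighted MSE
Lw = loss(Mdl,X,Y,'LossFun','epsiloninsensitive','Weights',w); % Weighted epsilon-insensitive loss
```

Both calls return a scalar. The second call changes the loss function and the observation weights at the same time.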

Examples

Train a Gaussian kernel regression model for a tall array, then calculate the resubstitution mean squared error and epsilon-insensitive error.

When you perform calculations on tall arrays, MATLAB® uses either a parallel pool (default if you have Parallel Computing Toolbox™) or the local MATLAB session. To run the example using the local MATLAB session when you have Parallel Computing Toolbox, change the global execution environment by using the mapreducer function.

mapreducer(0)

Create a datastore that references the folder location with the data. The data can be contained in a single file, a collection of files, or an entire folder. Treat 'NA' values as missing data so that datastore replaces them with NaN values. Select a subset of the variables to use. Create a tall table on top of the datastore.

varnames = {'ArrTime','DepTime','ActualElapsedTime'};
ds = datastore('airlinesmall.csv','TreatAsMissing','NA',...
'SelectedVariableNames',varnames);
t = tall(ds);

Specify DepTime and ArrTime as the predictor variables (X) and ActualElapsedTime as the response variable (Y). Select the observations for which ArrTime is later than DepTime.

daytime = t.ArrTime>t.DepTime;
Y = t.ActualElapsedTime(daytime);     % Response data
X = t{daytime,{'DepTime' 'ArrTime'}}; % Predictor data

Standardize the predictor variables.

Z = zscore(X); % Standardize the data

Train a default Gaussian kernel regression model with the standardized predictors. Set 'Verbose',0 to suppress diagnostic messages.

[Mdl,FitInfo] = fitrkernel(Z,Y,'Verbose',0)
Mdl =
RegressionKernel
PredictorNames: {'x1'  'x2'}
ResponseName: 'Y'
Learner: 'svm'
NumExpansionDimensions: 64
KernelScale: 1
Lambda: 8.5385e-06
BoxConstraint: 1
Epsilon: 5.9303

Properties, Methods

FitInfo = struct with fields:
Solver: 'LBFGS-tall'
LossFunction: 'epsiloninsensitive'
Lambda: 8.5385e-06
BetaTolerance: 1.0000e-03
ObjectiveValue: 30.7814
RelativeChangeInBeta: 0.0228
FitTime: 58.3135
History: []

Mdl is a trained RegressionKernel model, and the structure array FitInfo contains optimization details.

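Assess how well the trained model fits the training data by estimating the resubstitution mean squared error and epsilon-insensitive error. Because both values are computed on the data used for training, they tend to be optimistic estimates of the error on new predictor values.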

lossMSE = loss(Mdl,Z,Y) % Resubstitution mean squared error
lossMSE =

MxNx... tall array

?    ?    ?    ...
?    ?    ?    ...
?    ?    ?    ...
:    :    :
:    :    :
lossEI = loss(Mdl,Z,Y,'LossFun','epsiloninsensitive') % Resubstitution epsilon-insensitive error
lossEI =

MxNx... tall array

?    ?    ?    ...
?    ?    ?    ...
?    ?    ?    ...
:    :    :
:    :    :

Evaluate the tall arrays and bring the results into memory by using gather.

[lossMSE,lossEI] = gather(lossMSE,lossEI)
Evaluating tall expression using the Local MATLAB Session:
- Pass 1 of 1: Completed in 1.5 sec
Evaluation completed in 1.9 sec
lossMSE = 2.8851e+03
lossEI = 28.0050

Specify a custom regression loss (Huber loss) for a Gaussian kernel regression model.

Load the carbig data set, then specify the predictor variables (X) and the response variable (Y).

load carbig
X = [Weight,Cylinders,Horsepower,Model_Year];
Y = MPG;

Delete rows of X and Y where either array has NaN values. Removing rows with NaN values before passing data to fitrkernel can speed up training and reduce memory usage.

R = rmmissing([X Y]);
X = R(:,1:4);
Y = R(:,end);

Reserve 10% of the observations as a holdout sample. Extract the training and test indices from the partition definition.

rng(10)  % For reproducibility
N = length(Y);
cvp = cvpartition(N,'Holdout',0.1);
idxTrn = training(cvp); % Training set indices
idxTest = test(cvp);    % Test set indices

Standardize the training data and train the regression kernel model.

Xtrain = X(idxTrn,:);
Ytrain = Y(idxTrn);
[Ztrain,tr_mu,tr_sigma] = zscore(Xtrain); % Standardize the training data
tr_sigma(tr_sigma==0) = 1;
Mdl = fitrkernel(Ztrain,Ytrain)
Mdl =
RegressionKernel
ResponseName: 'Y'
Learner: 'svm'
NumExpansionDimensions: 128
KernelScale: 1
Lambda: 0.0028
BoxConstraint: 1
Epsilon: 0.8617

Properties, Methods

Mdl is a RegressionKernel model.

Create an anonymous function that measures Huber loss $\left(\delta =1\right)$, that is,

$L=\frac{1}{\sum {w}_{j}}\sum _{j=1}^{n}{w}_{j}{\ell }_{j},$

where

$\ell_j = \begin{cases} 0.5\,\hat{e}_j^{2}, & |\hat{e}_j| \le 1 \\ |\hat{e}_j| - 0.5, & |\hat{e}_j| > 1. \end{cases}$

$\hat{e}_j$ is the residual for observation j. Custom loss functions must be written in a particular form. For rules on writing a custom loss function, see the 'LossFun' name-value pair argument.

% Subtract 0.5 only in the linear branch, where |Y-Yhat| > 1
huberloss = @(Y,Yhat,W)sum(W.*((0.5*(abs(Y-Yhat)<=1).*(Y-Yhat).^2) + ...
((abs(Y-Yhat)>1).*(abs(Y-Yhat)-0.5))))/sum(W);
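Before passing a custom loss to loss, it can help to evaluate the handle on a few values that are easy to verify by hand. The sketch below does this for the piecewise Huber definition with $\delta = 1$. The names huberCheck, Ychk, YhatChk, Wchk, and Lchk and the data are hypothetical, chosen only for this check.

```matlab
% Hypothetical check data: residuals of -0.5, -2, and 3 with equal weights
Ychk = [0; 0; 0];
YhatChk = [0.5; 2; -3];
Wchk = ones(3,1);

% Huber loss (delta = 1): quadratic branch for |e| <= 1, linear branch otherwise
huberCheck = @(Y,Yhat,W) sum(W.*( ...
    0.5*(abs(Y-Yhat)<=1).*(Y-Yhat).^2 + ...
    (abs(Y-Yhat)>1).*(abs(Y-Yhat)-0.5)))/sum(W);

% Per-observation losses are 0.125, 1.5, and 2.5, so Lchk = 4.125/3 = 1.375
Lchk = huberCheck(Ychk,YhatChk,Wchk)
```

The first residual falls in the quadratic branch and the other two fall in the linear branch, so the check exercises both cases of the definition.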

Estimate the training set regression loss using the Huber loss function.

eTrain = loss(Mdl,Ztrain,Ytrain,'LossFun',huberloss)
eTrain = 1.7210

Standardize the test data using the mean and standard deviation of the training data columns, so that the test predictors are on the scale the model was trained on. Estimate the test set regression loss using the Huber loss function.

Xtest = X(idxTest,:);
Ztest = (Xtest-tr_mu)./tr_sigma; % Standardize the test data
Ytest = Y(idxTest);

eTest = loss(Mdl,Ztest,Ytest,'LossFun',huberloss)
eTest = 1.3062
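The test set standardization used in this example can be sketched on a small matrix where the arithmetic is easy to follow. The values and the names Xtr, Xte, mu, sigma, and Zte are hypothetical; mean and std are used here in place of zscore so that the sketch needs only base MATLAB.

```matlab
Xtr = [1 10; 2 20; 3 30];    % Hypothetical training predictors
Xte = [2 40; 4 20];          % Hypothetical test predictors

mu = mean(Xtr,1);            % Column means of the training data: [2 20]
sigma = std(Xtr,0,1);        % Column standard deviations of the training data: [1 10]
sigma(sigma==0) = 1;         % Guard against constant columns

% Apply the training statistics to the test data
Zte = (Xte - mu)./sigma      % [0 2; 2 0]
```

Standardizing the test rows with their own mean and standard deviation would give different values, so the test predictors would no longer be on the scale the model was trained on.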

Input Arguments

Kernel regression model Mdl, specified as a RegressionKernel model object. You can create a RegressionKernel model object using fitrkernel.

Predictor data X, specified as an n-by-p numeric matrix, where n is the number of observations and p is the number of predictors. p must be equal to the number of predictors used to train Mdl.

Data Types: single | double

Response data, specified as an n-dimensional numeric vector. The length of Y must be equal to the number of observations in X or Tbl.

Data Types: single | double

Sample data for which to compute the loss, specified as a table. Each row of Tbl corresponds to one observation, and each column corresponds to one predictor variable. Optionally, Tbl can contain additional columns for the response variable and observation weights. Tbl must contain all the predictors used to train Mdl. Multicolumn variables and cell arrays other than cell arrays of character vectors are not allowed.

If Tbl contains the response variable used to train Mdl, then you do not need to specify ResponseVarName or Y.

If you train Mdl using sample data contained in a table, then the input data for loss must also be in a table.

Response variable name, specified as the name of a variable in Tbl. The response variable must be a numeric vector. If Tbl contains the response variable used to train Mdl, then you do not need to specify ResponseVarName.

If you specify ResponseVarName, then you must specify it as a character vector or string scalar. For example, if the response variable is stored as Tbl.Y, then specify ResponseVarName as 'Y'. Otherwise, the software treats all columns of Tbl, including Tbl.Y, as predictors.

Data Types: char | string

Name-Value Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: L = loss(Mdl,X,Y,'LossFun','epsiloninsensitive','Weights',weights) returns the weighted regression loss using the epsilon-insensitive loss function.

Loss function, specified as the comma-separated pair consisting of 'LossFun' and a built-in loss function name or a function handle.

• The following table lists the available loss functions. Specify one using its corresponding character vector or string scalar. Also, in the table, $f\left(x\right)=T\left(x\right)\beta +b.$

• x is an observation (row vector) from p predictor variables.

• $T\left(·\right)$ is a transformation of an observation (row vector) for feature expansion. T(x) maps x in ${ℝ}^{p}$ to a high-dimensional space (${ℝ}^{m}$).

• β is a vector of m coefficients.

• b is the scalar bias.

Value | Description
'epsiloninsensitive' | Epsilon-insensitive loss: $\ell \left[y,f\left(x\right)\right]=\mathrm{max}\left[0,|y-f\left(x\right)|-\epsilon \right]$
'mse' | MSE: $\ell \left[y,f\left(x\right)\right]={\left[y-f\left(x\right)\right]}^{2}$

'epsiloninsensitive' is appropriate for SVM learners only.

• Specify your own function by using function handle notation.

Let n be the number of observations in X. Your function must have this signature:

lossvalue = lossfun(Y,Yhat,W)

• The output argument lossvalue is a scalar.

• You choose the function name (lossfun).

• Y is an n-dimensional vector of observed responses. loss passes the input argument Y in for Y.

• Yhat is an n-dimensional vector of predicted responses, which is similar to the output of predict.

• W is an n-by-1 numeric vector of observation weights.

Data Types: char | string | function_handle
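As an illustration of the required signature, the sketch below writes a weighted mean absolute error in the lossfun(Y,Yhat,W) form and evaluates it directly on a few values. The handle name maeloss and the inputs Yobs, Ypred, and Wobs are hypothetical; they are not built-in options of loss.

```matlab
% Weighted mean absolute error written in the lossfun(Y,Yhat,W) form
maeloss = @(Y,Yhat,W) sum(W.*abs(Y-Yhat))/sum(W);

% Hypothetical inputs for a quick check of the handle outside of loss
Yobs = [1; 2; 3; 4];         % Observed responses
Ypred = [1.5; 2; 2; 6];      % Predicted responses
Wobs = [1; 1; 2; 1];         % Observation weights

% Weighted absolute errors sum to 0.5 + 0 + 2 + 2 = 4.5, so lossvalue = 4.5/5 = 0.9
lossvalue = maeloss(Yobs,Ypred,Wobs)
```

A handle of this form can then be supplied as the value of 'LossFun', for example loss(Mdl,X,Y,'LossFun',maeloss).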

Observation weights, specified as the comma-separated pair consisting of 'Weights' and a numeric vector or the name of a variable in Tbl.

• If Weights is a numeric vector, then the size of Weights must be equal to the number of rows in X or Tbl.

• If Weights is the name of a variable in Tbl, you must specify Weights as a character vector or string scalar. For example, if the weights are stored as Tbl.W, then specify Weights as 'W'. Otherwise, the software treats all columns of Tbl, including Tbl.W, as predictors.

If you supply the observation weights, loss computes the weighted regression loss, that is, the Weighted Mean Squared Error or Epsilon-Insensitive Loss Function.

loss normalizes Weights to sum to 1.

Data Types: double | single | char | string
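Because the weights are normalized to sum to 1, only their relative sizes affect a weighted mean loss. The following sketch illustrates this with hand-picked numbers; the names w, wn, sqErr, Lraw, Lnorm, and Lscaled are hypothetical, and the code mirrors the weighted-mean formula rather than the internal implementation of loss.

```matlab
w = [2; 1; 1];               % Hypothetical raw observation weights
wn = w/sum(w);               % Normalized weights: [0.5; 0.25; 0.25]
sqErr = [4; 1; 9];           % Hypothetical squared residuals

Lraw = sum(w.*sqErr)/sum(w);             % (8 + 1 + 9)/4 = 4.5
Lnorm = sum(wn.*sqErr);                  % 2 + 0.25 + 2.25 = 4.5
Lscaled = sum((10*w).*sqErr)/sum(10*w);  % Rescaling the weights leaves the result at 4.5
```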

Output Arguments

Regression loss, returned as a numeric scalar. The interpretation of L depends on Weights and LossFun. For example, if you use the default observation weights and specify 'epsiloninsensitive' as the loss function, then L is the epsilon-insensitive loss.

More About

Weighted Mean Squared Error

The weighted mean squared error is calculated as follows:

$\text{mse}=\frac{\sum_{j=1}^{n} w_j \left(f\left(x_j\right)-y_j\right)^{2}}{\sum_{j=1}^{n} w_j},$

where:

• n is the number of observations.

• $x_j$ is the jth observation (row of predictor data).

• $y_j$ is the observed response to $x_j$.

• $f(x_j)$ is the response prediction of the Gaussian kernel regression model Mdl to $x_j$.

• w is the vector of observation weights.

By default, w is ones(n,1)/n, so each observation weight equals 1/n. You can specify different values for the observation weights by using the 'Weights' name-value pair argument. loss normalizes Weights to sum to 1.
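The weighted mean squared error formula can be evaluated by hand on a few values. This is a sketch with hypothetical responses y, predictions fx, and weights w; it mirrors the formula and does not call loss.

```matlab
y = [3; 5; 7];               % Hypothetical observed responses
fx = [2; 5; 10];             % Hypothetical model predictions f(x_j)
w = [1; 1; 2];               % Hypothetical observation weights

% Squared errors are 1, 0, and 9, so the weighted sum is 1 + 0 + 18 = 19
mseWeighted = sum(w.*(fx - y).^2)/sum(w)   % 19/4 = 4.75

% With the default equal weights, the result is the ordinary mean: 10/3
wDefault = ones(3,1)/3;
mseDefault = sum(wDefault.*(fx - y).^2)/sum(wDefault)
```

The third observation has the largest error and twice the weight of the others, which is why the weighted value exceeds the unweighted one.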

Epsilon-Insensitive Loss Function

The epsilon-insensitive loss function ignores errors that are within the distance epsilon (ε) of the function value. The function is formally described as:

$\text{Loss}_{\epsilon}=\begin{cases} 0, & \text{if } |y-f\left(x\right)| \le \epsilon \\ |y-f\left(x\right)|-\epsilon, & \text{otherwise.} \end{cases}$

The mean epsilon-insensitive loss is calculated as follows:

$\text{Loss}=\frac{\sum_{j=1}^{n} w_j \max\left(0,|y_j-f\left(x_j\right)|-\epsilon\right)}{\sum_{j=1}^{n} w_j},$

where:

• n is the number of observations.

• $x_j$ is the jth observation (row of predictor data).

• $y_j$ is the observed response to $x_j$.

• $f(x_j)$ is the response prediction of the Gaussian kernel regression model Mdl to $x_j$.

• w is the vector of observation weights.

By default, w is ones(n,1)/n, so each observation weight equals 1/n. You can specify different values for the observation weights by using the 'Weights' name-value pair argument. loss normalizes Weights to sum to 1.
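The mean epsilon-insensitive loss can be worked through in the same way. In this sketch, the value of epsilon and the vectors y, fx, and w are hypothetical; in practice, epsilon is taken from the trained model (the Epsilon property shown in the model display).

```matlab
epsilon = 0.5;               % Hypothetical half-width of the insensitive band
y = [3; 5; 7];               % Hypothetical observed responses
fx = [2.8; 5; 10];           % Hypothetical model predictions f(x_j)
w = ones(3,1)/3;             % Default equal weights

% Absolute errors are 0.2, 0, and 3; only the last one exceeds epsilon
perObs = max(0, abs(y - fx) - epsilon);   % [0; 0; 2.5]
lossEps = sum(w.*perObs)/sum(w)           % 2.5/3
```

The first observation has a nonzero error but contributes nothing to the loss, because its error lies inside the band of width epsilon.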