
Choose a Regression Function

Regression is the process of fitting models to data. The models must have numerical responses. For models with categorical responses, see Parametric Classification or Supervised Learning Workflow and Algorithms. The regression process depends on the model. If a model is parametric, regression estimates the parameters from the data. If a model is linear in the parameters, estimation is based on methods from linear algebra that minimize the norm of a residual vector. If a model is nonlinear in the parameters, estimation is based on search methods from optimization that minimize the norm of a residual vector.
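As a rough illustration of the two estimation styles described above, the following sketch fits a model that is linear in its parameters with the backslash operator, and a model that is nonlinear in its parameters with a general-purpose search (fminsearch). The data, variable names, and the exponential model are hypothetical and chosen only for illustration. The toolbox fitting functions use their own, more specialized algorithms.

```matlab
% Hypothetical synthetic data, for illustration only
rng(0);                                  % reproducible random numbers
x = linspace(0,1,30)';

% Linear in the parameters: y = b1 + b2*x
y = 2 + 3*x + 0.1*randn(30,1);
A = [ones(30,1) x];                      % design matrix with an intercept column
bLin = A\y;                              % least-squares solution, minimizes norm(y - A*b)

% Nonlinear in the parameters: y = b1*exp(b2*x)
model = @(b,x) b(1)*exp(b(2)*x);
yn = model([2;-1],x) + 0.01*randn(30,1);
resnorm = @(b) norm(yn - model(b,x));    % norm of the residual vector
bNonlin = fminsearch(resnorm,[1;0]);     % iterative search from a starting guess
```

The linear case has a direct solution, while the nonlinear case needs a starting point and iterates toward a minimum.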

This table describes which function to use depending on the type of regression problem.

| Model Components | Result of Regression | Function to Use |
|---|---|---|
| Continuous or categorical predictors, continuous response, linear model | Fitted model coefficients | fitlm. See Linear Regression. |
| Continuous or categorical predictors, continuous response, linear model of unknown complexity | Fitted model and fitted coefficients | stepwiselm. See Stepwise Regression. |
| Continuous or categorical predictors, response possibly with restrictions such as nonnegative or integer-valued, generalized linear model | Fitted generalized linear model coefficients | fitglm or stepwiseglm. See Generalized Linear Models. |
| Continuous predictors with a continuous nonlinear response, parametrized nonlinear model | Fitted nonlinear model coefficients | fitnlm. See Nonlinear Regression. |
| Continuous predictors, continuous response, linear model | Set of models from ridge, lasso, or elastic net regression | lasso or ridge. See Lasso and Elastic Net or Ridge Regression. |
| Correlated continuous predictors, continuous response, linear model | Fitted model and fitted coefficients | plsregress. See Partial Least Squares. |
| Continuous or categorical predictors, continuous response, unknown model | Nonparametric model | fitrtree or fitrensemble. |
| Categorical predictors only | ANOVA | anova, anova1, anova2, anovan. |
| Continuous predictors, multivariable response, linear model | Fitted multivariate regression model coefficients | mvregress |
| Continuous predictors, continuous response, mixed-effects model | Fitted mixed-effects model coefficients | nlmefit or nlmefitsa. See Mixed-Effects Models. |

Update Legacy Code with New Fitting Methods

Statistics and Machine Learning Toolbox™ provides several functions for performing regression. The following sections describe how to replace calls to older functions with calls to their newer counterparts:

regress into fitlm

Previous Syntax:

[b,bint,r,rint,stats] = regress(y,X)

where X contains a column of ones.

Current Syntax:

mdl = fitlm(X,y)

where you do not add a column of ones to X.

Equivalent values of the previous outputs:

  • b: mdl.Coefficients.Estimate

  • bint: coefCI(mdl)

  • r: mdl.Residuals.Raw

  • rint: There is no exact equivalent. Try examining mdl.Residuals.Studentized to find outliers.

  • stats: mdl contains various properties that replace components of stats.
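The mapping above can be checked side by side. This is a minimal sketch on hypothetical synthetic data. The variable names (bNew, bintNew, rNew) are made up for the comparison and are not part of either function's interface.

```matlab
% Hypothetical synthetic data, for illustration only
rng(0);                                  % reproducible random numbers
X = randn(50,2);
y = 1 + 2*X(:,1) - 3*X(:,2) + 0.1*randn(50,1);

% Legacy call: regress needs an explicit column of ones for the intercept
[b,bint,r,rint,stats] = regress(y,[ones(50,1) X]);

% Current call: fitlm adds the intercept term itself
mdl = fitlm(X,y);

bNew    = mdl.Coefficients.Estimate;     % replaces b
bintNew = coefCI(mdl);                   % replaces bint (95% intervals by default)
rNew    = mdl.Residuals.Raw;             % replaces r
r2New   = mdl.Rsquared.Ordinary;         % replaces stats(1), the R-squared statistic
mseNew  = mdl.MSE;                       % replaces stats(4), the error variance estimate
```

Both calls fit the same least-squares model, so the old and new values should agree up to floating-point rounding.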

regstats into fitlm

Previous Syntax:

stats = regstats(y,X,model,whichstats)

Current Syntax:

mdl = fitlm(X,y,model)

Obtain statistics from the properties and methods of the LinearModel object (mdl). For example, see the mdl.Diagnostics and mdl.Residuals properties.
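As a sketch of where some commonly requested regstats fields end up, the following example reads the leverage, Cook's distance, and studentized residuals from the LinearModel object. The data is synthetic and hypothetical, and the three statistics are an arbitrary selection rather than a complete mapping.

```matlab
% Hypothetical synthetic data, for illustration only
rng(0);                                  % reproducible random numbers
X = randn(40,2);
y = 1 + X(:,1) - 2*X(:,2) + 0.3*randn(40,1);

% Legacy call: request selected statistics by name
stats = regstats(y,X,'linear',{'leverage','cookd','studres'});

% Current call: the same quantities are properties of the fitted model
mdl = fitlm(X,y,'linear');
lev     = mdl.Diagnostics.Leverage;        % compare with stats.leverage
cookd   = mdl.Diagnostics.CooksDistance;   % compare with stats.cookd
studres = mdl.Residuals.Studentized;       % compare with stats.studres
```

mdl.Diagnostics and mdl.Residuals are tables, so the individual statistics are accessed as table variables.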

robustfit into fitlm

Previous Syntax:

[b,stats] = robustfit(X,y,wfun,tune,const)

Current Syntax:

mdl = fitlm(X,y,'RobustOpts','on') % bisquare weight function (default)

Or, to use the weight function wfun and the tuning parameter tune:

opt.RobustWgtFun = wfun; % weight function name, for example 'huber'
opt.Tune = tune; % optional
mdl = fitlm(X,y,'RobustOpts',opt)

Obtain statistics from the properties and methods of the LinearModel object (mdl). For example, see the mdl.Diagnostics and mdl.Residuals properties.
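The following sketch shows both forms of the robust call on hypothetical synthetic data with one deliberately corrupted observation. It uses the full argument name 'RobustOpts'. The 'huber' weight function and the tuning constant 1.345 are example choices, not recommendations.

```matlab
% Hypothetical synthetic data with one gross outlier, for illustration only
rng(1);                                  % reproducible random numbers
x = (1:20)';
y = 3 + 2*x + randn(20,1);
y(20) = 0;                               % corrupt the last observation

% Legacy call: default bisquare weights, intercept added automatically
bOld = robustfit(x,y);

% Current call with the default bisquare weight function
mdl = fitlm(x,y,'RobustOpts','on');

% Current call with an explicit weight function and tuning constant
opt.RobustWgtFun = 'huber';
opt.Tune = 1.345;
mdlHuber = fitlm(x,y,'RobustOpts',opt);

w = mdl.Robust.Weights;                  % final robust weights, one per observation
```

Observations that the robust fit treats as outliers receive small weights in w, which is one way to see which points were downweighted.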

stepwisefit into stepwiselm

Previous Syntax:

[b,se,pval,inmodel,stats,nextstep,history] = stepwisefit(X,y,Name,Value)

Current Syntax:

mdl = stepwiselm(ds,modelspec,Name,Value)

or

mdl = stepwiselm(X,y,modelspec,Name,Value)

Obtain statistics from the properties and methods of the LinearModel object (mdl). For example, see the mdl.Diagnostics and mdl.Residuals properties.
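A minimal sketch of the matrix form follows, on hypothetical synthetic data where only the first and third predictors affect the response. Setting 'Upper' to 'linear' is an assumption made here to keep the search restricted to main effects, since stepwiselm otherwise also considers interaction terms.

```matlab
% Hypothetical synthetic data, for illustration only
rng(2);                                  % reproducible random numbers
X = randn(100,4);
y = 5 + 2*X(:,1) - 1.5*X(:,3) + 0.2*randn(100,1);

% Start from a constant model and allow only linear terms to enter
mdl = stepwiselm(X,y,'constant','Upper','linear','Verbose',0);

terms   = mdl.CoefficientNames;          % names of the terms in the final model
b       = mdl.Coefficients.Estimate;     % fitted coefficients for those terms
se      = mdl.Coefficients.SE;           % their standard errors
pval    = mdl.Coefficients.pValue;       % their p-values
inmodel = mdl.VariableInfo.InModel;      % logical flag per variable (response included)
```

Unlike the stepwisefit outputs, the coefficient table of mdl lists only the terms that ended up in the final model.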

glmfit into fitglm

Previous Syntax:

[b,dev,stats] = glmfit(X,y,distr,param1,val1,...)

Current Syntax:

mdl = fitglm(X,y,'Distribution',distr,...)

Obtain statistics from the properties and methods of the GeneralizedLinearModel object (mdl). For example, the deviance is mdl.Deviance, and to compare mdl against a constant model, use devianceTest(mdl).
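The following sketch compares the two calls for a Poisson model on hypothetical synthetic data. It passes the distribution through the 'Distribution' name-value argument of fitglm. The variable names bNew and devNew are made up for the comparison.

```matlab
% Hypothetical synthetic count data, for illustration only
rng(3);                                  % reproducible random numbers
x = linspace(0,2,80)';
y = poissrnd(exp(0.5 + 1.2*x));

% Legacy call: distribution is the third positional argument
[b,dev] = glmfit(x,y,'poisson');

% Current call: distribution is a name-value argument
mdl = fitglm(x,y,'Distribution','poisson');

bNew   = mdl.Coefficients.Estimate;      % replaces b
devNew = mdl.Deviance;                   % replaces dev
tbl    = devianceTest(mdl);              % test against a constant model
```

Both functions add the intercept term by default, so no column of ones is needed in either call.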

nlinfit into fitnlm

Previous Syntax:

[beta,r,J,COVB,mse] = nlinfit(X,y,fun,beta0,options)

Current Syntax:

mdl = fitnlm(X,y,fun,beta0,'Options',options)

Equivalent values of the previous outputs:

  • beta: mdl.Coefficients.Estimate

  • r: mdl.Residuals.Raw

  • COVB: mdl.CoefficientCovariance

  • mse: mdl.MSE

mdl does not provide the Jacobian (J) output. The primary purpose of J was to pass it into nlparci or nlpredci to obtain confidence intervals for the estimated coefficients (parameters) or predictions. Obtain those confidence intervals as:

parci = coefCI(mdl)
[pred,predci] = predict(mdl)
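The following sketch puts the old and new workflows next to each other for a hypothetical exponential decay model. The data, the model function, and the starting values are assumptions for illustration. The sketch passes the predictor values to predict explicitly.

```matlab
% Hypothetical synthetic data and model, for illustration only
rng(4);                                  % reproducible random numbers
x = linspace(0,5,40)';
fun = @(b,x) b(1)*exp(-b(2)*x);          % model function with parameters b
y = fun([3;0.8],x) + 0.05*randn(40,1);
beta0 = [1;1];                           % starting values for the search

% Legacy workflow: the Jacobian is needed for the confidence intervals
[beta,r,J,COVB,mse] = nlinfit(x,y,fun,beta0);
ciOld = nlparci(beta,r,'Jacobian',J);

% Current workflow: the model object carries what the intervals need
mdl = fitnlm(x,y,fun,beta0);
ciNew = coefCI(mdl);                     % coefficient confidence intervals
[pred,predci] = predict(mdl,x);          % predictions and their confidence intervals
```

Because both workflows start from the same beta0 on the same data, the estimates and intervals should agree closely.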