Train Regression Models in Regression Learner App
You can use Regression Learner to train regression models, including linear regression models, regression trees, Gaussian process regression models, support vector machines, kernel approximation models, ensembles of regression trees, and neural network regression models. In addition to training models, you can explore your data, select features, specify validation schemes, and evaluate results. You can export a model to the workspace to use the model with new data, or generate MATLAB® code to learn about programmatic regression.
Training a model in Regression Learner consists of two parts:
Validated Model: Train a model with a validation scheme. By default, the app protects against overfitting by applying cross-validation. Alternatively, you can choose holdout validation. The validated model is visible in the app.
Full Model: Train a model on full data, excluding test data. The app trains this model simultaneously with the validated model. However, the model trained on full data is not visible in the app. When you choose a regression model to export to the workspace, Regression Learner exports the full model.
The app does not use test data for model training. Models exported from the app are trained on the full data, excluding any data reserved for testing.
The app displays the results of the validated model. Diagnostic measures, such as model accuracy, and plots, such as a response plot or residuals plot, reflect the validated model results. You can automatically train one or more regression models, compare validation results, and choose the model that works best for your regression problem. When you choose a model to export to the workspace, Regression Learner exports the full model. Because Regression Learner creates a model object of the full model during training, you experience no lag time when you export the model. You can use the exported model to make predictions on new data.
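The distinction between the validated model and the full model can be sketched at the command line. This is a minimal illustration, not the code the app runs: it assumes the carsmall sample data set, a regression tree, and 5-fold cross-validation, and the variable names (tbl, cvMdl, fullMdl) are chosen for this example only.

```matlab
% Sketch: validated model vs. full model, assuming the carsmall sample data.
load carsmall
tbl = rmmissing(table(Weight, Horsepower, Displacement, MPG)); % drop rows with missing values

% Validated model: cross-validation estimates how well the model generalizes.
rng(0)                                    % reproducible fold assignment
cvMdl = fitrtree(tbl, 'MPG', 'KFold', 5);
rmseValidation = sqrt(kfoldLoss(cvMdl));  % kfoldLoss returns mean squared error by default

% Full model: trained on all available (non-test) rows and used for prediction.
fullMdl = fitrtree(tbl, 'MPG');
yNew = predict(fullMdl, tbl(1:3, :));     % demo prediction on the first three rows
```

The validation RMSE describes the cross-validated model, while predictions come from the model trained on all rows, which mirrors the two parts described above.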
Automated Regression Model Training
You can use Regression Learner to automatically train a selection of different regression models on your data.
Get started by automatically training multiple models simultaneously. You can quickly try a selection of models, and then explore promising models interactively.
If you already know what model type you want, then you can train individual models instead. See Manual Regression Model Training.
On the Apps tab, in the Machine Learning and Deep Learning group, click Regression Learner.
In the Regression Learner app, on the Regression Learner tab, in the File section, click New Session and select data from the workspace or from a file. Specify a response variable and variables to use as predictors. Alternatively, click Open to open a previously saved app session. See Select Data for Regression or Open Saved App Session.
In the Models section, click the arrow to expand the list of regression models. Select All Quick-To-Train. This option trains all the model presets that are fast to fit.
In the Train section, click Train All and select Train All.
If you have Parallel Computing Toolbox™, the app trains the models in parallel by default. See Parallel Regression Model Training.
A selection of model types appears in the Models pane. When the models finish training, the best RMSE (Validation) score is outlined in a box.
Click models in the Models pane and open the corresponding plots to explore the results.
To try all the nonoptimizable model presets available, click All in the Models section of the Regression Learner tab. Then, in the Train section, click Train All and select Train Selected.
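A rough programmatic analogue of training several models and comparing their validation scores is shown below. It is a sketch under stated assumptions: the candidate list, the settings (for example, the minimum leaf sizes), and the carsmall data are chosen for illustration and are not claimed to match the app's presets exactly.

```matlab
% Sketch: train a few model types on the same folds and rank them by validation RMSE.
load carsmall
tbl = rmmissing(table(Weight, Horsepower, Displacement, MPG));
rng(0)
c = cvpartition(height(tbl), 'KFold', 5);  % shared partition so scores are comparable

% Hypothetical candidate list; each handle returns a cross-validated model.
names = {'Fine tree'; 'Coarse tree'; 'Linear SVM'; 'Bagged trees'};
trainers = { ...
    @() fitrtree(tbl, 'MPG', 'MinLeafSize', 4,  'CVPartition', c); ...
    @() fitrtree(tbl, 'MPG', 'MinLeafSize', 36, 'CVPartition', c); ...
    @() fitrsvm(tbl, 'MPG', 'Standardize', true, 'CVPartition', c); ...
    @() fitrensemble(tbl, 'MPG', 'Method', 'Bag', 'CVPartition', c)};

rmse = zeros(numel(trainers), 1);
for i = 1:numel(trainers)
    trainFcn = trainers{i};
    cvMdl = trainFcn();                    % train one cross-validated model
    rmse(i) = sqrt(kfoldLoss(cvMdl));      % validation RMSE
end

% Lowest RMSE first, similar in spirit to sorting the Models pane.
results = sortrows(table(names, rmse, 'VariableNames', {'Model', 'RMSE'}), 'RMSE');
disp(results)
```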
Manual Regression Model Training
To explore individual model types, you can train models one at a time or as a group.
Choose a model type. On the Regression Learner tab, in the Models section, click a model type. To see all available model options, click the arrow in the Models section to expand the list of regression models. The nonoptimizable model options in the gallery are preset starting points with different settings, suitable for a range of different regression problems.
To read descriptions of the models, switch to the details view.
For more information on each option, see Choose Regression Model Options.
After selecting a model, you can train the model. In the Train section, click Train All and select Train Selected. Repeat the process to explore different models.
Alternatively, you can create several draft models and then train the models as a group. In the Train section, click Train All and select Train All.
Select regression trees first. If your trained models do not predict the response accurately enough, then try other models with higher flexibility. To avoid overfitting, look for a less flexible model that provides sufficient accuracy.
If you want to try all nonoptimizable models of the same or different types, then select one of the All options in the Models gallery.
Alternatively, if you want to automatically tune hyperparameters of a specific model type, select the corresponding Optimizable model and perform hyperparameter optimization. For more information, see Hyperparameter Optimization in Regression Learner App.
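At the command line, automatic hyperparameter tuning of a single model type might look like the following sketch. It assumes a regression tree and the carsmall data; the evaluation budget of 10 is an arbitrary choice for this example, and the app's own optimization settings may differ.

```matlab
% Sketch: automatic hyperparameter tuning for a regression tree.
load carsmall
tbl = rmmissing(table(Weight, Horsepower, Displacement, MPG));
rng(0)

% Keep the run short and quiet; these option values are illustrative only.
opts = struct('MaxObjectiveEvaluations', 10, 'ShowPlots', false, 'Verbose', 0);
mdl = fitrtree(tbl, 'MPG', ...
    'OptimizeHyperparameters', 'auto', ...
    'HyperparameterOptimizationOptions', opts);

% Hyperparameter values at the lowest observed objective.
bestObserved = mdl.HyperparameterOptimizationResults.XAtMinObjective;
disp(bestObserved)
```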
For next steps, see Compare and Improve Regression Models.
Parallel Regression Model Training
You can train models in parallel using Regression Learner if you have Parallel Computing Toolbox. Parallel training allows you to train multiple models simultaneously and continue working in the app while they train.
To control parallel training, toggle the Use Parallel button in the Train section of the Regression Learner tab. To train draft models in parallel, ensure the button is toggled on before clicking the Train All button. The Use Parallel button is available only if you have Parallel Computing Toolbox.
The Use Parallel button is on by default. The first time you click Train All and select Train All or Train Selected, a dialog box is displayed while the app opens a parallel pool of workers. After the pool opens, you can train multiple models at once.
When models are training in parallel, progress indicators appear on each training and queued model in the Models pane. If you want, you can cancel individual models. During training, you can examine results and plots from models, and initiate training of more models.
If you have Parallel Computing Toolbox, then parallel training is available for nonoptimizable models in Regression Learner, and you do not need to set a separate parallel computing option to enable it.
Even if you do not have Parallel Computing Toolbox, you can keep the app responsive during model training. Before training draft models, on the Regression Learner tab, in the Train section, click Train All and ensure the Use Background Training check box is selected. Then, select the Train All option. A dialog box is displayed while the app opens a background pool. After the pool opens, you can continue to interact with the app while models train in the background.
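Outside the app, a comparable effect can be approximated with a parfor loop that trains one model per iteration. This is only a sketch of the idea, not how the app schedules its training: the leaf sizes and the carsmall data are assumptions, and without Parallel Computing Toolbox the loop simply runs serially.

```matlab
% Sketch: train several tree variants in parallel, one per loop iteration.
load carsmall
tbl = rmmissing(table(Weight, Horsepower, Displacement, MPG));
rng(0)
c = cvpartition(height(tbl), 'KFold', 5);   % same folds for every variant

minLeaf = [4 12 36];                        % illustrative minimum leaf sizes
rmse = zeros(size(minLeaf));
parfor i = 1:numel(minLeaf)
    cvMdl = fitrtree(tbl, 'MPG', 'MinLeafSize', minLeaf(i), 'CVPartition', c);
    rmse(i) = sqrt(kfoldLoss(cvMdl));       % validation RMSE for this variant
end
disp(table(minLeaf(:), rmse(:), 'VariableNames', {'MinLeafSize', 'RMSE'}))
```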
Compare and Improve Regression Models
Examine the RMSE (Validation) score reported in the Models pane for each model. Click models in the Models pane and open the corresponding plots to explore the results. Compare model performance by inspecting results in the plots. You can rearrange the layout of the plots to compare results across multiple models: use the options in the Layout button, drag and drop plots, or select the options provided by the Document Actions arrow located to the right of the model plot tabs.
Additionally, you can compare the models by using the Sort by options in the Models pane. Delete any unwanted model by selecting the model and clicking the Delete selected model button in the upper right of the pane, clicking the Delete button in the Models section of the Regression Learner tab, or right-clicking the model and selecting Delete.
Select the best model in the Models pane and then try including and excluding different features in the model.
First, create a copy of the model. After selecting the model, either click the Duplicate button in the Models section of the Regression Learner tab or right-click the model and select Duplicate.
Then, click Feature Selection in the Options section of the Regression Learner tab. Use the available feature ranking algorithms to select features.
Try the response plot to help you identify features to remove. See if you can improve the model by removing features with low predictive power. Specify predictors to include in the model, and train new models using the new options. Compare results among the models in the Models pane.
You also can try transforming features with PCA to reduce dimensionality. Click PCA in the Options section of the Regression Learner tab.
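Feature ranking and PCA can also be tried programmatically. The sketch below assumes the carsmall data, uses an F-test ranking as one example of a ranking algorithm, and keeps enough principal components to explain 95% of the variance. Standardizing before PCA and the 95% threshold are choices made for this example.

```matlab
% Sketch: rank predictors, then reduce dimensionality with PCA.
load carsmall
predictorNames = {'Weight', 'Horsepower', 'Displacement', 'Acceleration'};
tbl = rmmissing(table(Weight, Horsepower, Displacement, Acceleration, MPG));
X = tbl{:, predictorNames};
y = tbl.MPG;

% Univariate F-test ranking; idx lists predictors from most to least important.
[idx, scores] = fsrftest(X, y);
rankedNames = predictorNames(idx);
disp(rankedNames)

% PCA on standardized predictors; keep components explaining at least 95% variance.
[~, score, ~, ~, explained] = pca(zscore(X));
numComp = find(cumsum(explained) >= 95, 1);
Xreduced = score(:, 1:numComp);

% Compare validation RMSE with all predictors and with the PCA components.
rng(0)
rmseAll = sqrt(kfoldLoss(fitrtree(X, y, 'KFold', 5)));
rmsePCA = sqrt(kfoldLoss(fitrtree(Xreduced, y, 'KFold', 5)));
```

Whether the reduced model is better depends on the data, so compare the two RMSE values rather than assuming an improvement.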
To try to improve the model further, you can duplicate it, change the hyperparameter options in the Model Hyperparameters section of the model Summary tab, and then train the model using the new options. To learn how to control model flexibility, see Choose Regression Model Options. For information on how to tune model hyperparameters automatically, see Hyperparameter Optimization in Regression Learner App.
If feature selection, PCA, or new hyperparameter values improve your model, try training All model types with the new settings. See if another model type does better with the new settings.
To avoid overfitting, look for a less flexible model that provides sufficient accuracy. For example, look for simple models, such as regression trees, which are fast and easy to interpret. If your models are not accurate enough, then try other models with higher flexibility, such as ensembles. To learn about model flexibility, see Choose Regression Model Options.
This figure shows the app with a Models pane containing various regression model types.
For a step-by-step example comparing different regression models, see Train Regression Trees Using Regression Learner App.
Next, you can generate code to train the model with different data or export trained models to the workspace to make predictions using new data. See Export Regression Model to Predict New Data.
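As a rough command-line counterpart to predicting with an exported model, the sketch below reserves test data, trains on the remaining rows, and evaluates predictions on the held-out rows. The carsmall data, the 20% holdout fraction, and the variable names are assumptions for illustration.

```matlab
% Sketch: reserve test data, train on the rest, and predict on unseen rows.
load carsmall
tbl = rmmissing(table(Weight, Horsepower, Displacement, MPG));
rng(0)
c = cvpartition(height(tbl), 'HoldOut', 0.2);  % reserve 20% of rows for testing
trainTbl = tbl(training(c), :);
testTbl  = tbl(test(c), :);

mdl = fitrtree(trainTbl, 'MPG');               % trained without the test rows
yfit = predict(mdl, testTbl);
rmseTest = sqrt(mean((testTbl.MPG - yfit).^2));

% For a model exported from the app as a structure (assumed here to have the
% name trainedModel), predictions would typically take this form:
%   yfit = trainedModel.predictFcn(newTbl);
```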