Statistics and Machine Learning Toolbox


Version 11.5, part of Release 2019a, includes the following enhancements:

  • Machine Learning Apps: Train naïve Bayes models in Classification Learner, and export visualizations to figures in Classification Learner and Regression Learner
  • Machine Learning Algorithms: Perform density-based spatial clustering of applications with noise (DBSCAN; sketched below), hyperparameter optimization for multiclass classification with kernels, and accelerated training of gradient-boosted trees (similar to XGBoost)
  • Code Generation: Generate C/C++ code for kernel density estimation and for prediction using naïve Bayes models
  • Deployed Model Updating: Update deployed multiclass SVM models in-place without regenerating code
  • Big Data Algorithms: Use cost, prior, and weights with TreeBagger, perform regression using decision trees, and automate hyperparameter optimization for binary regression trees and multiclass classification on tall arrays

See the Release Notes for details.
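
As a rough illustration of the DBSCAN item above, the sketch below clusters synthetic 2-D data with the dbscan function; the epsilon value of 0.9 and the minimum neighbor count of 5 are illustrative choices, not recommendations from the release notes.

    % Density-based clustering of two synthetic Gaussian blobs
    rng(1);                                  % reproducible random data
    X = [randn(100,2); randn(100,2) + 5];    % two well-separated clusters
    idx = dbscan(X, 0.9, 5);                 % epsilon = 0.9, minpts = 5
    gscatter(X(:,1), X(:,2), idx);           % noise points are labeled -1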

Version 11.4, part of Release 2018b, includes the following enhancements:

  • Big Data Algorithms: Fit multiclass classification models, perform hyperparameter optimization, specify misclassification costs and prior probabilities when fitting classification models, compute approximate quantiles, and expand categorical variables into dummy variables on out-of-memory data
  • Code Generation: Update a deployed SVM model without regenerating code (requires MATLAB Coder)
  • Nonlinear (Kernel) Classification and Regression: Perform hyperparameter optimization and cross-validation when fitting models using fitckernel and fitrkernel
  • Multiclass Nonlinear Classification: Perform multiclass learning with nonlinear kernel binary learners by using fitcecoc (sketched below together with confusionchart)
  • Visualization: Display a confusion matrix using confusionchart

See the Release Notes for details.
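
As a rough sketch of the multiclass kernel classification and confusionchart items above, the code below fits an ECOC model with kernel binary learners on the bundled fisheriris data and charts the resubstitution confusion matrix; the data set and the 'Learners','kernel' setting are illustrative choices.

    % Multiclass learning with nonlinear kernel binary learners
    load fisheriris                              % meas (150x4), species labels
    Mdl = fitcecoc(meas, species, 'Learners', 'kernel');
    pred = predict(Mdl, meas);                   % resubstitution predictions
    confusionchart(species, pred);               % confusion matrix display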

Version 11.3, part of Release 2018a, includes the following enhancements:

  • Code Generation: Generate C code for distance calculations on vectors and matrices, and for prediction by using k-nearest neighbor models with Kd-tree search and by using non-tree ensemble models (requires MATLAB Coder)
  • Nonlinear Regression for Big Data: Fit kernel SVM regression models by using random feature expansion (see the sketch below)
  • Big Data Algorithms: Compute confusion matrices and create nonstratified partitions for cross-validation on out-of-memory data
  • Classification Learner App: Visualize and investigate high-density data with improved scatter plots

See the Release Notes for details.
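
A minimal sketch of kernel regression via random feature expansion, assuming the fitrkernel function with default settings; it runs on small in-memory data for brevity, although the release item targets out-of-memory data, and the synthetic data and error metric are illustrative only.

    % Gaussian kernel regression through random feature expansion
    rng(1);
    X = rand(1000,5);
    y = sin(2*pi*X(:,1)) + 0.1*randn(1000,1);    % nonlinear target with noise
    Mdl = fitrkernel(X, y);                      % kernel SVM regression by default
    rmse = sqrt(mean((y - predict(Mdl, X)).^2)); % resubstitution error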

Version 11.2, part of Release 2017b, includes the following enhancements:

  • Code Generation: Generate C code for prediction by using discriminant analysis, k-nearest neighbor, SVM regression, regression tree ensemble, and Gaussian process regression models (requires MATLAB Coder)
  • Big Data Algorithms: Fit kernel SVM classification models by using random feature expansion, fit linear SVM regression models, grow decision trees, and draw weighted random samples from out-of-memory data
  • Parallel Bayesian Optimization: Tune hyperparameters faster by using parallel function evaluation (requires Parallel Computing Toolbox)
  • Machine Learning Apps: Select training data more efficiently in the Classification Learner and Regression Learner apps
  • Partial Dependence Plots: Visualize relationships between features and predicted responses through marginalization (see the sketch below)

See the Release Notes for details.
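
A minimal sketch of a partial dependence plot, assuming the plotPartialDependence function applied to a simple regression tree; the carsmall data set and the choice of the Weight predictor are illustrative.

    % Partial dependence of predicted MPG on vehicle weight
    load carsmall
    tbl = table(Weight, Horsepower, MPG);
    Mdl = fitrtree(tbl, 'MPG');                  % simple regression tree
    plotPartialDependence(Mdl, 'Weight');        % marginal effect of Weight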

Version 11.1, part of Release 2017a, includes the following enhancements:

  • Regression Learner App: Train regression models using supervised machine learning
  • Big Data Algorithms: Perform support vector machine (SVM) and naïve Bayes classification, create bags of decision trees, and fit lasso regression on out-of-memory data
  • Code Generation: Generate C code for prediction by using linear models, generalized linear models, decision trees, and ensembles of classification trees (requires MATLAB Coder)
  • Bayesian Statistics: Perform gradient-based sampling using the Hamiltonian Monte Carlo (HMC) sampler (see the sketch below)
  • Feature Extraction: Perform unsupervised feature learning by using sparse filtering and reconstruction independent component analysis (RICA)

See the Release Notes for details.
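
A minimal sketch of the HMC item, assuming the hmcSampler interface; the target density (a standard 2-D normal), the starting point, and the sample count are illustrative.

    % Hamiltonian Monte Carlo sampling from a standard bivariate normal
    logpdf = @(x) deal(-0.5*sum(x.^2), -x);      % log density and its gradient
    smp = hmcSampler(logpdf, zeros(2,1));        % start at the origin
    chain = drawSamples(smp, 'NumSamples', 1000);
    disp(mean(chain));                           % sample mean, near [0 0]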

Version 11.0, part of Release 2016b, includes the following enhancements:

  • Big Data Algorithms: Perform dimension reduction, descriptive statistics, k-means clustering, linear regression, logistic regression, and discriminant analysis on out-of-memory data
  • Bayesian Optimization: Tune machine learning algorithms by searching for optimal hyperparameters (see the sketch below)
  • Feature Selection: Use neighborhood component analysis (NCA) to choose features for machine learning models
  • Code Generation: Generate C code for prediction by using SVM and logistic regression models (requires MATLAB Coder)
  • Classification Learner: Train classifiers in parallel (requires Parallel Computing Toolbox)
  • Machine Learning Performance: Speed up Gaussian mixture modeling, SVM with duplicate observations, and distance calculations for sparse data
  • Survival Analysis: Fit Cox proportional hazards models with new options for residuals and handling ties

See the Release Notes for details.
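
A minimal sketch of Bayesian hyperparameter optimization, assuming the 'OptimizeHyperparameters' option of fitcsvm; the ionosphere data set and the suppressed plots are illustrative choices, and the run can take a minute or two.

    % Tune an SVM's BoxConstraint and KernelScale by Bayesian optimization
    load ionosphere                              % X (351x34), Y class labels
    rng(1);
    Mdl = fitcsvm(X, Y, 'OptimizeHyperparameters', 'auto', ...
        'HyperparameterOptimizationOptions', struct('ShowPlots', false));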

Version 10.2, part of Release 2016a, includes the following enhancements:

  • Machine Learning for High-Dimensional Data: Perform fast fitting of linear classification and regression models with techniques such as stochastic gradient descent and (L)BFGS, using the fitclinear and fitrlinear functions (see the sketch below)
  • Classification Learner: Train multiple models automatically, visualize results by class labels, and perform logistic regression classification
  • Performance: Perform clustering using kmeans, kmedoids, and Gaussian mixture models faster when data has a large number of clusters
  • Probability Distributions: Fit a kernel smoothing density to multivariate data using the ksdensity and mvksdensity functions
  • Stable Distributions: Model financial and other data that requires heavy-tailed distributions

See the Release Notes for details.
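
A minimal sketch of fast linear classification on high-dimensional sparse data with fitclinear; the synthetic data, the number of informative features, and the use of default solver settings are illustrative.

    % Linear SVM classification of sparse, high-dimensional data
    rng(1);
    X = sprandn(1000, 5000, 0.01);               % sparse 1000-by-5000 predictors
    w = [2*ones(10,1); zeros(4990,1)];           % only 10 informative features
    y = sign(full(X*w) + 0.1*randn(1000,1));     % labels in {-1,+1}
    Mdl = fitclinear(X, y);                      % linear SVM with default solver
    L = loss(Mdl, X, y);                         % resubstitution classification loss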