Statistics and Machine Learning Toolbox

Analyze and model data using statistics and machine learning

Statistics and Machine Learning Toolbox™ provides functions and apps to describe, analyze, and model data. You can use descriptive statistics and plots for exploratory data analysis, fit probability distributions to data, generate random numbers for Monte Carlo simulations, and perform hypothesis tests. Regression and classification algorithms let you draw inferences from data and build predictive models.

For multidimensional data analysis, Statistics and Machine Learning Toolbox provides feature selection, stepwise regression, principal component analysis (PCA), regularization, and other dimensionality reduction methods that let you identify variables or features that impact your model.

The toolbox provides supervised and unsupervised machine learning algorithms, including support vector machines (SVMs), boosted and bagged decision trees, k-nearest neighbor, k-means, k-medoids, hierarchical clustering, Gaussian mixture models, and hidden Markov models. Many of these statistics and machine learning algorithms also work on data sets that are too large to fit in memory.

Get Started

Learn the basics of Statistics and Machine Learning Toolbox

Descriptive Statistics and Visualization

Data import and export, descriptive statistics, visualization

Probability Distributions

Data frequency models, random sample generation, parameter estimation

Hypothesis Tests

t-test, F-test, chi-square goodness-of-fit test, and more

Cluster Analysis

Unsupervised learning techniques to find natural groupings and patterns in data

ANOVA

Analysis of variance and covariance, multivariate ANOVA, repeated measures ANOVA

Regression

Linear, generalized linear, nonlinear, and nonparametric techniques for supervised learning

Classification

Supervised learning algorithms for binary and multiclass problems

Dimensionality Reduction and Feature Extraction

PCA, factor analysis, feature selection, feature extraction, and more

Industrial Statistics

Design of experiments (DOE); survival and reliability analysis; statistical process control

Analysis of Big Data with Tall Arrays

Analyze out-of-memory data

Speed Up Statistical Computations

Parallel or distributed computation of statistical functions

Code Generation

Generate C/C++ code and MEX functions for Statistics and Machine Learning Toolbox functions