Mahalanobis distance to Gaussian mixture component
Measure Mahalanobis Distance
Generate random variates that follow a mixture of two bivariate Gaussian distributions by using the
mvnrnd function. Fit a Gaussian mixture model (GMM) to the generated data by using the
fitgmdist function, and then compute Mahalanobis distances between the generated data and the mixture components of the fitted GMM.
Define the distribution parameters (means and covariances) of two bivariate Gaussian mixture components.
rng('default') % For reproducibility
mu1 = [1 2]; % Mean of the 1st component
sigma1 = [2 0; 0 .5]; % Covariance of the 1st component
mu2 = [-3 -5]; % Mean of the 2nd component
sigma2 = [1 0; 0 1]; % Covariance of the 2nd component
Generate an equal number of random variates from each component, and combine the two sets of random variates.
r1 = mvnrnd(mu1,sigma1,1000);
r2 = mvnrnd(mu2,sigma2,1000);
X = [r1; r2];
The combined data set
X contains random variates following a mixture of two bivariate Gaussian distributions.
Fit a two-component GMM to X.
gm = fitgmdist(X,2)
gm =
Gaussian mixture distribution with 2 components in 2 dimensions
Component 1:
Mixing proportion: 0.500000
Mean:   -2.9617   -4.9727

Component 2:
Mixing proportion: 0.500000
Mean:    0.9539    2.0261
fitgmdist fits a GMM to X using two mixture components. The means of Component 1 and Component 2 are [-2.9617,-4.9727] and [0.9539,2.0261], which are close to mu2 and mu1, respectively.
Compute the Mahalanobis distance of each point in X to each component of gm.
d2 = mahal(gm,X);
Plot X by using scatter, and use marker color to visualize the Mahalanobis distance to Component 1.
scatter(X(:,1),X(:,2),10,d2(:,1),'.') % Scatter plot with points of size 10
c = colorbar;
ylabel(c,'Mahalanobis Distance to Component 1')
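As a possible follow-up to the example, the columns of d2 can be compared row by row to find which component each observation is closest to. The sketch below regenerates the same data so that it runs on its own; the variable names nearest and counts are illustrative and not part of the original example. Note that the nearest component by Mahalanobis distance is not necessarily the component that the cluster function would assign, because posterior-based assignment also accounts for the mixing proportions and the covariance determinants.

```matlab
rng('default') % For reproducibility
X = [mvnrnd([1 2],[2 0; 0 .5],1000); mvnrnd([-3 -5],[1 0; 0 1],1000)];
gm = fitgmdist(X,2);
d2 = mahal(gm,X); % n-by-k matrix of squared distances

% For each observation (row), find the column with the smallest distance
[~,nearest] = min(d2,[],2);

% Count how many observations are closest to each component
counts = accumarray(nearest,1,[size(d2,2) 1]);
```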
gm — Gaussian mixture distribution
Gaussian mixture distribution, also called Gaussian mixture model (GMM), specified as a gmdistribution object.
You can create a gmdistribution object using gmdistribution or fitgmdist. Use the gmdistribution function to create a gmdistribution object by specifying the distribution parameters. Use the fitgmdist function to fit a gmdistribution model to data given a fixed number of components.
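A minimal sketch of the first route, creating the object directly from known parameters rather than fitting it. The parameter values are borrowed from the example on this page, and the query point x is an illustrative choice; the mixing proportions are left at their defaults.

```matlab
mu = [1 2; -3 -5];                      % One row of means per component
sigma = cat(3,[2 0; 0 .5],[1 0; 0 1]);  % One 2-by-2 covariance page per component
gm = gmdistribution(mu,sigma);

x = [1 2];        % Query point placed exactly at the mean of component 1
d2 = mahal(gm,x); % 1-by-2 row: squared distance to each component
```

Because x coincides with the first mean, d2(1) is 0. The second covariance is the identity matrix, so d2(2) reduces to the squared Euclidean distance to [-3 -5], which is 4^2 + 7^2 = 65.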
X — Data
n-by-m numeric matrix
Data, specified as an n-by-m numeric matrix, where n is the number of observations and m is the number of variables in each observation.
If a row of X contains NaNs, then mahal excludes the row from the computation. The corresponding value in d2 is NaN.
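The missing-value behavior can be illustrated with a small sketch. The data matrix and the names rowHasNaN, Xclean, and d2clean are hypothetical and chosen only for this illustration; dropping incomplete rows beforehand is one option if NaN entries in the output are not wanted.

```matlab
gm = gmdistribution([1 2; -3 -5],cat(3,[2 0; 0 .5],[1 0; 0 1]));

X = [1 2; NaN 0; -3 -5]; % Second observation has a missing value
d2 = mahal(gm,X);        % Row 2 of d2 is expected to be NaN

% Optionally remove incomplete observations before calling mahal
rowHasNaN = any(isnan(X),2);
Xclean = X(~rowHasNaN,:);
d2clean = mahal(gm,Xclean);
```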
d2 — Squared Mahalanobis distance
n-by-k numeric matrix
Squared Mahalanobis distance of each observation in X to each Gaussian mixture component in gm, returned as an n-by-k numeric matrix, where n is the number of observations in X and k is the number of mixture components in gm.
d2(i,j) is the squared distance of observation i to the jth Gaussian mixture component.
The Mahalanobis distance is a measure between a sample point and a distribution.
The Mahalanobis distance from a row vector x to a distribution with mean μ and covariance Σ is

$$d(x) = \sqrt{(x-\mu)\,\Sigma^{-1}\,(x-\mu)^{\top}}$$
This distance represents how far x is from the mean in number of standard deviations.
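As a sanity check, the quadratic form (x - μ) Σ^(-1) (x - μ)' can be evaluated by hand for each component and compared with the output of mahal. This is a sketch under stated assumptions, not a description of how mahal is implemented internally: it assumes gm stores full, unshared covariances, so that gm.Sigma(:,:,j) is the covariance of component j. The names delta, d2manual, and d are illustrative.

```matlab
gm = gmdistribution([1 2; -3 -5],cat(3,[2 0; 0 .5],[1 0; 0 1]));
X = [0 0; 1 2; -3 -4];
d2 = mahal(gm,X); % Squared distances from mahal

k = size(gm.mu,1); % Number of mixture components
d2manual = zeros(size(X,1),k);
for j = 1:k
    delta = bsxfun(@minus,X,gm.mu(j,:)); % Center the data on the jth mean
    % delta/Sigma equals delta*inv(Sigma); the row sums give the quadratic form
    d2manual(:,j) = sum((delta / gm.Sigma(:,:,j)) .* delta,2);
end

d = sqrt(d2); % Unsquared distances, in units of standard deviations
```

For instance, the first observation [0 0] differs from the first mean by [-1 -2], so its squared distance to component 1 is 1/2 + 4/0.5 = 8.5.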
mahal returns the squared Mahalanobis distance d2 from an observation in X to a mixture component in gm.
Introduced in R2007b