Plotting data to get Mahalanobis and Euclidean distance

I took the example below from a book to calculate the Mahalanobis and Euclidean distances, but the book doesn't provide an illustration, so I am trying to plot data points from the Gaussian distributions of the two classes based on their mean and covariance values. I am not confident in the result I got: generating the data from this information is quite complicated, and the result confuses me. It says that data point x is assigned to the first class based on Euclidean distance and to the second class based on Mahalanobis distance.
Here is the example from the book:
% 1. Utilize function euclidean_classifier by typing
x=[0.1 0.5 0.1]';
m1=[0 0 0]';
m2=[0.5 0.5 0.5]';
m=[m1 m2];
z_euc=euclidean_classifier(m,x)
% 2. Use function mahalanobis_classifier
x=[0.1 0.5 0.1]';
m1=[0 0 0]';
m2=[0.5 0.5 0.5]';
m=[m1 m2];
S=[0.8 0.01 0.01;
0.01 0.2 0.01;
0.01 0.01 0.2];
z_mahal=mahalanobis_classifier(m,S,x)
Here is my attempt to illustrate the distribution of the points in 3D:
me =m1';
s = 1;
pdf1 = bsxfun(@plus,me,s.*randn(500,3));
plot3(pdf1(:,1),pdf1(:,2),pdf1(:,3),'.r')
M = mean(pdf1,1)
S = std(pdf1,1)
hold on
pdf2 = bsxfun(@plus,m2',s.*randn(500,3));
plot3(pdf2(:,1),pdf2(:,2),pdf2(:,3),'.b')
plot3(x(1),x(2),x(3),'og',MarkerFaceColor='g')

Answers (1)

Milan Bansal on 10 Sep 2024
Hi nirwana
Classification using Euclidean distance and Mahalanobis distance can indeed yield different results. Here's why:
Euclidean Distance:
  • Straight-line distance: Euclidean distance measures the straight-line distance between two points in a multi-dimensional space. It treats all dimensions equally, without considering the variability or correlation in the data.
  • Equal-spread assumption: It implicitly treats the data as equally spread in all directions, i.e., a spherical distribution around the mean (covariance proportional to the identity matrix).
Mahalanobis Distance:
  • Distance with covariance: Mahalanobis distance, on the other hand, accounts for the variability in each direction (captured by the covariance matrix). This means that the direction and shape of the distribution are considered when calculating the distance.
  • Elliptical distribution: Mahalanobis distance assumes that the data points form an ellipsoid rather than a sphere. The covariance matrix defines how "stretched" or "squeezed" the data is in different directions.
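To make the two definitions concrete, here is a minimal sketch that computes both distances by hand for the values in the question, using d_E = sqrt((x-m)'*(x-m)) and d_M = sqrt((x-m)'*inv(S)*(x-m)). It only assumes base MATLAB (R2016b or later for implicit expansion). It is not the book's euclidean_classifier or mahalanobis_classifier, whose source is not shown here; the variable names d_euc and d_mahal are my own.

```matlab
% Values copied from the question
x  = [0.1 0.5 0.1]';
m1 = [0 0 0]';
m2 = [0.5 0.5 0.5]';
m  = [m1 m2];                 % one class mean per column
S  = [0.8  0.01 0.01;
      0.01 0.2  0.01;
      0.01 0.01 0.2];         % covariance shared by both classes

diffs = x - m;                % 3x2 offsets from each mean (implicit expansion)

% Euclidean distance to each mean
d_euc = sqrt(sum(diffs.^2, 1));                 % approx. [0.52 0.57]

% Mahalanobis distance to each mean; S\diffs solves S*y = diffs
d_mahal = sqrt(sum(diffs .* (S \ diffs), 1));   % approx. [1.13 0.99]

% Assign x to the class whose mean is nearest under each metric
[~, z_euc]   = min(d_euc);    % 1
[~, z_mahal] = min(d_mahal);  % 2
```

With these numbers the nearest mean is m1 under Euclidean distance and m2 under Mahalanobis distance, which matches the result described in the question.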
Example:
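Imagine you have two classes, one with a wide spread (like a pancake) and the other with a narrow spread (like a ball). A point can be slightly closer to the ball's center in a straight line, so Euclidean distance assigns it to the narrow class. Mahalanobis distance measures each offset relative to the spread in that direction, so if the point lies along the pancake's wide direction it may be well within the pancake's normal range yet several standard deviations from the ball's center, and it is then assigned to the wide class. Your book example shows the same effect with a single shared covariance S: the variance along the first axis (0.8) is four times that of the other two (0.2), so the 0.4 offset from m2 along the first axis counts for little, while the 0.5 offset from m1 along the second axis counts for a lot. That is why x is nearer to m1 in Euclidean terms but nearer to m2 in Mahalanobis terms, so the result you got is expected.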
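For the illustration itself, note that the plot in the question draws both clouds with s = 1, i.e. unit variance in every direction, so it does not reflect S. Below is a hedged sketch of one way to draw samples that do follow S, using a Cholesky factor so that no toolbox is needed. The names R, X1, X2 and n, and the fixed seed, are my own choices.

```matlab
rng(0);                          % fixed seed so the figure is repeatable
x  = [0.1 0.5 0.1];              % row vectors here: one point per row
m1 = [0 0 0];
m2 = [0.5 0.5 0.5];
S  = [0.8  0.01 0.01;
      0.01 0.2  0.01;
      0.01 0.01 0.2];

R  = chol(S);                    % upper triangular factor, R'*R equals S
n  = 500;
X1 = randn(n,3)*R + m1;          % Gaussian samples, mean m1, covariance S
X2 = randn(n,3)*R + m2;          % Gaussian samples, mean m2, covariance S

figure
plot3(X1(:,1), X1(:,2), X1(:,3), '.r')
hold on
plot3(X2(:,1), X2(:,2), X2(:,3), '.b')
plot3(x(1), x(2), x(3), 'og', 'MarkerFaceColor', 'g')
hold off
axis equal                       % same scale on all axes so the stretch is visible
grid on
xlabel('x_1'); ylabel('x_2'); zlabel('x_3')
legend('class 1', 'class 2', 'x')
```

With axis equal, both clouds should look elongated along the first axis, which is the direction where offsets are cheap in Mahalanobis terms. If you have Statistics and Machine Learning Toolbox, mvnrnd(m1, S, n) is an alternative way to draw the samples.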
You can calculate the Euclidean and Mahalanobis distance using the pdist2 function in MATLAB (part of Statistics and Machine Learning Toolbox). Refer to the documentation for more information: https://www.mathworks.com/help/stats/pdist2.html#d126e834198
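As a rough sketch of how pdist2 could be applied to the values in the question: pdist2 expects one observation per row, so the point and the means are written as rows here. The covariance S is passed explicitly as the extra argument for the 'mahalanobis' metric, since the default would be estimated from the first input, which is a single point in this case.

```matlab
% Requires Statistics and Machine Learning Toolbox
x = [0.1 0.5 0.1];               % query point, one row
m = [0   0   0;
     0.5 0.5 0.5];               % class means, one per row
S = [0.8  0.01 0.01;
     0.01 0.2  0.01;
     0.01 0.01 0.2];

d_euc   = pdist2(x, m, 'euclidean');        % 1x2: distance to each mean
d_mahal = pdist2(x, m, 'mahalanobis', S);   % 1x2: uses S as the covariance

% Nearest-mean class under each metric
[~, z_euc]   = min(d_euc);
[~, z_mahal] = min(d_mahal);
```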
Hope this helps!
