exampleKS - Multiple sample test for data from same distribution

2014-09-22  Matlab2014  W.Whiten

Compare multiple samples using as the statistic the maximum difference in probability between their cumulative distributions. This extends the Kolmogorov-Smirnov test to the comparison of more than two distributions. The probability distribution of the statistic is generated by simulation.
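
As a rough illustration of such a statistic, the sketch below takes the largest vertical gap between any two empirical cumulative distributions. It is not the source of multiKS, which is not shown here and may well differ in detail (for instance by interpolating between data points). The variable names and the small data set are made up for the illustration.

```matlab
% Hypothetical sketch, not the multiKS source: largest vertical gap
% between any two empirical cumulative distributions (ECDFs).
x = [0.1 0.6; 0.2 0.7; 0.3 0.8; 0.4 0.9];  % 4 data values in each of 2 samples
[m, n] = size(x);
g = sort(x(:));                 % evaluate every ECDF at all pooled data values
F = zeros(numel(g), n);
for j = 1:n
    for k = 1:numel(g)
        F(k, j) = sum(x(:, j) <= g(k)) / m;   % ECDF of sample j at g(k)
    end
end
% the largest pairwise difference at each point is the spread max - min
gap = max(max(F, [], 2) - min(F, [], 2));
disp(gap)
```

The two columns do not overlap, so at 0.4 one distribution has reached 1 while the other is still 0, and the gap takes its largest possible value of 1.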

Expect a delay of about one second while the distribution of the statistic is generated.

Enter data, plot and calculate statistic

% set sizes
m=20;  % number of data values in each distribution
n=6;   % number of distributions being compared
r=10000;  % number of repeats for generating probability distribution

% generate random sample for demonstration - put your data here
x=rand(m,n);

% plot data
y=(0:m-1)'/(m-1); % distribution from 0 to 1
figure
plot(sort(x),y)
xlabel('Data values')
ylabel('Probability')
title('Cumulative distributions of data')

% and statistic for this sample
s=multiKS(x);
disp(' ')
disp(['Statistic for this sample  ',num2str(s)])

Statistic for this sample  0.42911

Generate distribution for this statistic and plot

% probability and sorted distribution samples for the statistic
disp(' ')
tic;[ps,d]=probKS(x);toc

disp(' ')
disp('Probability of statistic being that large ')
disp([' (upper tail) due to random variation only  ',num2str(ps)])

Elapsed time is 0.939580 seconds.

Probability of statistic being that large
(upper tail) due to random variation only  0.2411
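
One way such a probability could be obtained by simulation is a permutation scheme: shuffle the pooled data between the samples, recompute the statistic each time, and take the fraction of simulated values at least as large as the observed one. The sketch below assumes that scheme and a simple ECDF gap statistic. probKS is not shown here and may generate its distribution differently, and the names ecdfAll and gapOf are hypothetical.

```matlab
% Hypothetical permutation sketch; probKS may work differently.
rng(1);                          % fixed seed so the sketch is repeatable
x = rand(20, 6);                 % demonstration data: 20 values in each of 6 samples
r = 200;                         % far fewer repeats than the 10000 used in this example

% ECDF of every column evaluated at all pooled data values (numel(z)-by-n)
ecdfAll = @(z) reshape(mean(bsxfun(@le, reshape(z, [size(z,1) 1 size(z,2)]), ...
                                   sort(z(:))'), 1), [numel(z) size(z,2)]);
% largest vertical gap between any two of those ECDFs (assumed statistic)
gapOf = @(z) max(max(ecdfAll(z), [], 2) - min(ecdfAll(z), [], 2));

s = gapOf(x);                    % statistic for the data as observed
d = zeros(r, 1);
for k = 1:r
    z = reshape(x(randperm(numel(x))), size(x));  % shuffle values between samples
    d(k) = gapOf(z);             % statistic under "same distribution"
end
d = sort(d);                     % sorted simulated statistics
ps = mean(d >= s);               % upper tail: fraction at least as large as s
disp(ps)
```

A small ps would suggest that the samples differ by more than random variation alone; a value such as the 0.2411 reported above gives no evidence against a common distribution.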

Plot distribution and statistic value

y=(0:r-1)'/(r-1); % cumulative probability from 0 to 1; d must hold r values
figure
plot(d,y)
xlabel('Statistic value')
ylabel('Probability')
title('Position of data statistic on random cumulative distribution')
hold on
plot(s,1-ps,'r*')
legend('Distribution','Sample','Location','NorthWest') 