Unexpected interquartile range (IQR) result

Sim on 9 Dec 2023
Commented: Sim on 11 Dec 2023
For a number of distributions, I would like to compare and display the interquartile range (IQR) and the standard deviation (STD).
For the normal distribution I got more or less what I expected: about 68% of the data lie within 1 STD of the mean, and about 50% of the data lie within the IQR (i.e. the central half of the distribution). Here is my test:
clear all; clc;
samplesize = 100000;
% generate distribution
mu = 0;
sigma = 1;
data = normrnd(mu,repmat(sigma,samplesize,1));
% statistics
m = mean(data);
s = std(data);
data1sigma = data((data < (m+s)) & (data > (m-s)));
percentage_data_1sigma = length(data1sigma)/length(data)*100
percentage_data_1sigma = 68.1040
q = quantile(data,[0.25 0.5 0.75]);
dataIQR = data((data < (q(2)+q(1))) | (data > (q(2)-q(1))));
percentage_data_IQR = length(dataIQR)/length(data)*100
percentage_data_IQR = 50.1370
% plot
figure
hold on
h = histogram(data);
xline([m-s m m+s],'-k',{'-1 Standard Dev.','Mean','+1 Standard Dev.'},'linewidth',1)
xline([q(2)-q(1) q(2) q(2)+q(1)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
set(h,'FaceAlpha',0.2)
hold off
However, when I try the same with another distribution, such as a gamma distribution, the IQR no longer contains 50% of the data. What did I do wrong?
clear all; clc;
samplesize = 100000;
% generate distribution
a = 1;
b = 5;
data = gamrnd(a,repmat(b,samplesize,1));
% statistics
m = mean(data);
s = std(data);
data1sigma = data((data < (m+s)) & (data > (m-s)));
percentage_data_1sigma = length(data1sigma)/length(data)*100
percentage_data_1sigma = 86.5350
q = quantile(data,[0.25 0.5 0.75]);
dataIQR = data((data < (q(2)+q(1))) | (data > (q(2)-q(1))));
percentage_data_IQR = length(dataIQR)/length(data)*100
percentage_data_IQR = 100
% plot
figure
hold on
h = histogram(data);
xline([m-s m m+s],'-k',{'-1 Standard Dev.','Mean','+1 Standard Dev.'},'linewidth',1)
xline([q(2)-q(1) q(2) q(2)+q(1)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
set(h,'FaceAlpha',0.2)
hold off
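As a reference point for the 1-STD percentages above, the expected fractions can be computed in closed form. This is a minimal sketch, assuming the same parameters as in the two tests (standard normal, and gamma with a = 1, which is an exponential distribution whose mean and standard deviation both equal b). The variable names are illustrative only.

```matlab
% Normal: P(|X - mu| < sigma) = erf(1/sqrt(2)), independent of mu and sigma
expected_normal_1sigma = 100*erf(1/sqrt(2));

% Gamma with a = 1 is exponential with mean b and standard deviation b,
% so (m-s, m+s) is (0, 2b) and P(X < 2b) = 1 - exp(-2), independent of b
expected_gamma_1sigma = 100*(1 - exp(-2));

fprintf('normal: %.2f%%, gamma (a = 1): %.2f%%\n', ...
    expected_normal_1sigma, expected_gamma_1sigma);
% prints: normal: 68.27%, gamma (a = 1): 86.47%
```

Both simulated 1-STD percentages are therefore consistent with theory. The fraction inside the IQR, by contrast, should be about 50% for any continuous distribution.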

Answers (1)

Sim on 9 Dec 2023
Edited: Sim on 9 Dec 2023
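My mistake. The quartiles returned by quantile are already the bounds of the interval, so the data within the IQR must be selected between q(1) and q(3):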
dataIQR = data( data > q(1) & data < q(3) );
The vertical lines marking the quartiles must likewise be drawn at the quartile values themselves:
xline([q(1) q(2) q(3)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
Here is a corrected example:
% generate distribution
samplesize = 100000;
a = 1;
b = 8;
data = gamrnd(a,repmat(b,samplesize,1));
% statistics
m = mean(data);
s = std(data);
data1sigma = data((data < (m+s)) & (data > (m-s)));
percentage_data_1sigma = length(data1sigma)/length(data)*100
percentage_data_1sigma = 86.3970
q = quantile(data,[0.25 0.5 0.75]);
dataIQR = data( data > q(1) & data < q(3) );
percentage_data_IQR = length(dataIQR)/length(data)*100
percentage_data_IQR = 50
% plot
hold on
h = histogram(data);
xline([m-s m m+s],'-k',{'-1 Standard Dev.','Mean','+1 Standard Dev.'},'linewidth',1)
xline([q(1) q(2) q(3)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
set(h,'FaceAlpha',0.2)
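To see why the condition from the question looked right for the normal sample but gave 100% for the gamma sample, the two conditions can be compared side by side. The sketch below is only an illustration: the fixed seed, the use of randn for the normal sample, and the variable names are assumptions, not part of the original tests.

```matlab
rng(1);                                   % fixed seed, assumed for repeatability
n = 100000;
samples = {randn(n,1), gamrnd(1,5,n,1)};  % standard normal; gamma with a = 1, b = 5
names = {'normal', 'gamma'};
pctQuestion = zeros(1,2);
pctFixed = zeros(1,2);
for k = 1:numel(samples)
    data = samples{k};
    q = quantile(data,[0.25 0.5 0.75]);
    % condition from the question: union of two half-lines around the median
    maskQuestion = (data < (q(2)+q(1))) | (data > (q(2)-q(1)));
    % corrected condition: strictly between Q1 and Q3
    maskFixed = (data > q(1)) & (data < q(3));
    pctQuestion(k) = 100*mean(maskQuestion);
    pctFixed(k) = 100*mean(maskFixed);
    fprintf('%s: question = %.2f%%, corrected = %.2f%%\n', ...
        names{k}, pctQuestion(k), pctFixed(k));
end
```

For the normal sample the median is close to 0 and q(1) is negative, so the question's condition selects the two tails outside roughly [q(1), -q(1)], which happen to hold about 50% of the data. For the gamma sample q(2)-q(1) is smaller than q(2)+q(1), so the two half-lines overlap and every point satisfies the condition.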
  2 Comments
Steven Lord on 9 Dec 2023
You could check your results using the iqr function and/or the prctile function, each moved from Statistics and Machine Learning Toolbox to MATLAB in release R2022a.
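A minimal sketch of such a cross-check, assuming the gamma parameters of the corrected example and a fixed seed (both the seed and the variable names are illustrative):

```matlab
rng(0);                                  % fixed seed, assumed for repeatability
data = gamrnd(1, 8, 100000, 1);          % gamma sample with a = 1, b = 8
q = quantile(data, [0.25 0.75]);         % quartiles given as fractions
p = prctile(data, [25 75]);              % the same quartiles given as percentages
width_from_quantile = q(2) - q(1);       % IQR width from the quartiles
width_from_iqr = iqr(data);              % IQR width from the built-in function
inside = (data > p(1)) & (data < p(2));  % points strictly between Q1 and Q3
percentage_data_IQR = 100*mean(inside);
fprintf('quantile width = %.4f, iqr = %.4f, inside = %.2f%%\n', ...
    width_from_quantile, width_from_iqr, percentage_data_IQR);
```

The two widths should agree up to floating-point rounding, and the percentage of points inside should be very close to 50.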
Sim on 11 Dec 2023
Thanks a lot @Steven Lord for your nice comment and suggestion! :-) :-)