Can't tell if exprnd is working correctly

Umberto
Umberto on 1 Jul 2023
Edited: Umberto on 3 Jul 2023
I'm working on a project involving exponential/Erlang random variables, and I'm trying to validate a model via experiments using exprnd. I'm running into a problem where the results of my code depend on the number of samples I generate with exprnd at one time, no matter how long I run the simulation. I've explored every possibility I can think of, and in the end I started wondering whether something might be off with exprnd.
If my understanding is correct, the array exprnd(1/lambda,1,10000) and the array made of 100 copies of exprnd(1/lambda,1,100) should be statistically identical, but when I actually test this in code there seems to be a difference (though I can't tell for sure). I'm testing this by simply summing the two vectors. Here's the code I'm using; it plots the difference between the running averages of the two sums, which should converge to zero.
D1 = 0;               % cumulative sum, 100x100 scheme
D2 = 0;               % cumulative sum, 1x10000 scheme
count = 0;
lambda = 24;
while 1
    count = count + 1;
    % Scheme 1: 100 draws of 100 samples each
    d1 = 0;
    for i = 1:100
        d1 = d1 + sum(exprnd(1/lambda,1,100));
    end
    % Scheme 2: one draw of 10000 samples
    d2 = sum(exprnd(1/lambda,1,10000));
    D1 = D1 + d1;
    D2 = D2 + d2;
    if rand < .0001   % update the plot roughly once every 10000 iterations
        disp((D1-D2)/count)
        plot([0 count],[0 0],'r--'); hold on;
        scatter(count,(D1-D2)/count,'b');
        pause(.001);
    end
end
Here's a sample plot from the above code.
Right off the bat, it seems strange that the error is consistently negative; I would have expected it to be distributed evenly around zero. Also, I can't tell whether the error is converging to zero.
Any insight on this would be appreciated.
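As a sanity check on the equivalence claim itself, here is a minimal sketch (in Python/NumPy rather than MATLAB, purely for illustration; NumPy's `exponential` takes the mean 1/lambda as its scale argument, matching exprnd's first argument). It draws both schemes and compares their sample means and variances:

```python
import numpy as np

rng = np.random.default_rng(42)
lam = 24.0  # rate; both exprnd and NumPy are parameterized by the MEAN 1/lam

# Scheme A: one draw of 10000 samples
a = rng.exponential(scale=1/lam, size=10000)
# Scheme B: 100 draws of 100 samples each, concatenated
b = np.concatenate([rng.exponential(scale=1/lam, size=100)
                    for _ in range(100)])

# Both should estimate the same mean (1/lam ~ 0.0417) and variance (1/lam^2)
print(a.mean(), b.mean(), a.var(), b.var())
```

Both arrays are i.i.d. draws from the same distribution, so any difference between their summary statistics is pure sampling noise.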

Answers (2)

David Goodmanson
David Goodmanson on 1 Jul 2023
Edited: David Goodmanson on 2 Jul 2023
Hi Umberto,
I believe the issue is that you are plotting the cumulative variables D1 and D2. Once the difference D1-D2 ends up significantly on one side or the other of zero, it takes a while to switch back. On one run I had, D1-D2 stayed positive rather than negative for a long time, until I got tired of waiting for something to happen.
If you change to plotting noncumulative variables
scatter(count,d1-d2,'b');
you will see d1-d2 hopping around on both sides of the zero line.
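The "takes a while to switch back" behavior is standard random-walk scaling: D1-D2 accumulates i.i.d. zero-mean increments, so its typical excursion grows like sqrt(count) and sign changes become increasingly rare. A NumPy sketch of this (illustrative only, not the original MATLAB code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate many zero-mean random walks; D1-D2 behaves like one of
# these (up to scale).
n_walks, n_steps = 1000, 10000
walks = rng.standard_normal((n_walks, n_steps)).cumsum(axis=1)

# RMS excursion grows like sqrt(n): at step 10000 it should be about
# sqrt(10000/100) = 10x larger than at step 100.
rms_100 = np.sqrt(np.mean(walks[:, 99]**2))
rms_10000 = np.sqrt(np.mean(walks[:, 9999]**2))
ratio = rms_10000 / rms_100
```

This is why the cumulative plot can sit on one side of zero for a very long run without indicating any bias in the generator.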
  1 Comment
Umberto
Umberto on 2 Jul 2023
I think you might be right. Also, a t-test fails to reject the null hypothesis of zero mean at any reasonable significance level. I should keep looking for other errors in my code, then.
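A t-test along those lines can be sketched as follows (NumPy for illustration; note that summing 100 blocks of 100 draws from a stream is the same as summing one block of 10000 from that stream, so each per-iteration difference d1-d2 reduces to a difference of two 10000-sample sums):

```python
import numpy as np

rng = np.random.default_rng(1)
lam = 24.0
n_trials = 500

# Per-iteration difference d1 - d2: each side is a sum of 10000
# exponential samples.
d1 = rng.exponential(1/lam, size=(n_trials, 10000)).sum(axis=1)
d2 = rng.exponential(1/lam, size=(n_trials, 10000)).sum(axis=1)
diffs = d1 - d2

# One-sample t statistic against a mean of zero; under the null it is
# approximately standard normal for n_trials this large.
t = diffs.mean() / (diffs.std(ddof=1) / np.sqrt(n_trials))
```

A |t| well below ~2 is consistent with the two schemes having the same mean, matching the failure to reject.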



Paul
Paul on 2 Jul 2023
Hi Umberto,
I modified the code so that the figure of merit actually computed is the difference between the running sample means, by including a factor of 1e4 in the denominator.
I hypothesize that the effect you're seeing is just down to the draws. In the code below, for seed = 1, I'd guess that exprnd happened to draw a large outlier (or two, or three, or ...) on the d1 side, and that outlier (or outliers) biases D1 - D2 in favor of D1. That doesn't seem to happen with seed = 3, and it may go the other way with seed = 7, though not as much as with seed = 1.
lambda = 24;
maxcount = 10000;
figure
hold on
for seed = [1 3 7]
    D1 = 0;
    D2 = 0;
    count = 0;
    rng(seed)
    fom = zeros(maxcount,1);
    while count < maxcount
        count = count + 1;
        % Scheme 1: 100 draws of 100 samples each
        d1 = 0;
        for i = 1:100
            d1 = d1 + sum(exprnd(1/lambda,1,100));
        end
        % Scheme 2: one draw of 10000 samples
        d2 = sum(exprnd(1/lambda,1,10000));
        D1 = D1 + d1;
        D2 = D2 + d2;
        % Figure of merit: difference of the running sample means
        fom(count) = (D1 - D2)/(count*1e4);
        %{
        if rand < 1
            (D1-D2)/(count*1e4);
            plot([0 count],[0 0],'r--'); hold on;
            scatter(count,(D1-D2)/(count*1e4),'b');
            %pause(.001);
        end
        %}
    end
    plot(fom,'DisplayName',"seed = " + num2str(seed)), grid
end
legend
ylim([-1 1]*1e-4)
Also, those differences are all 2-3 orders of magnitude smaller than the expected value of the distribution:
1/lambda
ans = 0.0417
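That scale is what a back-of-envelope calculation predicts (a sketch, not from the thread): each exponential sample has standard deviation 1/lambda, so the difference of two independent running means, each over N = count*1e4 samples, has standard error sqrt(2)/(lambda*sqrt(N)):

```python
import numpy as np

lam = 24.0  # exponential rate; mean and standard deviation are both 1/lam

def se_diff(N):
    """Standard error of (mean1 - mean2) with N independent samples per side."""
    return np.sqrt(2) / lam / np.sqrt(N)

se_first = se_diff(1e4)  # after one iteration of 10000 samples: ~6e-4
se_last = se_diff(1e8)   # after 10000 iterations: ~6e-6
```

That puts the figure of merit well inside the ±1e-4 ylim after the early iterations, consistent with the plotted curves and tiny next to 1/lambda = 0.0417.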

Release

R2019a
