How to generate random numbers with constraints?
Hi, I want to generate 24 random numbers whose sum lies in the range [3550 3650]. In addition, each number must lie in the range [5 800].
Here is my attempt:
ub=800*ones(1,24);
lb=5*ones(1,24);
a=lb+rand(1,24).*(ub-lb);
while sum(a)>3650 || sum(a)<3550
a=lb+rand(1,24).*(ub-lb);
end
But it gets stuck in the loop forever. Please help me...
6 Comments
Dennis
on 29 Apr 2019
I think the odds are simply not in your favor: you are looking for 24 rather small numbers if you want their sum to not exceed 3650. If you let your loop run 'forever', you might eventually find a suitable vector.
You could instead manipulate your vector and specifically replace the high values; this will get you a result reasonably fast:
ub=800*ones(1,24);
lb=5*ones(1,24);
a=lb+rand(1,24).*(ub-lb);
c=0;
while sum(a)>3650 || sum(a)<3550
[~,idx]=max(a);
if sum(a)<3550 %in the unlikely case....
[~,idx]=min(a);
end
a(idx)=5+rand(1)*795;
c=c+1; %counts the number of iterations, not needed for anything
end
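The same repair idea can be sketched with an iteration cap, so the loop cannot run forever, and with a final check of both constraints. This is only a sketch: maxIter, sumLimits, and ok are names assumed here for illustration, and nothing in it claims that the resulting vector is uniformly distributed over the feasible region.

```matlab
% Sketch of the repair loop with an assumed iteration cap (maxIter)
rng(1);                          % fixed seed so the run is repeatable
lo = 5; hi = 800; n = 24;
sumLimits = [3550 3650];
maxIter = 1e5;                   % assumed safeguard, not in the original code
a = lo + rand(1,n)*(hi-lo);
iter = 0;
while (sum(a) < sumLimits(1) || sum(a) > sumLimits(2)) && iter < maxIter
    if sum(a) > sumLimits(2)
        [~,idx] = max(a);        % sum too large: redraw the largest entry
    else
        [~,idx] = min(a);        % sum too small: redraw the smallest entry
    end
    a(idx) = lo + rand*(hi-lo);  % the redrawn value stays within [lo, hi]
    iter = iter + 1;
end
% true only if both the element bounds and the sum bounds hold
ok = all(a >= lo & a <= hi) && sum(a) >= sumLimits(1) && sum(a) <= sumLimits(2);
```

If ok is false after the loop, the cap was reached before a suitable vector was found.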
alex brown
on 29 Apr 2019
John D'Errico
on 29 Apr 2019
Edited: John D'Errico
on 29 Apr 2019
Don't forget that in 24 dimensions, it will be difficult to determine if a set is truly uniform. Consider a very simple example: I'll sample a set of points in 2 dimensions, such that the sum does not exceed 1.
xy = rand(10000,2);
xy(sum(xy,2)>1,:) = [];
plot(xy(:,1),xy(:,2),'.')

That is clearly uniform, as you would expect. I just used rand, and then threw away anything that exceeds the sum constraint. Such a rejection scheme is perfectly correct. But now look at one of the marginal distributions.
hist(xy(:,2),100)

So, in fact, while the numbers are uniformly distributed, subject to the constraint that the sum does not exceed 1, any marginal of that set need not be uniform.
It is very easy to make a mistake in these things, to misinterpret what you see. The points are still distributed perfectly uniformly in TWO dimensions. But the marginal distributions are what they are.
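The shape of that marginal can also be checked numerically instead of by eye. For points uniform on the triangle x+y <= 1, the marginal density of y is 2*(1-y), which gives a mean of 1/3 and P(y < 0.5) = 0.75. A rough sketch (the sample size, the seed, and the names meanY and fracBelowHalf are arbitrary choices made here):

```matlab
rng(0);                          % fixed seed, chosen arbitrarily
xy = rand(100000,2);
xy(sum(xy,2) > 1,:) = [];        % rejection step: keep only points with x+y <= 1
y = xy(:,2);
meanY = mean(y);                 % theory for density 2*(1-y): 1/3
fracBelowHalf = mean(y < 0.5);   % theory: 2*0.5 - 0.5^2 = 0.75
fprintf('mean(y) = %.3f, P(y<0.5) = %.3f\n', meanY, fracBelowHalf);
```

A uniform marginal would instead give a mean of 0.5 and P(y < 0.5) = 0.5.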
alex brown
on 29 Apr 2019
John D'Errico
on 29 Apr 2019
The data that I generated IS uniform. It is uniform over the domain of the triangular region where the data lives. It is uniform over the set that satisfies the sum constraint.
The marginal distribution has a triangular PDF. It is NOT a gamma. There is no way that you can have data that is BOTH uniform over the constraint domain, as well as being uniform in any of the individual variables.
Are you looking for data that is truly uniform in each of the variables, yet has a sum that falls within some bounds? You can't have that.
Consider the example I gave. I chose two variables, x and y, uniform over the interval [0,1]. Now, look at the sum x+y. Can the variables be truly independently uniform over the domain, yet have a sum that is less than 1? So I'm not sure what you are asking to produce anymore. What is the final goal here?
alex brown
on 30 Apr 2019
Accepted Answer
More Answers (1)
The reason for your infinite loop is that you need an average random value of 150, which is far from the expected value of 402.5. That means you need quite an extreme situation to land on your required vector.
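To get a feel for how extreme that situation is, the sum of 24 independent uniform values on [5 800] can be approximated by a normal distribution. The sketch below uses only base MATLAB (erfc instead of normcdf); the variable names are chosen here for illustration, and the normal approximation is rough this far out in the tail, so read the result as an order of magnitude only.

```matlab
n = 24; lo = 5; hi = 800;
sumLimits = [3550 3650];
sumMean = n*(lo+hi)/2;               % expected sum: 24*402.5 = 9660
sumStd  = sqrt(n*(hi-lo)^2/12);      % std of the sum, roughly 1124
z = (sumLimits - sumMean)/sumStd;    % both limits are more than 5 std below the mean
Phi = @(t) 0.5*erfc(-t/sqrt(2));     % standard normal CDF via erfc
p = Phi(z(2)) - Phi(z(1));           % approximate chance that one draw passes
fprintf('z = [%.2f %.2f], p is approximately %.1e\n', z(1), z(2), p);
```

With a per-draw acceptance chance that small, the plain rejection loop effectively never terminates.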
You can use randfixedsum from the FEX. This function takes a more mathematical approach, which means that it doesn't need to reject vectors that don't satisfy the constraints.
%output matrix size
n=24;m=1;
%generate a random sum between [3550 3650]
s_bounds=[3550 3650];s=rand*diff(s_bounds)+s_bounds(1);
%bounds of values themselves
a=5;b=800;
%generate the values
x = randfixedsum(n,m,s,a,b);
%transpose to move from 24x1 to 1x24
x=x';
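Whatever generator is used, the result can be checked against both constraints afterwards. The helper isFeasible below is hypothetical (it is not part of randfixedsum), and xDemo is only a stand-in vector so that the snippet runs without the FEX download; substitute the x returned above.

```matlab
n = 24; a = 5; b = 800;
s_bounds = [3550 3650];
% hypothetical helper: true if every element and the total are within bounds
isFeasible = @(x) all(x >= a & x <= b) && ...
    sum(x) >= s_bounds(1) && sum(x) <= s_bounds(2);
xDemo = (3600/n)*ones(1,n);      % stand-in: 24 values of 150, sum 3600
xBad  = b*ones(1,n);             % all values at the upper bound, sum far too large
tfDemo = isFeasible(xDemo);      % expected to pass
tfBad  = isFeasible(xBad);       % expected to fail on the sum constraint
```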
6 Comments
John D'Errico
on 29 Apr 2019
This is the correct answer in the case where the sum is fixed. However, since the sum must live in a fixed range, it fails. But it is close.
Rik
on 29 Apr 2019
I thought I would have covered it by randomizing the fixed sum, but I suspect you'll address that point somewhere in your answer, so I'll wait patiently.
Bruno Luong
on 29 Apr 2019
+1. Perhaps this solution is good enough
John D'Errico
on 29 Apr 2019
Edited: John D'Errico
on 29 Apr 2019
Oops! You did address the random sum issue, but you did it uniformly over the desired limits. So this is not that bad a solution in fact. The effective difference between your solution and mine was that you generated a random sum uniformly over that interval, whereas I used a truncated normal, with normal parameters based on the sum. It's a subtle difference, and one that even I could live with easily enough. So +1 from me too.
Is your solution far off?
normpdf(sumLimits,sumMean,sumStd)
ans =
1.37044595803922e-10 2.21343945315968e-10
So, the odds of a sum at the top end are about 1.6 times as high as the odds of a sum at the bottom end of the interval. That is pretty close in context.
Rik
on 29 Apr 2019
Thank you for clarifying. Statistics aren't my strong suit, so questions like this reach my limit.
alex brown
on 29 Apr 2019