How to generate random numbers with constraints and different boundaries (for a high dimension)?

20 views (last 30 days)
I want to generate a set of random numbers with different boundaries and the sum of the random numbers generated is equal to a constant value.
For example, there are x1, x2, x3, ...., x40 and each variable has their own limit such as 50< x1 <100, 0 < x2 < 30, -100 < x3 < 50, ...., -30 < x40 < 30.
Also, x1+ x2 +x3 +...+ x4 = 5000.
I have tried using "randFixedLinearCombination" but it is impossible to generate such data for a high dimension vector (my problem actually has approximately 100 variables). Is there any function of algorithms to solve this problem?
Thank you

Accepted Answer

John D'Errico
John D'Errico on 8 Aug 2020
Neither of the choices on the FEX are directly a solution. randfixedsum fails because it explicitly assumes uniform bounds on the variables, and my own submission, randFixedLinearCombination, because the problem is simply too high of a dimension.
What are the issues here?
We can think of the constraint space as a rectangular box in n dimensions. But remember, N is pretty large here. At least it is in context of what you want to solve.
Suppose we were to transform the rectangular box to a unit hypercube in n-dimensions? That would seem to allow randfixedsum to work, but sadly, it does not. This is true because the transformation also transforms the sum, from a simple sum, into a general linear combination.
For example, consider the simple problem, where we want to generate sets of numbers in the box defined by constraints
0 <= x1 <= 1
-1 <= x2 <= 2
2 <= x3 <= 6
Where we want to see the sum constrained as
x1 + x2 + x3 == 4
The set of points that satisfies the problem can be represented as a polytope that lives in a plane.
In this case, the polytope is a quadrilateral. In some cases, the polytope could have been a triangle, or even a hexagon could have been possiible in some cases.
The request in general is to sample points randomly from such a set. In a high number of dimensions, the problem becomes much more complex of course, because the polytope may now be some massive polytope, embedded in an n-1 dimensional hyperplane. The approach I took in randFixedLinearCombination was to solve for that polytope, and then to sample randomly from it. The solution is theoretically correct, generating a uniform sample from the requested domain, although often practically impossible to achieve in a high number of dimensions because the polytope is itself sometimes too complicated to generate.
Can the problem be transformed into a unit hyper-cube? That is, suppose we transformed the problem to now have uniform bounds for ech variable? The problem is this transformation will now also implicitly transform the plane of interest. In the 3-d example above, we could set
y1 = x1
y2 = (x2+1)/3
y3 = (x3 - 2)/4
with the reverse transformation being
x1 = y1
x2 = 3*y2 - 1
x3 = 4*y3 + 2
However, now our constraint becomes
y1 + 3*y2 + 4*y3 == 3
This is now something that randfixedsum cannot handle, because the linear constraint is no longer a simple sum, even though each of y1,y2,y3 are assumed to lie in the unit cube.
The 40 or 100 dimensional problems mentioned by @Surat Asvapoositkul will have the same issues. One code could solve it, except that it is too large, while the other code cannot solve it because the bounds are not of uniformly the same widths.
Sorry about that.
  1 Comment
Surat Asvapoositkul
Surat Asvapoositkul on 8 Aug 2020
Thank you for your crystal clear explaination! The high dimensional vector makes it impossible to implement. I will try to find out of the number of variables in the dataset can be reduced or not.

Sign in to comment.

More Answers (2)

Walter Roberson
Walter Roberson on 7 Aug 2020
  3 Comments

Sign in to comment.


Jeff Miller
Jeff Miller on 9 Aug 2020
I don't know if this will be any use to you, but here is a way to generate "random" scores that meets some of your criteria (i.e., fixed total, all scores within bounds). The generated scores do not come close to having uniform distributions within their intervals, though, as the histograms at the end show. I don't think uniform distributions are possible with your other constraints...
% Bounded random numbers, fixed sum
% Any number of variables and any boundary values that you want
LowerBounds = [0, 10, 100, 250];
UpperBounds = [7, 33, 444, 750];
FixedSum = 900; % your choice
nVars = numel(LowerBounds);
Ranges = UpperBounds - LowerBounds;
TotalLowers = sum(LowerBounds);
TotalUppers = sum(UpperBounds);
nSims = 10000;
x = zeros(nVars,nSims);
for iSim=1:nSims
xRand = LowerBounds + rand(1,nVars) .* Ranges;
RandSum = sum(xRand);
if RandSum > FixedSum
TotalOvershoot = RandSum - FixedSum;
DecreaseAvailable = xRand - LowerBounds;
xRand = xRand - TotalOvershoot / sum(DecreaseAvailable) * DecreaseAvailable;
else
TotalUndershoot = FixedSum - RandSum;
IncreaseAvailable = UpperBounds - xRand;
xRand = xRand + TotalUndershoot / sum(IncreaseAvailable) * IncreaseAvailable;
end
%xRand is now one random vector with the fixed sum, and all scores are in their intervals.
%sum(xRand) % in case you want to check
x(:,iSim) = xRand;
end
figure;
for iVar=1:nVars
subplot(2,2,iVar)
histogram(x(iVar,:))
end

Categories

Find more on Thermodynamics and Heat Transfer in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!