How to initialize an array such that the values satisfies an equation

11 views (last 30 days)
Hi
Is there a way to initialize α = {α1, . . . , αL} randomly subject to constraint
sum(y(i)*α(i)) = 0
. for all values of i
y takes values of either +1 or -1
Thanks
  2 Comments
John D'Errico
John D'Errico on 14 Nov 2019
Edited: John D'Errico on 14 Nov 2019
How many values are there? That is, what is the number L? Are we talking about 3 or so? Or Is this a problem with hundreds of such random numbers that will sum to zero after the associated sign changes? Your answer will very much influence the algorithm one would need to solve it.
As well, what underlying distribution do you want to see for the values in a? Uniform in the interval [0,1]? Normally distributed? What would it be?
Finally, must the sum be exactly zero? Or is this something where you only need to have an expected value of zero?
Again, these are all things that will influence any solution.
Nadia A
Nadia A on 14 Nov 2019
Sorry for missing out on the details. To answer to your questions:
  1. what is the number L? - There are 300 samples, L = 300
  2. Is this a problem with hundreds of such random numbers that will sum to zero after the associated sign changes?- Yes, sum of the product of corresponding values of alpha and y are zero
  3. Uniform in the interval [0,1]? - Yes, the alpha values are uniformly distributed between 0 and 1
  4. must the sum be exactly zero? - The values of alpha must be chosen such that the constraint holds

Sign in to comment.

Accepted Answer

the cyclist
the cyclist on 14 Nov 2019
Edited: the cyclist on 14 Nov 2019
I think if I were going to try to solve that problem, I would take as a starting point Roger Stafford's function randfixedsum, from the File Exchange. That function will generate a list of m values that sum to a specified value (e.g. 0), and each of which lie in a specified range (e.g. [0,1]).
The following algorith would obey all the constraints you mention:
  • Count how many times y == -1. Call that n.
  • Count how many times y == +1. Call that p.
  • Use the randfixedsum function to generate n values in the range [0,1] that sum to an arbitrary value V.
  • Use the randfixedsum function to generate p values in the range [0,1] that also sum to V.
  • Assign the n values to the locations with y == -1
  • Assign the p values to the locations with y == +1.
Then I believe the weighted sum will be zero. However, the resulting values may not obey all the restrictions on randomness that you need. For example, if
L = 2; % Instead of 300
y = [-1 1];
and you choose
V = 0.5;
then my algorithm above is simply going to output
alpha = [-0.5 0.5];
every time. A subtler version of that will happen if L = 300;
If the result is not meeting your requirement, you will need to more deeply understand Roger's algorithm, and try to adapt it. Regardless, I think his code is a great starting point.

More Answers (2)

JESUS DAVID ARIZA ROYETH
JESUS DAVID ARIZA ROYETH on 14 Nov 2019
I leave you a possible solution:
L=300;
y= -2*(rand(1,L)>0.5)+1;%put your y here
alpha=rand(1,L-1);
alpha(end+1)=-sum(alpha.*y(1:L-1))/y(L);
disp(alpha)
disp(sum(alpha.*y))
  2 Comments
the cyclist
the cyclist on 14 Nov 2019
Edited: the cyclist on 14 Nov 2019
The last element of the alpha vector here is not guaranteed to be within the range (0,1), and I also expect that it has different statistical properties from the other elements of alpha.
Nadia A
Nadia A on 14 Nov 2019
Edited: Nadia A on 14 Nov 2019
Thank you for your answer. I am not sure whether the last alpha value will always be between 0 and 1.

Sign in to comment.


John D'Errico
John D'Errico on 15 Nov 2019
Edited: John D'Errico on 15 Nov 2019
Too bad, in the sense that for smaller L, say on the order of 8 or so, this is not that difficult to do exactly. (I have written code to solve exactly this problem, so I know that to be true.) For a higher number of dimensions, here 300, it gets nasty to do for uniformly distributed samples. Or, if the distribution of the numbers was intended to be normally distributed, it would be again pretty easy.
I might suggest one simple solution, that won't be truly correct in terms of the distribution of the numbers, but it won't be terrible. And it will ensure positively the result will be in the interval [0,1]. You don't even need to use randfixedsum.
L = 300;
% Just picking some arbitrary vector y.
y = (rand(1,L) > 0.5)*2 - 1;
% identify the positve and negative y elements
yp = y > 0;
yn = ~yp;
% generate a fully random set for a.
a = rand(1,L);
% compute the actual sum.
S = a*y';
% Is S is positve or negative?
if S > 0
% we now scale the elements of a down for just
% the positive elements of y, by just enough to
% make the sum exactly zero.
sumyp = sum(a(yp));
a(yp) = a(yp)*(1 - S/sumyp);
else
% S must be negative
% we now scale the elements of a down for just
% the negative elements of y, by just enough to
% make the sum exactly zero.
sumyn = sum(a(yn));
a(yn) = a(yn)*(1 + S/sumyn);
end
The necessary bias will generally not be huge. Did it work?
Well, yes. The requisite sum will be identically zero, except for floating point trash.
a*y'
ans =
4.4409e-16
min(a)
ans =
0.00023107
max(a)
ans =
0.98402
The vector a lies always in the interval [0,1]. No exceptions, ever. And it is quite efficient.
The only downside is the necessary (usually tiny) bias applied to roughly half of the elements will make the result not perfectly uniformly distributed. It is not perfect, but possibly acceptable.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!