How to generate a Gaussian random complex vector of N elements

Question

Sadiq Akbar on 29 May 2021

0
Link

Direct link to this question

https://se.mathworks.com/matlabcentral/answers/842690-how-to-generate-a-gaussian-random-complex-vector-of-n-elements

Edited: Paul on 2 Jun 2021

How can we generate a Gaussian random complex vector of N elements of zero mean and variance 1 in MATLAB ?

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Sign in to answer this question.

Answer 1

Chunru on 29 May 2021

2
Link

Direct link to this answer

https://se.mathworks.com/matlabcentral/answers/842690-how-to-generate-a-gaussian-random-complex-vector-of-n-elements#answer_711840

% Assume that real and imaginary parts are independent
% 1/sqrt(2) makes the variaonce to be 1
x = 1/sqrt(2)*(rand(N, 1) +1i*rand(N,1));

9 Comments
Show 7 older commentsHide 7 older comments

Walter Roberson on 29 May 2021

Edited: Walter Roberson on 29 May 2021

Remember that Gaussian functions have statistical means and variances. If you require that the mean and variance is exactly 0 and 1, then when you known N-1 of the values, you constrain the final value to be whatever is needed to make the mean and vaiance what you want: N-1 degrees of freedom rather than N degrees of freedom.

For example if you were to generate with N = 1, then if you require that mean for the generated sample by itself is exactly 0, then that could only happen if the generated value was exactly 0.... in which case the standard deviation would have to be 0, not 1, because you would have constrained to a constant (no deviation)

If you were to generate with N = 2, then if you require that mean for the generated samples together to be exactly 0, then that could only happen if the generated values were x and -x for some scalar x, so that their sum was 0. The variance requirement would further pin this down so that x = +/- 1/sqrt(2) and the other sample would be the negative of that. Mean would be 0, (x-mean)^2 would be (1/sqrt(2)-0)^2 which would be 1/2. That would apply for each of the two entries, so the sum between them would be 2/2 = 1. That would be the variance. So there would be effectively no actual choice, other than whether the positive or negative value was first.

Paul on 29 May 2021

I belleve that the 1/sqrt(2) scaling is correct, based on the definition of the variance of a complex random variable Z:

Var(Z) = Var(Re(Z)) + Var(Im(Z))

The scaling is not unique, i.e., if N1 and N2 are both real standard normal, then Z = a1*N1 + i*a2*N2 will have Var(Z) = 1 for any real a1 and a2 that satisfy a1^2 + a2^2 = 1, because Var(a1*N1) = a1^2*Var(N1) = a1^2, and similarly for Var(a2*N2).

Walter Roberson on 29 May 2021

I made a mistake in my testing earlier: I now agree with 1/sqrt(2) as the scaling.

Sign in to comment.

Answer 2

Walter Roberson on 29 May 2021

0
Link

Direct link to this answer

https://se.mathworks.com/matlabcentral/answers/842690-how-to-generate-a-gaussian-random-complex-vector-of-n-elements#answer_711890

Edited: Walter Roberson on 29 May 2021

There are two approaches to ensuring a sample mean of exactly 0, and a sample variance of exactly 1.

format long g
N = 5; % number of snapshots
x = 1/2*(randn(N, 1) +1i*randn(N,1))
x = 
           0.31826065419565 +     0.330095747891705i
           0.60894562845089 -     0.508294585280675i
           0.53808867995462 +     0.165203806256833i
          0.366851554172884 +     0.396291041754433i
          0.590254515096041 +     0.239182134838587i
xMean = mean(x)
xMean = 
          0.484480206374017 +     0.124495629092177i
xVariance = var(x)
xVariance = 
         0.150599228588032

First approach: modify the values as a group

x2 = x - xMean;
xMean2 = mean(x2)      %verify it is exactly 0, to within round-off error
xMean2 = 
       5.55111512312578e-17 -  1.11022302462516e-17i
x2 = x2 ./ sqrt(xVariance);
xVariance = var(x2)
xVariance = 
                         1
x2
x2 = 
         -0.428322347689403 +     0.529800041062169i
          0.320728344478462 -      1.63060354009506i
          0.138140832165457 +     0.104898742565564i
         -0.303111034809338 +     0.700375182805113i
          0.272564205854824 +     0.295529573662214i

The number of degrees of freedom of the original x is 10 -- N*2 independent random values.

The number of degrees of freedom of the modified group, x2, is only 8: although each of the values is still random, together as a group they are correlated in ways that reduces the freedom.

Second approach: take only initial entries and generate the remaining values as whatever is necessary to make the group fit. This is proving to be more difficult than I thought it would be.

15 Comments
Show 13 older commentsHide 13 older comments

Sadiq Akbar on 30 May 2021

Thank you very much for your kind response. Actually I have a known covariance matrix Rxx. I have also found its eigen-vector matrix E and eigen-value matrix D. The next task was to generate a Gaussain random complex vector of N elements of zero mean and variance 1 so that I can implement eq.(11). i.e.,

x(tl)=ED^1/2 z(tl)---------------(11)

in which the Gaussian random compex vector of N elements of zero mean and variance must be such that its covariance is identity matrix I. So when we multiply that Gaussian vector in eq.(11), we will get snapshots(i.e., samples) of vector x(t) at time=1. Then again we will find another Gaussian random vector of same N values and will multiply it in eq.(11) to find next samples of the vector x(t) at t=2 and so on. But I was stuck at how to generate the Gaussain random complex vector of N elements. My implementation is as below:

clear all; close all; clc
Rxx= [3.1000 + 0.0000i  -0.7556 + 0.9467i   2.0871 - 1.6529i   0.8300 + 1.9440i   0.2419 - 1.7573i
     -0.7556 - 0.9467i   3.1000 + 0.0000i  -0.7556 + 0.9467i   2.0871 - 1.6529i   0.8300 + 1.9440i
      2.0871 + 1.6529i  -0.7556 - 0.9467i   3.1000 + 0.0000i  -0.7556 + 0.9467i   2.0871 - 1.6529i
      0.8300 - 1.9440i   2.0871 + 1.6529i  -0.7556 - 0.9467i   3.1000 + 0.0000i  -0.7556 + 0.9467i
      0.2419 + 1.7573i   0.8300 - 1.9440i   2.0871 + 1.6529i  -0.7556 - 0.9467i   3.1000 + 0.0000i];
[E,D]=eig(Rxx) % eigen value decomposition
N=5;
z = 1/sqrt(2)*(rand(N, 1) +1i*rand(N,1));
zMean=mean(z); %For checking zero mean
zVariance=var(z); %For checking variance 1
Rzz= z*z'/length(z); % This is just for verification that Rzz comes as an Identity matrix or not
% If Rzz comes out to be an identity matrix, then uncomment below line to
% find sampples of x(t)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% xx=E*sqrt(D)*z;

Now I am stuck at the condtion that Gaussian random complex vector must be such that its covariance Rzz is an identity matrix because if it is so, then I can find my desired samples of x(t).

Walter Roberson on 30 May 2021

You have contradictory requirements.

When you sample a Gaussian random distribution for a finite number of samples, you will almost never get a mean of 0 and a variance of 1.

Consider this:

format long g
N = 1000000;
rolls = randi(6, 1, N);

Statistically the mean of a independent rolls of a fair M-sided dice is (M+1)/2, so (6+1)/2 = 3.5 exactly. Statistically. But let us look at those million rolls:

mean(rolls)
ans = 
                  3.500818

That is not 3.5 exactly. It approaches 3.5, but will not be exactly that. As you take rolls it will fluctuate around the mean:

cummean = cumsum(rolls)./(1:N);

plot(cummean(1:100)-3.5)

title('first 100')

plot(cummean-3.5)

title('overall')

How long to settle down?

lastidx = find(cummean < 3.4 | cummean > 3.6, 1, 'last')

lastidx =

689

plot(cummean(1:lastidx)-3.5)

title('until it settles to +/- 0.1')

It is interesting to note that by the time it settles to within +/- 0.1 in this particular run, it had not crossed between cumulative mean above and below 3.5 -- but by looking at the long term mean over 1 million rolls, we can see that it must have done so.

Likewise, when you take finite samples of two distributions, then their covariance matrices will not be the identity matrix: the covariance matrix approaches the identity matrix statistically. If you had done those million rolls and the mean was exactly 3.5 then you should suspect that the dice were biased.

Walter Roberson on 30 May 2021

I have here a penny. It is a 1987 USA penny. I will flip it 5 times once, and then another 5 times.

format long g
H = 0;
T = 1;
A = [H, H, H, H, T]
A = 1×5
     0     0     0     0     1
B = [T, T, H, H, T]
B = 1×5
     1     1     0     0     1

Now let us check the covariance:

cov(A, B)
ans = 2×2
                       0.2                       0.1
                       0.1                       0.3

Not the diagonal matrix. On the other hand, this is not mean 0 and variance 1, so:

A1 = A - mean(A);
B1 = B - mean(B);
A1 = A1 ./ std(A1);
B1 = B1 ./ std(B1);
[mean(A1), mean(B1)]
ans = 1×2
     0     0
[var(A1), var(B1)]
ans = 1×2
                         1                         1
cov(A1, B1)
ans = 2×2
                         1         0.408248290463863
         0.408248290463863                         1

... and still not the identity matrix.

What is the interpretation of this? Well, you might say that this proves that the first 5 flips were not independent of the second 5 flips

What about if we had more entries?

C = randi([0 1], 10000, 1);
D = randi([0 1], 10000, 1);
C1 = C - mean(C);
D1 = D - mean(D);
C1 = C1 ./ std(C1);
D1 = D1 ./ std(D1);
[mean(C1), mean(D1)]
ans = 1×2
     -5.21360732363973e-16     -7.74491581978509e-17
[var(C1),  var(D1)]
ans = 1×2
         0.999999999999984         0.999999999999966
cov(C1, D1)
ans = 2×2
                         1      -0.00840291247161445
      -0.00840291247161445          0.99999999999999

10000 entries each, and the covariance is still not the diagonal matrix. It approaches the diagonal matrix. But if we had only used the first 5 entries of each?

C15 = C(1:5) - mean(C(1:5));
D15 = D(1:5) - mean(D(1:5));
C15 = C15 ./ std(C15);
D15 = D15 ./ std(D15);
[mean(C15), mean(D15)]
ans = 1×2
                         0     -8.88178419700125e-17
[var(D15),  var(D15)]
ans = 1×2
                         1                         1
cov(C15, D15)
ans = 2×2
                         1         0.408248290463863
         0.408248290463863                         1

... even further from being an identity matrix.

Is it even possible with 5 flips?

bin = dec2bin(0:31, 5) - '0';
nbin = size(bin,1);
for K = 1 : nbin;
  bin(K,:) = (bin(K,:) - mean(bin(K,:)))./std(bin(K,:));
end
[R1, R2] = ndgrid(1:nbin);
all_cov = arrayfun(@(r1, r2) cov(bin(r1,:), bin(r2,:)), R1, R2, 'uniform', 0);
is_ident = cellfun(@(M) isequal(M, [1 0; 0 1]), all_cov);
any(is_ident(:))
ans = logical
   0

Nope, it is not possible. And this is not due to floating point round-off. The closest you can get in this situation is a covariance matrix of [1 1/6; 1/6 1]

It is of course nonsense to say that no matter which two sets of 5 coin flips you might have had, that covariance test "proves" that the two sets must be interdependent.

But it would not be nonsense to say that what the book is talking about is statistical indepedence of the samples, rather than talking about taking a short finite set of samples and insisting that the mean() is 0 and var() is 1 and cov() is the identity matrix.

Instead of measuring over a short finite set of samples, you should look at tests such as https://www.mathworks.com/help/stats/ttest2.html . And if the statistical tests cannot prove that the samples are from different distributions, then you should go ahead and assume that the true covariance matrix is the identity matrix.

Sadiq Akbar on 31 May 2021

Thank you very much for your further detaied explanation. And have I already said that you are rigth. But I don't know why this senior professor has said like that in his book? I am still confused not by you but by the professor.

Paul on 2 Jun 2021

Edited: Paul on 2 Jun 2021

@Sadiq Akbar

I think I understand what your professor is saying and where your confusion lies. For now, I'm going to assume that all the variables in question are real, not complex, which may make it easier to illustrate the concepts.

In this problem, the goal is to obtain L observations of a random variable vector, X, of length N, such that the L observations of X, denoted as x1 - xL (note the lower case x to distinguish observations from the random variable) are drawn from an N-dimensional normal distribution with a theoretical expected value of zeros(N,1) and theoretical covariance Rxx.

Let Z be a random variable vector of length N, with normal, N-dimensional probability density function with expected value Muz = zeros(N,1) and theoretical covariance Rzz = eye(N). Let X = E*sqrt(D)*Z, where E and D are the eigenvectors and diaganal matrix of eigenvalues of Rxx respectively. With this relationship between random variable vectors X and Z, X will have the desired properties: normal, Mux = zeros(N,1), and theoretical covariance Rxx.

So how do we obtain observations x1 - xL? The approach is to generate observations z1 - zL and the apply E and sqrt(D) to those observations. Then the observations xi will be as if they were obtained by sampling from a distribution of X.

As Walter showed, the observation of the so-called practical mean computed from the L observations z will not be zeros(n,1) and the observed practical covariance will not be eye(N). (As an aside, I've never heard that term "practical," but that's o.k.) Similarly, the observed value of the practical mean computed from observations x will not be zeros(n,1) and the observed value of the practical covariance Rhatxx will not equal Rxx. But as long as you're properly generating the observations z in the first place, the whole procedure is giving you what you want, and you should not modify any of the data to force the observed practical mean to be zero or the observed practical covariance to be Rxx. Doing so will corrupt the statistical properties of the data such that downstream calculations will be suspect.

Let's look at an example, with

N = 2;
Rxx = [1 .5;.5 2]; % theoretical covariance

Let's try the procedure using L = 10 observations of Z. Because the theoretical covarance of Z is Rzz = eye(N), we know that the elements of Z are indepdendent so we can generate Z with randn

L = 10;
rng(100);
z = randn(N,L); % L observations of an N-dimensional vector

Now generate the observations x of X

[E,D] = eig(Rxx);
x = E*sqrt(D)*z;

Now compute the observed practical mean of x, which is an estimate of the expected value of X, i.e., Mux

muhatx = mean(x,2)
muhatx = 2×1
    0.1170
    0.2706

As expected, the observed practical mean is not zeros(2,1). In fact, the practical mean itself is a random vector that has its own probability distribution.

Now compute the observed practical covariance Rhatxx, which is an estimate of Rxx. We can do this two ways: using Matlab's cov() function or using the equation in the notes you posted. Note that the outer product is scaled by L, the number of observations.

rhatxx1 = cov(x.') % transpose needed because cov wants the observations going down the rows
rhatxx1 = 2×2
    1.1967    1.6810
    1.6810    3.3193
rhatxx2 = x*x'/L  % scaled outer product
rhatxx2 = 2×2
    1.0907    1.5445
    1.5445    3.0606

Neither of these are equal (or even close) to Rxx; again the elements of Rhatxx are themselves random variables with their own distributions. But why aren't the observed covariances equal to each other? There are two reasons. cov() subtracts muhatx from the observations, and cov() scales by L-1. Check:

rhatxx3 = (x-muhatx)*(x-muhatx)'/(L-1) % equals rhatxx1
rhatxx3 = 2×2
    1.1967    1.6810
    1.6810    3.3193

I think this means that cov() isn't making any assumptions about the distribution of X, whereas the outer product is based on knowing that Mux = zeros(2,1). The point to take away is that the observations in x can be treated as if they were drawn from a distribution with theoretical mean Mux = zeros(2,1) and covariance Rxx, even if the observed mean muhatx and covariance rhatxx are different than the theoretical values. In fact, the probability that Rhatxx (Muhatx) will equal Rxx (Mux) is zero. To see this point more concretely, increase L to generate more observations. Doing so we expect muhatx and rhatxx to come closer to Mux = zeros(2,1) and Rxx, but the estimates will never be exact. But that's o.k. All you want to do is convince yourself that the observations x, however many there are, are properly viewed as being taken from the distribution of X, with parameters Mux and Rxx.

L = 10000;
z = randn(N,L); % L observations of an N-dimensional vector
x = E*sqrt(D)*z;
muhatx = mean(x,2)
muhatx = 2×1
   -0.0052
   -0.0074
rhatxx1 = cov(x.') % transpose needed because cov wants the observations going down the rows
rhatxx1 = 2×2
    1.0045    0.4929
    0.4929    1.9590
rhatxx2 = x*x'/L  % scaled outer product
rhatxx2 = 2×2
    1.0044    0.4929
    0.4929    1.9589
rhatxx3 = (x-muhatx)*(x-muhatx)'/(L-1) % equals Rhatxx1
rhatxx3 = 2×2
    1.0045    0.4929
    0.4929    1.9590
rhatxx3 - rhatxx1
ans = 2×2
	1.0e+-15 *

    0.2220   -0.1110
   -0.1110   -0.4441

Note that by increasing L we weren't guaranteed that rhatxx and muhatx would be better estimates as compared to L = 10, but that was very likely to be the case.

In summary: don't modify your observations in an attempt to enforce the theoretical mean and covariance on the observed values. Doing so will corrupt your statistical results if you run your model over and over, as in a Monte Carlo simulation.

Now, in your problem you're dealing with complex, N-dimensional random vectors. The only thing I'm not sure about is what the notes mean where Z is defined just above equation 12. I'm not sure there is enough information there to uniquely define the distribution of Z. Maybe that doesn't matter, i.e., any Z that satisfies eqn(12) will do. One possibility, as discussed in the other answer, is Z = Zr + 1i*Zi, where Zr and Zi are each N-dimensional, real, random vecotrs with N-dimensional, normal distribution with expected value mu = zeros(N,1) and covariance Rzz = eye(N)/2.

Sign in to comment.

How to generate a Gaussian random complex vector of N elements

0 Comments
Show -2 older commentsHide -2 older comments

Answers (2)

9 Comments
Show 7 older commentsHide 7 older comments

15 Comments
Show 13 older commentsHide 13 older comments

See Also

Categories

Tags

Community Treasure Hunt

How to generate a Gaussian random complex vector of N elements

0 Comments Show -2 older commentsHide -2 older comments

Answers (2)

9 Comments Show 7 older commentsHide 7 older comments

15 Comments Show 13 older commentsHide 13 older comments

See Also

Categories

Tags

Community Treasure Hunt

0 Comments
Show -2 older commentsHide -2 older comments

9 Comments
Show 7 older commentsHide 7 older comments

15 Comments
Show 13 older commentsHide 13 older comments