How to eliminate equal matrices that are created randomly inside loop?

1 view (last 30 days)
The code segment I'm working on is given below:
NphaseSteps = 6;
phases = exp( 2*pi*1i * (0:(NphaseSteps-1))/NphaseSteps );
i = 1;
while i <= 10 %number of iterations
ind = randi([1 NphaseSteps],10,10);
inField{i} = phases(ind);
save('inField.mat', 'inField')
i = i + 1;
end
Now, what I want is to keep track of these randomly created matrices "inField{i}" and eliminate the ones that are equal to each other. I know that I can use "if" condition but since I'm new to programming I don't know how to use it more efficiently so that it doesn't take too much time. So, I need your help for a fast working program that does the job. Thanks in advance.

Accepted Answer

Walter Roberson
Walter Roberson on 22 Sep 2018
A = phases;
n = 10; m = 10; %each output will be 10 x 10
rows_needed = 10; %you need 10 of them
cols_needed = n*m;
permidx = [];
more_needed = rows_needed;
while more_needed > 0
trial_idx = randi([1 NphaseSteps], more_needed, cols_needed);
permidx = unique([permidx; trial_idx], 'rows', 'stable');
more_needed = rows_needed - size(permidx,1);
end
randperms = A(permidx);
inField = cellfun(@(M) reshape(M, m, n), num2cell(randperms, 2), 'uniform', 0);
  6 Comments
Bruno Luong
Bruno Luong on 23 Sep 2018
Edited: Bruno Luong on 23 Sep 2018
They probably use Fisher–Yates or Knuth shuffle algorithm for randperm. TMW might realizes the drawback of the old one-parameter randperm (based on SORT) when there is a lot of discussion about lacking of such efficient algorithm in the case k<<n. Jan have a nice FEX implementation of such shuffle.
Your algo is certainly not sub-exponential in all case since it stucks forever for some circumstances (already given above).
Walter Roberson
Walter Roberson on 23 Sep 2018
The code I gave in this original Answer was never intended to have to deal with the case where the number of rows being requested exceeded the number of unique rows possible. It would not be difficult to put in
assert(rows_needed <= NPhaseSteps^(m*n), 'more rows requested than uniquely exist')
It would probably start to get seriously inefficient at roughly 90% of NPhaseSteps^(m*n)
Of course, 90% of 6^(10*10) would far exceed the amount of memory anyone is likely to have.
For NPhaseSteps up to 255 and modifying to
trial_idx = randi([1 NphaseSteps], more_needed, cols_needed, 'uint8');
then the permidx matrix would not be the major memory roadblock: the major memory roadblock would be A(permidx) as that would be 8 times as large since A is double precision. The num2cell of that would take even more memory as cell is not especially efficient.
If we postulate that the user has 64 gigabytes of memory, and making the simplification of assuming that the num2cell version only takes the same amount of memory as the A(permidx) array, then as there would temporarily be two arrays of maximum size, each array could be no more than 32 gigabytes. The permidx matrix would be 1/8th as large, so 4 gigabytes. Each row would be 100 bytes... so you could request on the order of 2^25 rows before running out of memory in that circumstance.
There are algorithms that would generate directly into cells rather than generating into memory and needing to duplicate that into a cell, but I worry about the cost of the uniquifying step. Hmmmm, maybe if you used a hash table of NPhaseSteps^(m*n) bits... That would only require about 1/8th of the total baryons in the universe for the hash table.

Sign in to comment.

More Answers (0)

Products


Release

R2018a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!