intersection of multiple arrays

93 views (last 30 days)
Ananya Malik
Ananya Malik on 14 Sep 2017
Commented: Ananya Malik on 14 Sep 2017
I have multiple 1D arrays with integers. Eg. A=[1,2,3,4] B=[1,4,5,6,7] C=[1,5,8,9,10,11]. Here 1 is repeated in all the arrays and 4 in A&B, 5 in B&C. I want these repeated values to be assigned to a single array, based on the size of the array, and removed from other arrays. How can I achieve this. Please help. TIA.
  5 Comments
KL
KL on 14 Sep 2017
So you only wanto to remove 1 from all the matrices in the above example?
Ananya Malik
Ananya Malik on 14 Sep 2017
Edited: Ananya Malik on 14 Sep 2017
No. The common elements are [1,4,5]. Upon removing them from all arrays, we get A=[2,3] B=[6,7] and C=[8,9,10,11]. 1 can be present in either A or B. Suppose we put 1 in A ==> A=[1,2,3] B=[6,7] C=[8,9,10,11]. 4 can be present in A or B. Size(B)<size(A), therefore A=[1 2 3] B=[4,6 7] c=[8,9,10,11]. Finally 5 can be present in B or C, but size(B)<size(C). Therefore the final output should be A=[1,2,3] B=[4,5,6,7] and C=[8,9,10,11].

Sign in to comment.

Accepted Answer

Guillaume
Guillaume on 14 Sep 2017
This is in no way guaranteed to give you a perfectly balanced output. After that you need to look at optimisation algorithm which is beyond my field. This is also a lot less efficient than my other solution:
C = {[1 2 3 4 5 6], [1 4 5 6 7], [1 5 8 9 12 20]}
[uvals, ~, bin] = unique(cell2mat(C));
origarray = repelem(1:numel(C), cellfun(@numel, C));
dist = table(uvals', accumarray(bin, 1), accumarray(bin, origarray, [], @(x) {x}), 'VariableNames', {'value', 'repetition', 'arrayindices'});
dist = sortrows(dist, 'repetition');
The above builds a table of all the unique values, how many times they're repeated and where they come from originally. You can then iterate through that to distribute the values:
%distribute non-repeated values first
newC = accumarray(cell2mat(dist.arrayindices(dist.repetition == 1)), ...
dist.value(dist.repetition == 1), [], @(x){x'});
%and remove them from table
dist(dist.repetition == 1, :) = [];
%distribute remaining values one at a time
for row = 1:height(dist)
destarrays = dist.arrayindices{row}; %which array originally had the current value?
[~, destidx] = min(cellfun(@numel, newC(destarrays))); %find which is currently smallest
newC{destarrays(destidx)} = [newC{destarrays(destidx)}, dist.value(row)]; %append value to that smallest array
end
celldisp(newC)
I'm using a table here, which is not the most efficient container in matlab, but that make it easier to understand the code.
  1 Comment
Ananya Malik
Ananya Malik on 14 Sep 2017
Thank you so much. Not exactly what I wanted, but pretty close. Will tweak it further for my use. Cheers

Sign in to comment.

More Answers (2)

KL
KL on 14 Sep 2017
A=[1,2,3,4];
B=[1,4,5,6,7];
C=[1,5,8,9,10,11];
M = {A,B,C};
N = {[B C], [A C], [A B]};
CommonElements = unique(cell2mat(cellfun(@intersect, M,N,'UniformOutput',false)));
NewM = cellfun(@(x) removeEl(x,CommonElements),M,'UniformOutput',false);
%and then
function x = removeEl(x,a)
for el=1:length(a)
x(x==a(el))=[];
end
end

Guillaume
Guillaume on 14 Sep 2017
This is how I'd do it:
C = {[1 2 3 4], [1 4 5 6 7], [1 5 8 9 10 11]};
%sort cell array by increasing vector size:
[~, order] = sort(cellfun(@numel, C));
C = C(order);
%get all unique values
uvals = unique(cell2mat(C));
%go through all arrays, keeping all values in uvals, then removing them from uvals so they can't be used by the next array
for iarr = 1:numel(C)
C{iarr} = intersect(C{iarr}, uvals); %only keep the values in uvals
uvals = setdiff(uvals, C{iarr}); %and remove the one we've just used
end
celldisp(C)
  1 Comment
Ananya Malik
Ananya Malik on 14 Sep 2017
Edited: Ananya Malik on 14 Sep 2017
I am trying to balance the number of elements in the arrays. So if I change my input to A=[1 2 3 4 5 6], B=[1 4 5 6 7] and C=[1 5 8 9 10 11]. Using the code you provided gives me A=[1 4 5 6 7] B=[2 3] and C=[8 9 10 11]. Whereas ideal case would be A=[2 3 4 6] B=[1 7 5] and C=[8 9 10 11] or A=[1 2 3 4] B=[5 6 7] C=[8 9 10 11] or something along this line.

Sign in to comment.

Categories

Find more on Matrices and Arrays in Help Center and File Exchange

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!