# How to find duplicate values and what are they duplicates of?

Sim on 4 Oct 2012
Suppose I have an array
X = [2;5;1;2;2;0;0]
With unique I get the unique values and the indices of the duplicates but I also want the indices of what they are duplicates of.
[r, ir, ix] = unique(X);
r = [0;1;2;5]
ir =[7;3;5;2]
ix = [3;4;2;3;3;1;1]
But here I get the indices of the last occurrence. What I am looking for is:
r = [0;1;2;5]
rIdx = [6,7 ; 3; 1,4,5; 2]
So returned value would have the unique values but also have the indices of where those values appear.
Is there any solution to this?

Matt Fig on 4 Oct 2012
Edited: Matt Fig on 4 Oct 2012
There are many ways to do it. Here is another, just for fun:
X = [2;5;1;2;2;0;0];
Y = arrayfun(@(x) {x find(X==x)},unique(X),'Un',0);
Now Y{1}{1} is the first unique element and Y{1}{2} holds the indices of its locations. Y{2}{1} is the next unique element and Y{2}{2} holds the indices of its locations, etc. You could also leave out the unique elements and just go with:
Y = arrayfun(@(x) find(X==x),unique(X),'Un',0);
Or, if you prefer to include the unique elements but not have them in a separate cell array, this makes the first element of each cell the unique element of X:
Y = arrayfun(@(x) [x;find(X==x)],unique(X),'Un',0);
Also, my use of ARRAYFUN here may be taken as shorthand for the FOR loops that are probably faster....
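For reference, the FOR loop that the ARRAYFUN call stands in for might look like the sketch below. The variable names r and rIdx are taken from the question; whether the loop is actually faster has not been timed here and will depend on the MATLAB version and the size of X.

```matlab
X = [2;5;1;2;2;0;0];
r = unique(X);                  % sorted unique values: [0;1;2;5]
rIdx = cell(numel(r), 1);       % preallocate one cell per unique value
for k = 1:numel(r)
    rIdx{k} = find(X == r(k));  % every position where the k-th unique value occurs
end
% rIdx{1} is [6;7], rIdx{2} is 3, rIdx{3} is [1;4;5], rIdx{4} is 2
```

This gives the same cell contents as the second ARRAYFUN call above, with r kept as a separate vector.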
You might also be interested in this question.
Matt Fig on 4 Oct 2012
Sim,
I would expect array1 and array2 to give different results if Values and Dates are different! This has nothing to do with the reliability of ARRAYFUN. The function does exactly what you tell it to do, which is what we mean when we call a function reliable.
Like I said, a FOR loop is probably faster, but here is one way to do it:
X = [2;5;1;2;2;0;0];
Y = arrayfun(@(x) [x,find(X==x).'],unique(X),'Un',0);
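maxLen = max(cellfun('length',Y));                             % length of the longest row
padded = cellfun(@(x) [x,nan(1,maxLen-length(x))],Y,'Un',0);   % pad shorter rows with NaN
m = cell2mat(padded)                                           % stack the rows into one matrix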
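A loop-based sketch of the same NaN-padded result, for comparison. The names counts and M are illustrative and not from the original post, and the use of ACCUMARRAY to count occurrences is one choice among several.

```matlab
X = [2;5;1;2;2;0;0];
[r, ~, ix] = unique(X);              % ix maps each element of X to its row in r
counts = accumarray(ix(:), 1);       % how many times each unique value occurs
M = nan(numel(r), 1 + max(counts));  % column 1: value; remaining columns: indices
M(:,1) = r;
for k = 1:numel(r)
    idx = find(X == r(k)).';         % row vector of positions of the k-th value
    M(k, 2:1+numel(idx)) = idx;      % unused trailing entries stay NaN
end
% M is [0 6 7 NaN; 1 3 NaN NaN; 2 1 4 5; 5 2 NaN NaN]
```

Each row starts with a unique value followed by the positions where it occurs in X, which matches the matrix m built above.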