A bonus question would be how to extend the solution to 3D, i.e. if the matrix A is 3D, so each slice in the 3rd dimension has only unique values (per row) and fillers. This is more complex also because the resulting matrix size per slice of 3rd dimension is different, so more fillers need to be added on each 3D slice to that they all have the same dimension as the highest dimension in 3D, but I accept the answer to my original question alone and for this I could use a for loop and adding fillers to always fit the largest dimension so far before appending the 3D slice
Unique elements in matrix efficiently
18 views (last 30 days)
Show older comments
Given a matrix A, what is an efficient way to obtain a matrix B that consists of only the unique elements in A (where by unique, I mean row-wise, therefore the 'ismember' function is not suitable).
For example:
A = [1 2 3 1; 4 1 6 7; 8 9 8 11]
A =
1 2 3 1
4 1 6 7
8 9 8 11
Since each row contains different number of unique elements than other rows, the matrix of unique elements would have rows each of possibly different dimensions, which is not possible. Therefore, I can fill the repeated/non-unique elements with some dumb filler values (since I am working with only positives I can replace them with a negative number (eg -1), or if it had negative numbers too, they could be replaced by NaN values.
The result therefore can be:
B =
1 2 3 -1
4 1 6 7
8 9 -1 11
(where the negative -1s could be replaced by NaNs alternatively).
Notice that although A(2, 2) has a value (1) that already exists in the previous column (A(1, 1)), it is still unique in its own row, therefore the 'ismember' function cannot be applied.
I have created a solution, but I can imagine there are more elegant and more efficient solutions using vectorization and avoiding for loops, for when matrix A is very large which happens to be the case:
B = A(:, 1);
for i = 2:size(A, 2)
NEWMEMBERS = !sum(bsxfun(@eq, A(:, i), B), 2);
NEWCOL = NEWMEMBERS .* A(:, i);
FILLER = -1 * ~NEWMEMBERS;
NEWCOL = NEWCOL + FILLER;
B = [B NEWCOL];
end
(FILLER can be more generally replaced by a vector of 0 and NaNs instead of 0s and -1s)
3 Comments
John D'Errico
on 26 Feb 2016
Edited: John D'Errico
on 26 Feb 2016
Note that this question does not actually contain validly executable MATLAB code, having constructs like +=, and ! in the code.
Answers (3)
Titus Edelhofer
on 26 Feb 2016
Hi,
a rather simple version would be this:
B = -ones(size(A));
for row=1:size(A, 1)
val = unique(A(row,:));
[~,idx] = ismember(val, A(row,:));
B(row,idx) = val;
end
I haven't tried though what happens if A is large ...
Titus
0 Comments
Stephen23
on 26 Feb 2016
Edited: Stephen23
on 26 Feb 2016
Without any loops (and can be easily adapted to use a tolerance):
A = [1 2 3 1; 4 1 6 7; 8 9 8 11]
S = size(A);
[B,C] = sort(A,2);
D = [false(S(1),1),diff(B,1,2)==0];
R = (1:S(1))'*ones(1,S(2));
X = sub2ind(S,R(D),C(D));
A(X) = NaN
displays this in the command window:
A =
1 2 3 NaN
4 1 6 7
8 9 NaN 11
George Aipal
on 26 Feb 2016
5 Comments
Titus Edelhofer
on 26 Feb 2016
Hi George,
the filler comes before by setting
B = nan(size(A));
or
B = -ones(size(B));
or whatever.
My advice to customers is usually: write the code in a way that is simple to write and simple to read. When it's progressed and you identify bottlenecks, then start investigating by tic/toc or profiler. Don't get me wrong, a good deal of my work is teaching vectorization (one of my favorite underused functions is bsxfun). But writing unreadable vectorized code without need I try to avoid. And if I do, I add as comment the simpler/loop version so that someone else (or myself) understand what's happening.
Titus
See Also
Categories
Find more on Creating and Concatenating Matrices in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!