Find the number of unique rows and their corresponding indices in a matrix

Hi there, I need to list the unique rows of a matrix and their corresponding indices without using the unique function. It has been suggested that I use unique, but it seems rather time-consuming and I have to run many iterations. Any help will be appreciated. Cheers, Vishal
Data:
v1=[1 2 3 4 5; 5 3 3 2 1; 1 2 4 9 7; 1 2 3 4 5; 5 3 3 2 1]
[a,b,c]=unique(v1,'rows','stable')
The required output which I get from the unique function is:
a =
1 2 3 4 5
5 3 3 2 1
1 2 4 9 7
b =
1
2
3
c =
1
2
3
1
2

7 Comments

Well, the fact that this is what unique is for is a pretty good reason to have had it suggested... :)
My only initial thought of anything better would be to do the search/indexing at the time the array is populated instead of afterwards. If it comes from an external source and that process can't be modified this doesn't help, of course.
I cannot imagine that unique is "rather time consuming" for such small problems. If you have to solve millions of such small problems, or the real data contain millions of rows, it is essential to say so explicitly, because we cannot find a solution to a problem that has not been mentioned.
I have a 121x5 matrix over 12100000 iterations, which needs to be run 10 times, and I have 5 such simulations; 12100000 iterations take approx. 3.5 hours to run. The profile viewer indicates that 80% of the time is consumed by the unique function. This is after I removed the circshift and intersect functions, which were also time-consuming.
Do all elements of the 121x5 matrix have small integer values?
Yes, just integer values from 1 to 10. Am I doing something wrong for this function to take so long?
Well, it's a big job from scratch every time.
Is this simulation the selection of a subset each time, which I seem to recall something about in an earlier posting? If so, I'd suggest the solution may be to do it once for the complete set and then use that and select from it for each randomized sample as well as from the data. That is, if the data themselves aren't changing, only the sample space.
How long does it take? I don't think UNIQUE will take a lot of time. My test runs the UNIQUE line 20 times with your v1, and the running time is always less than 0.005 seconds.
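To put the numbers quoted in these comments side by side, here is a rough timing sketch. It assumes one unique call per iteration, which the thread does not actually state, and it times the 5x5 example v1 rather than the real 121x5 matrix, so the measured figure is only indicative. The names nCalls, tPerCall and budget are made up for this illustration.

```matlab
% Rough timing sketch; assumes one unique() call per simulation iteration.
v1 = [1 2 3 4 5; 5 3 3 2 1; 1 2 4 9 7; 1 2 3 4 5; 5 3 3 2 1];

nCalls = 1e4;                          % enough repetitions to average out noise
tic
for k = 1:nCalls
    [a, b, c] = unique(v1, 'rows', 'stable');
end
tPerCall = toc / nCalls;               % mean seconds per call on this machine
fprintf('unique: %.3g ms per call\n', 1e3 * tPerCall);

% Per-call time implied by the figures in the thread:
% 3.5 hours for 12100000 iterations, 80% of it spent in unique.
budget = 0.8 * 3.5 * 3600 / 12.1e6;    % seconds per call
fprintf('implied by the thread: %.3g ms per call\n', 1e3 * budget);
```

Under that assumption the thread's figures work out to a bit under a millisecond per call, so the total is dominated by the sheer number of calls rather than by any single slow one.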

 Accepted Answer

Perhaps this is more efficient:
v1 = [1 2 3 4 5; 5 3 3 2 1; 1 2 4 9 7; 1 2 3 4 5; 5 3 3 2 1];
vX = v1 * [1, 11, 121, 1331, 14641];
[dummy, b, c] = unique(vX, 'stable');
a = v1(b, :);
unique() has considerable overhead, so you could create a local copy of it and remove the input checks you do not need.

1 Comment

Cheers mate. Shaved off 5% of the computing time. Small typo in line 2; the weights need to be a column vector:
v1 = [1 2 3 4 5; 5 3 3 2 1; 1 2 4 9 7; 1 2 3 4 5; 5 3 3 2 1];
vX = v1 * [1; 11; 121; 1331; 14641];
[dummy, b, c] = unique(vX, 'stable');
a = v1(b, :);
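The remark about the overhead of unique() could be taken one step further by skipping unique altogether, as originally asked. The following is only a sketch, not the answerer's method. It assumes every entry is an integer from 1 to 10 (as stated in the comments), so each row can be read as a 5-digit base-11 number and two rows get the same key only if they are identical. It also relies on sort keeping equal elements in their original order. The variable names key, order, isFirst, grp, firstIdx, pos and newLabel are made up for this illustration.

```matlab
% Sketch: stable unique rows without unique(); assumes integer entries in 1..10.
v1 = [1 2 3 4 5; 5 3 3 2 1; 1 2 4 9 7; 1 2 3 4 5; 5 3 3 2 1];

key = v1 * [1; 11; 121; 1331; 14641];  % one scalar key per row (base-11 digits)

[ks, order] = sort(key);               % equal keys become adjacent, ties keep row order
isFirst = [true; diff(ks) ~= 0];       % marks the start of each run of equal keys
grp = cumsum(isFirst);                 % group number of each sorted element
firstIdx = order(isFirst);             % row of the first occurrence of each key

[b, pos] = sort(firstIdx);             % order the groups by first appearance ('stable')
newLabel = zeros(numel(b), 1);
newLabel(pos) = 1:numel(pos);          % sorted-group number -> stable-group number

c = zeros(size(key));
c(order) = newLabel(grp);              % label of every original row
a = v1(b, :);                          % the unique rows themselves
```

For the example v1 this gives the same a, b and c as unique(v1,'rows','stable'). Whether it is actually faster than the accepted answer on the 121x5 matrices would have to be checked with the profiler.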

More Answers (0)

Asked:

on 27 Aug 2013
