How can I index elements by their sorted positions?

I have a question regarding indexing vector elements according to their position in a sorted list. The usual query I have seen is more like this, where the secondary output of sort() is sufficient - but not for my purpose.
The problem (using R2019a in Windows 10):
I have a vector (which could be arbitrarily long, with an arbitrary number of sets of duplicate values, each of which could have an arbitrary number of elements).
A = [20 23 1 19 20 8 5 14 9 7];
and can happily sort it:
>>B = sort(A)
B =
[1 5 7 8 9 14 19 20 20 23]
I want to create another vector C whose ith element describes the index position within B of the ith element of A. Duplicates should be handled by their original appearance order.
i.e., the desired C = [8 10 1 7 9 4 2 6 5 3]
C therefore has the properties: isequal(sort(C), 1:numel(A)), and numel(unique(C)) == numel(C).
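The two properties stated above can be turned into a quick check on any candidate C. This is only a sketch: the names isPermutation and allDistinct are made up for illustration, and the hard-coded C is the desired result quoted in the question.

```matlab
A = [20 23 1 19 20 8 5 14 9 7];
C = [8 10 1 7 9 4 2 6 5 3];   % candidate result to check (desired C from the question)

% C must contain every position 1..numel(A) exactly once
isPermutation = isequal(sort(C), 1:numel(A));

% no two elements of A may share a position, even when their values are equal
allDistinct = numel(unique(C)) == numel(C);

disp([isPermutation, allDistinct])
```

Both flags should be true for a valid C. A result that assigns the same position to both 20s would fail both checks.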
for ii = 1:numel(A)
    C(ii) = find(B == A(ii), 1);   % first position in B holding the value A(ii)
end
almost works. It doesn't handle duplicate values though, and returns C(5) = 8 instead of 9. (It's also non-vectorised and therefore presumptively ugly.)
A nastier test selection would be
>>A = [42 23 23 7 9 7 8 1 10 23 9 9 1 2 23 23 16 18 7 9];
>>B = sort(A)
B =
[ 1 1 2 7 7 7 8 9 9 9 9 10 16 18 23 23 23 23 23 42];
with corresponding
C = [20 15 16 4 8 5 7 1 12 17 9 10 2 3 18 19 13 14 6 11]
I'd appreciate any ideas on how to handle duplicates - or direction to the appropriate question/answer/function if my GoogleFu isn't up to scratch.

 Accepted Answer

sort has a second output, the index vector:
[B,I] = sort(A);
such that A(I) equals B.
From there, I is the C_ matrix you are looking for, with non-unique elements included as well. You can extract the indices of the unique elements by calling [temp,index] = unique(B), which returns the indices of B that are unique, and then setting C = C_(index).
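Working solution:
A = [20 23 1 19 20 8 5 14 9 7];
[B,idx] = sort(A);          % idx(k) is the position in A of B(k)
indA = 1:length(A);
[temp,idB] = sort(idx);     % idB(k) is the position in B of A(k)
C = indA(idB)               % same as idB, since indA is 1:length(A)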
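The second sort works because idx is a permutation of 1:length(A), and sorting a permutation returns its inverse. As an alternative sketch (not the answerer's code), the same inverse can be built by direct assignment. It is shown here on the longer test vector from the question, with expectedC copied from there. Duplicates come out in appearance order because sort is stable: equal elements keep their original relative order.

```matlab
A = [42 23 23 7 9 7 8 1 10 23 9 9 1 2 23 23 16 18 7 9];
expectedC = [20 15 16 4 8 5 7 1 12 17 9 10 2 3 18 19 13 14 6 11];

[B, idx] = sort(A);        % idx(k) is the position in A of B(k)

C = zeros(size(A));
C(idx) = 1:numel(A);       % invert the permutation: element idx(k) of A lands at position k of B

[~, idB] = sort(idx);      % the double-sort form gives the same vector

disp(isequal(C, expectedC) && isequal(C, idB))
```

The assignment form avoids the second sort, which may matter for very long vectors. Both forms give the same C.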

5 Comments

Hi Aquatris,
Thanks for responding so quickly. However, as I mentioned, the secondary output of sort() is not what I am after in this case.
>>A = [20 23 1 19 20 8 5 14 9 7];
>>[B,idx] = sort(A)
B =
[1 5 7 8 9 14 19 20 20 23]
idx =
[3 7 10 6 9 8 4 1 5 2]
Note that idx ~= C;
C = [8 10 1 7 9 4 2 6 5 3]
What I am looking for is the converse indexing - instead of asking where the elements of B occur in A, I am asking where the elements of A occur in B.
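The two directions can be made concrete with a short sketch. The flag names forwardOK, converseOK and isInverse are illustrative only, and C is the desired result stated in the question.

```matlab
A = [20 23 1 19 20 8 5 14 9 7];
[B, idx] = sort(A);
C = [8 10 1 7 9 4 2 6 5 3];   % desired result from the question

forwardOK  = isequal(A(idx), B);            % idx maps each sorted position to its original position
converseOK = isequal(B(C), A);              % C maps each original position to its sorted position
isInverse  = isequal(idx(C), 1:numel(A));   % so C and idx are inverse permutations

disp([forwardOK, converseOK, isInverse])
```

All three flags are true for this example, which is why idx and C contain the same numbers in a different order.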
I think you can achieve what you want with the sort function:
A = [20 23 1 19 20 8 5 14 9 7];
[B,idx] = sort(A) ;
indA = 1:length(A);
[temp,idB] = sort(idx);
C = indA(idB)
Chapeau, Aquatris. Beautiful solution. Works perfectly on the nastier test case too.
Thanks!
Out of curiosity (and self-examination), why did you mention the [temp,index] = unique(B) function in the initial solution description?
Also, I want to accept the solution in your comment above. Could you please copy it into a new answer?
You mentioned you wanted only the unique values in B. With the unique function, you can identify the indices of the unique elements in B and extract those indices from "idB". This way you can assign only the unique-value indices to C instead of everything.
Ah, I see the confusion. I was trying to specify that C should not assign the same index to duplicate values within A, since I wanted them counted in appearance order. It was a check that could be used to verify a solution.
Anyway, thanks again. I spotted the edit and accepted the answer.
