How to find "k" nearest elements that meet a condition from an element in a categorical column vector.

I have a categorical column vector (stored in "predClass.mat") of size 2046x1 whose classes are 'N', 'A', and 'V'.
The mat file "predClass.mat" has been attached.
1. I want to find all indices of 'N' first.
2. For each 'N' found, I want to find its "k" nearest 'N' elements, where "k" can be any positive integer (1, 2, ...).
It should work at least for k=30, unless there are not enough 'N's.
3. I want to exclude from the neighbors any 'N' that meets this condition: an 'A' or 'V' is located at the index immediately before it.
So, any 'N' that is located right after 'A' or 'V' must be excluded.
The final number of neighbors can differ from one 'N' to another and can be fewer than "k", depending on how many 'N's are removed.
The code can be developed using the simple data below before applying it to the attached mat-file.
predClass = categorical(["N","N","A","N","N","N","A"]');
Thank you very much for your help.

Accepted Answer

Voss on 23 Mar 2024
Edited: Voss on 23 Mar 2024
Try this:
predClass = categorical(["N";"N";"A";"N";"N";"N";"A"]);
k = 2;
% find where there is an 'N' immediately preceded by an 'A' or a 'V'
bad_idx = 1 + find(predClass(2:end) == 'N' & (predClass(1:end-1) == 'A' | predClass(1:end-1) == 'V'));
% count those
k_add = numel(bad_idx);
% find all the 'N's in predClass
idxN = find(predClass == 'N');
% going to search for k+k_add+1 'N's because up to k_add will be removed
% for being in bad_idx (preceded by 'A' or 'V') and 1 will be removed
% for being itself
% limit the search value to be <= 1 less than the total number of 'N's in
% predClass (1 less to exclude each 'N' itself from its set of neighbors)
% and >= 0 (in case there are 0 'N's in predClass)
kk = max(0,min(k+k_add,numel(idxN)-1));
% perform mink search on the distance between the location of each 'N'
% and the location of all 'N's
[~,idx] = mink(abs(idxN-idxN.'),kk+1,1);
% remove the first row of idx, corresponding to distances of 0
idx(1,:) = [];
% convert idx from locations in idxN to locations in predClass
idx = idxN(idx);
% mark the "bad" 'N's as 0 in idx
idx(ismember(idx,bad_idx)) = 0;
% make each column an element of a cell array because we're about to remove
% a varying number of elements from each column (can't do that with a
% matrix and have it remain a matrix - each column might be of different
% length)
C = num2cell(idx,1);
% keep only the non-0s
C = cellfun(@(x)x(x~=0),C,'UniformOutput',false);
% keep only up to the 1st k elements in each column
C = cellfun(@(x)x(1:min(k,end)),C,'UniformOutput',false);
% make the result into a matrix, if possible
try
    result = [C{:}]
catch
    result = C
end
result = 2x5
     2     1     5     6     5
     5     5     2     2     2
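Each column of result lists the neighbors of one 'N', in the order of idxN (here, the 'N's at indices 1, 2, 4, 5, and 6). As a cross-check, the same neighbors can also be computed with a plain loop that filters out the excluded 'N's first and then takes the k nearest of what remains. This is only a sketch of an alternative, not part of the answer above; the variable names isN, prevIsAV, idxValid, cand, and neighbors are chosen here for illustration, and mink is assumed to be available (R2017b or later).

```matlab
% Sketch: loop-based variant of the neighbor search (illustrative names).
predClass = categorical(["N";"N";"A";"N";"N";"N";"A"]);
k = 2;

isN = predClass == 'N';

% flag every element whose immediate predecessor is 'A' or 'V'
prevIsAV = false(size(predClass));
prevIsAV(2:end) = predClass(1:end-1) == 'A' | predClass(1:end-1) == 'V';

idxN     = find(isN);              % every 'N' gets its own neighbor list
idxValid = find(isN & ~prevIsAV);  % only these 'N's may appear as neighbors

neighbors = cell(1, numel(idxN));
for j = 1:numel(idxN)
    % candidates: all valid 'N's except the current one
    cand = idxValid(idxValid ~= idxN(j));
    % mink returns every candidate (sorted) when fewer than k are available
    [~, order] = mink(abs(cand - idxN(j)), k);
    neighbors{j} = cand(order);
end
```

For the sample data, [neighbors{:}] should reproduce the 2x5 matrix shown above. Because the loop never builds the full matrix of pairwise distances between 'N's, it may be preferable for much longer vectors, at the cost of being slower for short ones.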
