How to find elements in an array faster / without using for loop?
Jan Brychta
on 3 Feb 2022
Commented: Jan Brychta
on 5 Feb 2022
Hi,
I have the following working code with a for loop, but I want to make it faster. For the array sizes I use, the process currently takes up to 30 seconds.
The code:
neighbour is an X-by-2 array of integers only (for example 65000 x 2)
squares is a Y-by-4 array of integers only (for example 35000 x 4)
B = zeros(size(squares,1),1); % the preallocation I tried - not very helpful, minimal time saving
for i = 1:length(neighbour) % loop over the rows of the 'neighbour' array, for example 1:65000
B = any(squares == (neighbour(i,1)),2) & any(squares == (neighbour(i,2)),2);
% this finds the rows of 'squares' that contain both values from the i-th row of the 'neighbour' array
end
If it is not clear from the code, what I want to do is:
I want to go through the 'neighbour' array row by row and obtain the indices of the rows in the 'squares' array that contain both values from that row of the 'neighbour' array.
Example:
if the neighbour array had only 1 row with
[1 2]
in it, and the squares array looked like this:
[ 4 58 6 7;
1 2 47 48;
84 12 8 9],
then the output should be the index of the row in the squares array that contains both numbers, i.e.
2
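As a quick check of the example above, here is a minimal sketch that applies the same test used in the loop body to this small data. The names hasFirst, hasSecond and rowIdx are assumptions introduced only for illustration.

```matlab
% Small data taken from the example above
neighbour = [1 2];
squares = [ 4 58  6  7;
            1  2 47 48;
           84 12  8  9];

% A row of squares matches when it contains both values of the neighbour row
hasFirst  = any(squares == neighbour(1,1), 2);   % rows containing 1
hasSecond = any(squares == neighbour(1,2), 2);   % rows containing 2
rowIdx = find(hasFirst & hasSecond);             % rowIdx is 2 for this data
```

Note that 12 in the third row does not count as a match for 2, because == compares whole numbers, not digits.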
I have tried preallocation but the time it saves is marginal. Do you have any ideas on how to make this faster, ideally without a for loop?
Many thanks,
Jan
Accepted Answer
Turlough Hughes
on 5 Feb 2022
Edited: Turlough Hughes
on 5 Feb 2022
In the example you provided you aren't actually storing any of the indices, because B is overwritten on every iteration. It is also important to consider that a row of neighbour can match more than one row of squares, or none at all, so you're going to need to store the indices in a cell array. To go through some different approaches, let's first generate some similar data:
neighbour = randi(1000,65000,2);
squares = randi(1000,35000,4);
B = cell(height(neighbour),1); % one cell per row of neighbour
I've tried three approaches to the problem.
Approach 1 Based on the example you provided:
tic
for i = 1:height(neighbour)
B{i} = find( ...
any(squares == (neighbour(i,1)),2) &...
any(squares == (neighbour(i,2)),2)...
);
end
toc
% Elapsed time is 33.780513 seconds. (Mathworks Server)
% Elapsed time is 21.165682 seconds. (My PC)
It's taking about 21 seconds on my computer; we can get some improvement with approach 2.
Approach 2 Instead of using &, it's faster to index into squares with the result of the first logical expression. That way, you only scan a portion of squares for neighbour(i,2) instead of the whole array, which is a significant improvement. In a sense, this is the vector equivalent of logical short-circuiting.
B = cell(height(neighbour),1); % one cell per row of neighbour
tic
for i = 1:height(neighbour)
idx = find(any(squares == (neighbour(i,1)),2));
B{i} = idx(any(squares(idx,:) == (neighbour(i,2)),2));
end
toc
% Elapsed time is 21.825993 seconds. (Mathworks Server)
% Elapsed time is 12.312727 seconds. (My PC)
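To make the two-step filtering in approach 2 easier to follow, here is a minimal sketch on a tiny array. The data, and the names pair, keep, rowIdx and rowIdxFull, are hypothetical and chosen only for illustration.

```matlab
squares = [ 1  5  6  7;
            1  2 47 48;
           84 12  8  9;
            2  1  3  4];
pair = [1 2];   % hypothetical single row of neighbour

% Step 1: rows of squares that contain the first value
idx = find(any(squares == pair(1), 2));        % [1; 2; 4]

% Step 2: look for the second value only in those candidate rows
keep = any(squares(idx,:) == pair(2), 2);      % [false; true; true]

% Map the surviving candidates back to row numbers of squares
rowIdx = idx(keep);                            % [2; 4]

% Same answer as testing the full array twice and combining with &
rowIdxFull = find(any(squares == pair(1), 2) & any(squares == pair(2), 2));
assert(isequal(rowIdx, rowIdxFull))
```

The second comparison touches three rows here instead of four; with 35000 rows and few candidates per value, that reduction is likely where the time saving comes from.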
Approach 3 It turns out that any(someArray,1) is faster than any(someArray,2), which isn't surprising as MATLAB stores arrays in column-major order. By transposing squares first, we can get another improvement.
B = cell(height(neighbour),1); % one cell per row of neighbour
tic
squares = squares.'; % transpose so that each original row becomes a column
for i = 1:height(neighbour)
idx = find(any(squares == neighbour(i,1),1));
B{i} = idx(any(squares(:, idx) == (neighbour(i,2)),1)).';
end
toc
%Elapsed time is 11.630071 seconds. (Mathworks Server)
%Elapsed time is 6.896441 seconds. (My PC)
So for me that was about a 3x improvement (from about 21.2 s to 6.9 s), and it is about a 3x improvement on the MathWorks servers as well (from about 33.8 s to 11.6 s).
Edit: find needed to be used on the first logical expression in approaches 2 and 3.
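If you want to check the any(...,1) versus any(...,2) difference on your own machine, a rough sketch using timeit could look like this. The array A, its transpose At and the search value v are hypothetical test data, and the timings will vary with hardware and MATLAB release.

```matlab
% Hypothetical test data with the same shape as squares in this answer
A  = randi(1000, 35000, 4);
At = A.';                 % 4-by-35000, one original row per column
v  = 500;                 % arbitrary value to search for

tRows = timeit(@() any(A  == v, 2));   % reduce along each row
tCols = timeit(@() any(At == v, 1));   % reduce down each column

fprintf('any(...,2): %.3g s, any(...,1): %.3g s\n', tRows, tCols);

% Both forms flag the same rows of the original array
assert(isequal(any(A == v, 2), any(At == v, 1).'))
```

timeit calls each function handle several times and returns a typical run time in seconds, which tends to be more stable than a single tic/toc measurement for operations this short.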
4 Comments
Turlough Hughes
on 5 Feb 2022
Jan, that was a mistake, I should have used find on the first logical expression. Approaches 2 and 3 now match up with approach 1. I've retimed everything in the edit above.
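One way to confirm that the approaches agree is to run all three on a small random problem and compare the cell arrays. This is only a sketch: the data sizes, the seed, and the names B1, B2, B3, squaresT and sameResults are assumptions made for illustration.

```matlab
rng(0);                                   % fixed seed, chosen arbitrarily
neighbour = randi(50, 200, 2);            % small hypothetical data
squares   = randi(50, 300, 4);
n = height(neighbour);
B1 = cell(n,1); B2 = cell(n,1); B3 = cell(n,1);
squaresT = squares.';                     % transposed copy for approach 3

for i = 1:n
    % Approach 1: test the whole array twice
    B1{i} = find(any(squares == neighbour(i,1),2) & any(squares == neighbour(i,2),2));
    % Approach 2: test the second value only on the candidate rows
    idx = find(any(squares == neighbour(i,1),2));
    B2{i} = idx(any(squares(idx,:) == neighbour(i,2),2));
    % Approach 3: same idea on the transposed array
    idx = find(any(squaresT == neighbour(i,1),1));
    B3{i} = idx(any(squaresT(:,idx) == neighbour(i,2),1)).';
end

% true when every cell holds the same indices in all three results
sameResults = isequal(B1, B2) && isequal(B1, B3);
```

The transpose at the end of approach 3 matters here: without it the indices come back as row vectors and isequal would report a mismatch against the column vectors from approaches 1 and 2.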
More Answers (1)
Christopher McCausland
on 3 Feb 2022
5 Comments
Christopher McCausland
on 4 Feb 2022
Hi Jan,
The fact that your data isn't the same size is a little problematic! I would suggest zero padding, but I think this will cause more problems, as neighbour would probably never be found in the larger matrix (squares).
One option might be to use reshape() to reshape squares to the same size as neighbour. I think re-arranging neighbour and using [Lia,Locb] = ismember(neighbour,squares,'rows') will probably be your best bet.
Let me know how you get on,
Christopher
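For reference, here is a minimal sketch of how ismember with the 'rows' option behaves on the small example from the question. Because 'rows' needs both inputs to have the same number of columns, the sketch compares against columns 1:2 of squares only; that slice is an arbitrary assumption for illustration, not something proposed in the thread.

```matlab
neighbour = [1 2];
squares = [ 4 58  6  7;
            1  2 47 48;
           84 12  8  9];

% 'rows' requires equal column counts, so compare against a two-column
% slice of squares (columns 1:2, an arbitrary choice for illustration)
[Lia, Locb] = ismember(neighbour, squares(:,1:2), 'rows');
% Lia(k)  is true when neighbour(k,:) appears as a row of the slice
% Locb(k) is the index of a matching row in the slice, or 0 if there is none
```

For this data Lia is true and Locb is 2. Used this way, the match is order-sensitive and limited to the chosen pair of columns, so it is not equivalent to the loop in the question, which accepts the two values anywhere in a row.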