Remove an element from a vector, using the most computationally efficient solution

Hello, I recently wrote a function that removes elements from common vectors. It is a fairly elegant and simple piece of code, and it operates great when dealing with vectors that contain tens of thousands of elements; however, when I execute this code on vectors with tens of millions of elements, my code takes an exponentially longer time to execute (~1 minute vs ~1 week). I am wondering if I am using the "best memory practice" or most efficient solution, or if you guys can recommend an alternate method. Also, a point of note is that I am using 90% of the memory available to my machine (8 GB).
Specifically,
    currentLength = length(vector1);
    i = 1;  % start at the first element
    while i ~= currentLength+1
        if vector1(i) == someCondition
            % Remove the data at this index from vector1, vector2, vector3, etc. (passed in through varargin)
            vector1(i) = [];
            for i2 = 1:1:numOfVariableInputs
                varargin{i2}(i) = [];
            end
            i = i - 1;
            currentLength = currentLength-1;
        end
        i = i + 1;
    end
I might be wrong, but I believe that using the =[ ] element removal technique that I have mentioned is continuously creating and destroying temporary variables to reduce the indices of the vectors.
Any help or thoughts to improve my code execution would be greatly appreciated!
Cheers,
- Matt

Accepted Answer

Thorsten on 5 Jul 2016
Edited: Thorsten on 5 Jul 2016
If your "someCondition" is independent of the variable "i" in your original post, you can do it like this:
    idx = vector1 == someCondition;
    vector1(idx) = [];
    for i = 1:numOfVariableInputs
        varargin{i}(idx) = [];
    end
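To get a feel for the difference, here is a minimal timing sketch that runs both approaches on the same data and checks that they agree. The test data, the vector size `n`, and the value 7 used for `someCondition` are made up for illustration, and the loop is a simplified single-vector version of the one in the question. Actual timings will depend on your machine and MATLAB version.

```matlab
% Compare element-by-element deletion with a single logical-mask deletion.
n = 2e4;                          % small enough that the loop finishes quickly
rng(0);                           % fixed seed so the run is repeatable
original = randi(10, 1, n);       % made-up test data: integers from 1 to 10
someCondition = 7;                % made-up value to remove

% Approach 1: delete inside the loop (the vector is resized on every hit).
v = original;
tic
k = 1;
while k <= numel(v)
    if v(k) == someCondition
        v(k) = [];                % shifts everything after position k
    else
        k = k + 1;                % only advance when nothing was removed
    end
end
tLoop = toc;

% Approach 2: build the mask once and delete in a single step.
w = original;
tic
idx = (w == someCondition);
w(idx) = [];
tMask = toc;

fprintf('loop: %.4f s, mask: %.4f s\n', tLoop, tMask);
```

Both `v` and `w` should end up identical. Increasing `n` should make the gap between the two timings grow quickly.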
  3 Comments
Matt C on 5 Jul 2016
Wonderful - Thanks for the simplification, guys! My code now executes in about 30 seconds instead of the predicted 1-week. Gotta save the bits and bytes wherever we can!


More Answers (1)

Adam on 5 Jul 2016
Edited: Adam on 5 Jul 2016
Are you not able to create a logical vector pointing to the elements that should be removed and then remove them all at once at the end?
You haven't included details of exactly what conditions you are testing, so it is difficult to judge, but usually you wouldn't want to be resizing a vector while you are looping over it. Even if you correctly adjust your indices to make sure you don't run past the end, it is just very inefficient: each deletion has to shift every element after the removed one, so the total cost grows roughly quadratically with the vector length, and I am not surprised it performs so poorly.
If instead you simply take note of the indices that should be deleted, then you can delete them in a single command after the loop. This does mean maintaining another vector of the same length, which is an issue for memory, but can you not divide your vectors up into smaller chunks to deal with this?
e.g.
    toDelete = false( size( vector1 ) );
    for i = 1:numel( vector1 )
        if vector1(i) == someCondition
            toDelete(i) = true;
        end
    end
    vector1( toDelete ) = [];
    vector2( toDelete ) = [];
etc.
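If the full-length mask is too much for your memory budget, one way to do the chunking I mentioned is to walk through the vectors in blocks and copy the elements you want to keep towards the front, then cut off the tail once at the end. This is only a rough sketch: the sample data, `chunkSize`, and the helper variables `writePos`, `src` and `dst` are made up for illustration, and whether MATLAB manages all of this without a temporary copy will depend on your version and on how the variables are used.

```matlab
% Chunked removal: only a mask of length chunkSize exists at any time.
vector1 = [4 0 7 0 0 2 9 0 5];    % made-up sample data
vector2 = 1:9;                    % companion vector with the same length
someCondition = 0;                % made-up value to remove
chunkSize = 4;                    % pick something much larger in practice

n = numel(vector1);
writePos = 0;                     % number of elements kept so far
for first = 1:chunkSize:n
    last = min(first + chunkSize - 1, n);
    keep = vector1(first:last) ~= someCondition;   % mask for this chunk only
    src = first - 1 + find(keep); % positions of the elements to keep
    dst = writePos + (1:numel(src));               % where they move to
    vector1(dst) = vector1(src);  % dst never runs ahead of src
    vector2(dst) = vector2(src);
    writePos = writePos + numel(src);
end

% Everything after writePos is leftover, so remove it in one go.
vector1(writePos+1:end) = [];
vector2(writePos+1:end) = [];
```

With the sample data above, `vector1` ends up as `[4 7 2 9 5]` and `vector2` as `[1 3 6 7 9]`.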
Obviously if you can vectorise your condition rather than running a loop then it would be even faster, but that depends what the condition is you are testing against.
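As an illustration of a vectorised condition, here is a small sketch where the test is "NaN, negative, or above a limit". The data and the `upperLimit` threshold are made up, since I don't know what your actual condition is. It also keeps the wanted elements with `~toDelete` instead of assigning `[]`, which gives the same result.

```matlab
% Build the whole mask with element-wise operators, no loop required.
vector1 = [1.5 NaN 3.2 -4.0 8.1 NaN 0.7];   % made-up sample data
vector2 = 1:7;                               % companion vector
upperLimit = 5;                              % hypothetical threshold

% | is the element-wise OR, so toDelete has one logical entry per element.
toDelete = isnan(vector1) | vector1 < 0 | vector1 > upperLimit;

% Keep everything that is not flagged, using the same mask for both vectors.
vector1 = vector1(~toDelete);
vector2 = vector2(~toDelete);
```

Here `vector1` becomes `[1.5 3.2 0.7]` and `vector2` becomes `[1 3 7]`.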
