Remove an element from a vector, using most computationally efficient solution
61 views (last 30 days)
Show older comments
Hello, I recently wrote a function that removes elements from common vectors. It is a fairly elegant and simple piece of code, and it operates great when dealing with vectors that contain tens of thousands of elements; however, when I execute this code on vectors with tens of millions of elements, my code takes an exponentially longer time to execute (~1 minute vs ~1 week). I am wondering if I am using the "best memory practice" or most efficient solution, or if you guys can recommend an alternate method. Also, a point of note is that I am using 90% of the memory available to my machine (8 GB).
Specifically,
currentLength = length(vector1)
while i ~= currentLength+1
if vector1(i) == someCondition
% Removing data from all common indices of vector1, vector2, vector3, etc, through the varargin command
vector1(i) = [];
for i2 = 1:1:numOfVariableInputs
varargin{i2}(i) = [];
end
i = i - 1;
currentLength = currentLength-1;
end
i = i + 1;
end
I might be wrong, but I believe that using the =[ ] element removal technique that I have mentioned is continuously creating and destroying temporary variables to reduce the indices of the vectors.
Any help or thoughts to improve my code execution would be greatly appreciated!
Cheers,
- Matt
0 Comments
Accepted Answer
More Answers (1)
Adam
on 5 Jul 2016
Edited: Adam
on 5 Jul 2016
Are you not able to create a logical vector pointing to the elements that should be removed and then remove them all at once at the end?
You haven't included details of exactly what conditions you are testing so it is difficult to judge, but usually you wouldn't want to be resizing a vector while you are looping around it. Even if you correctly adjust your indices to make sure you don't overflow it is just very inefficient and I am not surprised its performance is exponential.
If instead you simply take not of the indices that should be deleted then you just delete them in a single command after the loop. This does mean maintaining another vector of the same length which is an issue for memory, but can you not divide your vectors up into smaller chunks to deal with this?
e.g.
toDelete = false( size( vector1 ) );
for i = 1:numel( vector1 )
if vector1(i) == someCondition
toDelete(i) = true;
end
end
vector1( toDelete ) = [];
vector2( toDelete ) = [];
etc.
Obviously if you can vectorise your condition rather than running a loop then it would be even faster, but that depends what the condition is you are testing against.
0 Comments
See Also
Categories
Find more on Data Type Identification in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!