Optimized code for loop, If-statement for large dataset
I am hoping to delete certain rows using a condition. My data is a double matrix (790127×24). Using Run and Time, I estimate that the full script would need about 25 hours, which is huge. Is there any way of optimizing the script?
TIA...
n = 0;
for i = 1 : length(d_A)
    if any(isnan(d_A(i-n, 6))) ...
            && any(isnan(d_A(i-n, 7))) ...
            && any(isnan(d_A(i-n, 8))) ...
            && any(isnan(d_A(i-n, 9))) ...
            && any(isnan(d_A(i-n,10))) ...
            && any(isnan(d_A(i-n,11))) ...
            && any(isnan(d_A(i-n,12))) ...
            && any(isnan(d_A(i-n,13))) ...
            && any(isnan(d_A(i-n,14))) ...
            && any(isnan(d_A(i-n,15))) ...
            && any(isnan(d_A(i-n,16))) ...
            && any(isnan(d_A(i-n,17))) ...
            && any(isnan(d_A(i-n,18))) ...
            && any(isnan(d_A(i-n,19)))
        d_A(i-n,:) = [];
        n = n+1;
    end
end
0 Comments
Accepted Answer
per isakson
on 25 May 2019
Edited: per isakson
on 26 May 2019
Try this
%%
ixcol = [ 6, 7, 8, 9,10,11,12,13,14,15,16,17,18,19 ];
n = 0;
for i = 1 : length(d_A)
    if all( isnan( d_A( i-n, ixcol )))    % i-n is a scalar
        d_A(i-n,:) = [];
        n = n+1;
    end
end
and this
%%
ixcol = [ 6, 7, 8, 9,10,11,12,13,14,15,16,17,18,19 ];
is_to_be_deleted = false( size(d_A,1), 1 );
for i = 1 : size(d_A,1)
    if all( isnan( d_A( i, ixcol )))    % i is a scalar
        % Nothing is deleted inside the loop, so index with i directly
        is_to_be_deleted(i) = true;
    end
end
d_A( is_to_be_deleted, : ) = [];    % delete all flagged rows at once
Caveat: not tested
In response to comment:
Now it's possible to factor out the for-loop. Try this
%% Sample data
A = rand( [8,4] );
ixcol = [2,3];
A([3,5],ixcol) = nan;
A( randperm( numel(A), 9 ) ) = nan;
%%
[ A3, ix_deleted3 ] = cssm_3( A, ixcol );
[ A4, ix_deleted4 ] = cssm_4( A, ixcol );
ix_deleted3 == ix_deleted4 %#ok<NOPTS,EQEFF>

function [ A, ix_deleted ] = cssm_3( A, ixcol )
    % Loop version: flag rows that are NaN in all of ixcol, then delete once
    is_to_be_deleted = false( size(A,1), 1 );
    for jj = 1 : size(A,1)    % iterate over rows
        if all( isnan( A( jj, ixcol )))
            is_to_be_deleted(jj) = true;
        end
    end
    A( is_to_be_deleted, : ) = [];
    ix_deleted = find( is_to_be_deleted );
end

function [ A, ix_deleted ] = cssm_4( A, ixcol )
    % Vectorized version: all(...,2) tests each row across ixcol
    is_to_be_deleted = all( isnan( A( :, ixcol ) ), 2 );
    A( is_to_be_deleted, : ) = [];
    ix_deleted = find( is_to_be_deleted );
end
It outputs
>> cssm
ans =
2×1 logical array
1
1
The vectorized version, cssm_4, might not improve performance significantly, but in my opinion it makes cleaner code.
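Applied to the data in the question, the cssm_4 approach reduces to a single logical mask over columns 6 to 19. The following is only a minimal sketch: the small matrix is a made-up stand-in for the real 790127×24 d_A, and the rows set to NaN are chosen purely for illustration.

```matlab
% Stand-in for the real data (hypothetical values, same 24-column layout)
d_A = rand(10, 24);
d_A([2, 7], 6:19) = nan;    % rows 2 and 7: NaN in every column 6..19
d_A(4, 6:10) = nan;         % row 4: only partly NaN, so it must be kept

ixcol = 6:19;
is_to_be_deleted = all(isnan(d_A(:, ixcol)), 2);   % one logical value per row
d_A(is_to_be_deleted, :) = [];                     % one deletion for all rows

size(d_A, 1)    % 8 rows remain
```

Row 4 survives because the condition in the question requires all of columns 6 to 19 to be NaN, not just some of them.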
2 Comments
per isakson
on 26 May 2019
Edited: per isakson
on 26 May 2019
I surmised that there was a problem and added the last line in bold.
It's as bad for performance to remove one row at a time as it is to add one row at a time. In both cases the whole matrix is rewritten in memory on each operation.
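A rough way to see this effect is to time both variants on the same data. This is a sketch under assumptions, not a benchmark: the matrix size, the share of flagged rows, and the variable names B, C, t_loop and t_mask are all made up here, and the timings depend on the machine and MATLAB release.

```matlab
% Hypothetical test matrix, much smaller than the 790127x24 original
rng(0);                                  % reproducible random data
A = rand(20000, 24);
A(rand(20000, 1) < 0.3, 6:19) = nan;     % flag roughly 30 percent of the rows
ixcol = 6:19;

% Variant 1: delete row by row (the matrix is rewritten on every deletion)
B = A;
tic
n = 0;
for i = 1 : size(A, 1)
    if all(isnan(B(i-n, ixcol)))
        B(i-n, :) = [];
        n = n + 1;
    end
end
t_loop = toc;

% Variant 2: build a logical mask, then delete once
C = A;
tic
C(all(isnan(C(:, ixcol)), 2), :) = [];
t_mask = toc;

fprintf('row-by-row: %.3f s, mask: %.3f s\n', t_loop, t_mask);
isequaln(B, C)    % both variants should keep exactly the same rows
```

isequaln is used instead of isequal so that the comparison would still hold if any remaining rows contained NaN values.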