Sparse and binary mask

Question

0 votes

I am working on saving computations and speeding up my Matlab code. Here's what I do Perform thresholding on an image and make most of its components as zero run it inside a loop with element by element multiplication with another matrix(full matrix) of the same size

Here is a detailed version of what I am doing

CASE I:When I dont avoid multiplication with zeros

 matrix(matrix<threshold) = 0 ;
 while the condition is true
 matrix = matrix.*phase;
 other lines of code ;
 end

Result: From MATLAB profiler I see that matrix = matrix.*phase; takes 80 seconds

CASE II:When I avoid multiplication with zeros using SPARSE

matrix(matrix<threshold) = 0 ;
matrix = sparse(matrix);
while the condition is true
matrix = matrix.*phase;
other lines of code ;
end

Result: From MATLAB profiler I see that matrix = matrix.*phase takes 110 seconds

CASE III:When I avoid multiplication with zeros using a mask

matrix(matrix<threshold) = 0 ;
mask = matrix~=0;
while the condition is true
matrix(mask) = matrix(mask).*phase(mask);
other lines of code ;
end

Result: From MATLAB profiler I see that matrix = matrix.*phase takes 130 seconds

Why is my code slow if I try to avoid multiplications with zeros? Since the matrix after thresholding has 75% zeros my speed should be up by 75% as I am avoiding multiplication with zeros. But my time is increased Is there any other way to achieve avoiding multiplication with zeros and speeding up of the code?

MY PRIORITY IS TO SPEED UP THE CODE EXECUTION AND SAVE MUTLIPLICATIONS

4 Comments
Show 2 older comments Hide 2 older comments

James Tursa on 14 Sep 2018

What is the class of your image variable? Do you have a supported C compiler installed for a mex routine?

D_coder on 14 Sep 2018

class is double,C compiler is present

Sign in to comment.

Sign in to answer this question.

Follow Question

Answer 1

Bruno Luong on 15 Sep 2018

Edited: Bruno Luong on 15 Sep 2018

Open in MATLAB Online

1 vote

Instead of matrix(mask) = matrix(mask).*phase(mask)

you might work on index

idx = find(mask & matrix~=0);
while ...
   matrix(idx) = matrix(idx).*phase(idx);
end

You might want to compute phase(idx) outside the loop if they do not change, or update idx if the matrix elements become zero. As you don't give any detail, we have no idea if those are applicable for you.

If matrix does not share any copy, you might do element inplace replacement using MEX.

17 Comments
Show 15 older comments Hide 15 older comments

D_coder on 20 Sep 2018

Open in MATLAB Online

how does the accumarray function that you wrote changes when I want to do sum of individual columns instead of individual rows.The idx find creates a problem here as it finds index column wise

a = [ 0 0 3 0; 2 5 0 1;0 0 0 3];
b  = [ 4 3 2 1; 5 3 6 2; 7 1 -2 3];
[r c] = size(a);
prod = a.*b; 
row_sum = sum(prod,2);%must have same result as sum_accum
idx = find(a);
ared = a(idx);
bred = b(idx);
cred = ared.*bred;
row = ceil(idx/c);
sum_accum = accumarray(row,cred,[r 1]);%must have same result as row_sum
both sum_accum and row_sum give different result.

Bruno Luong on 20 Sep 2018

Edited: Bruno Luong on 20 Sep 2018

Open in MATLAB Online

a = [ 0 0 3 0; 2 5 0 1; 0 0 0 3];
b  = [ 4 3 2 1; 5 3 6 2; 7 1 -2 3];
idx = find(a)
[m,n] = size(a);
col = floor((idx-1)/m)+1;
sumcol = accumarray(col,a(idx).*b(idx)).'
% check
prod = a.*b;
sum(prod,1)
row = mod((idx-1),m)+1;
sumrow = accumarray(row,a(idx).*b(idx))
% check
sum(prod,2)

Sign in to comment.

Answer 2

James Tursa on 15 Sep 2018

Edited: James Tursa on 15 Sep 2018

Open in MATLAB Online

2 votes

Try this simple C mex routine. It avoids the overhead you are creating with your sparse alternate stuff. Could maybe be improved further by multi-threading for large variables but it is currently not coded that way. (Caution, no input checking so it will crash MATLAB if not used properly)

/*
 * File: times_threshold.c
 *
 * C = times_threshold(A,B,threshold)  does
 *
 * C = A .* B but only where A > threshold (C = 0 in the other spots)
 *
 * A = full real double variable
 * B = full real double variable (same size as A)
 * C = full real double variable (same size as A)
 * threshold = numeric scalar
 *
 * CAUTION: Bare bones with no argument checking! Will crash MATLAB if the
 *          inputs are not exactly as expected.
 */
#include "mex.h"
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    double *Apr, *Bpr, *Cpr;
    double threshold;
    mwSize n;
      n = mxGetNumberOfElements(prhs[0]);
      plhs[0] = mxCreateNumericArray(mxGetNumberOfDimensions(prhs[0]),
                                     mxGetDimensions(prhs[0]),
                                     mxDOUBLE_CLASS, mxREAL);
      Apr = mxGetPr(prhs[0]);
      Bpr = mxGetPr(prhs[1]);
      Cpr = mxGetPr(plhs[0]);
      threshold = mxGetScalar(prhs[2]);
      while( n-- ) {
          if( *Apr > threshold ) {
              *Cpr = *Apr * *Bpr;
          }
          Apr++; Bpr++; Cpr++;
      }
  }

8 Comments
Show 6 older comments Hide 6 older comments

D_coder on 15 Sep 2018

it wont be efficient as in my case i calculate threshold only once

James Tursa on 16 Sep 2018

What code are you using that is "more efficient" than the mex routine?

Sign in to comment.

Answer 3

Walter Roberson on 15 Sep 2018

Open in MATLAB Online

1 vote

matrix<threshold and matrix~=0 both generate values the same size as matrix. matrix(mask) and phase(mask) and matrix(mask).*phase(mask) generate values with the same number of entries as the number of non-zero values in mask. Those variables get stored in memory and have to be processed.

At the MATLAB level, matrix = matrix.*phase gets dispatched to the high performance libraries that operate on multiple cores. One temporary value the same size as matrix would be involved to hold the results.

The high performance libraries would probably be scheduling multiple multiplies simultaneously, perhaps with SIMD type instructions, possibly able to graduate one result per cycle, or whatever the maximum graduation rate is.

When you select part of the array it has to be conditionally moved to other memory locations, and then that array processed, and then selectively moved back. That involves memory bandwidth and comparisons and loops. Those cost, and the tradeoffs are potentially just not worth the effort compared to SIMD or vector instructions.

You would face different tradeoffs in your FPGA.

To reduce multiplications, it might be worth creating an explicit loop, along the lines of

new_matrix = zeros(size(matrix));
for idx = 1 : numel(matrix)
  M = matrix(idx); P = phase(idx);
  if M && P; new_matrix(idx) = M .* P;
end

If phase is certain to be non-zero then

new_matrix = zeros(size(matrix));
for idx = 1 : numel(matrix)
  M = matrix(idx);
  if M; new_matrix(idx) = M .* phase(idx);
end

9 Comments
Show 7 older comments Hide 7 older comments

Walter Roberson on 15 Sep 2018

Open in MATLAB Online

new_matrix = zeros(size(matrix));
for idx = 1 : numel(matrix)
  M = matrix(idx);
  if M ~= 0; new_matrix(idx) = M .* phase(idx); end
end

D_coder on 15 Sep 2018

this is not working it is taking very long time

Sign in to comment.

Sparse and binary mask

4 Comments
Show 2 older comments Hide 2 older comments

Accepted Answer

17 Comments
Show 15 older comments Hide 15 older comments

More Answers (2)

8 Comments
Show 6 older comments Hide 6 older comments

9 Comments
Show 7 older comments Hide 7 older comments

Categories

Tags

Community Treasure Hunt

Sparse and binary mask

4 Comments Show 2 older comments Hide 2 older comments

Accepted Answer

17 Comments Show 15 older comments Hide 15 older comments

More Answers (2)

8 Comments Show 6 older comments Hide 6 older comments

9 Comments Show 7 older comments Hide 7 older comments

Categories

Tags

See Also

Community Treasure Hunt

4 Comments
Show 2 older comments Hide 2 older comments

17 Comments
Show 15 older comments Hide 15 older comments

8 Comments
Show 6 older comments Hide 6 older comments

9 Comments
Show 7 older comments Hide 7 older comments