How to find the index of the first element which is greater than a particular value in a matrix for all the columns.
    DARSHAN KUMAR BISWAS
 on 10 Jun 2022
  
    
    
    
    
    Commented: Image Analyst
      
      
 on 11 Jun 2022
Suppose I have a 300 by 300 matrix containing values from 0 to 200. For each column, I want to find the position of the first element which is greater than 50. How do I do it? I have tried to write some code but can't get any further.
no_row=size(c,1);
no_col=size(c,2);
for i=1:no_col
    for j=1:no_row
        if c(j,i)>=50
        end
    end
end
0 Comments
Accepted Answer
  Stephen23
      
      
 on 10 Jun 2022
        
      Edited: Stephen23
      
      
 on 10 Jun 2022
  
      N = 5;
M = randi(9,5,7)
Method one: CUMSUM and FIND:
X = M>N;
X = X & cumsum(X,1)<2;
[R,~] = find(X)
Method two: FIND and ACCUMARRAY: 
[Rx,Cx] = find(M>N);
R = accumarray(Cx,Rx,[],@min)
Method three: NUM2CELL and CELLFUN (fails if any column contains no such element):
F = @(v)find(v>N,1,'first');
R = cellfun(F,num2cell(M,1))
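As a hedged sketch (not part of the original answer), methods two and three can be made tolerant of columns that contain no matching element. The test matrix below and the names nCol, R2, C3, R3 and hit are made up for illustration. It assumes accumarray is called with an explicit output size and a fill value, and that cellfun is called with 'UniformOutput' set to false so that empty results are allowed:

```matlab
N = 5;
M = [1 7 2; 6 8 3; 9 1 4];   % hypothetical test data: column 3 has no element > N
nCol = size(M, 2);

% Method two with an explicit output size and NaN as the fill value,
% so columns without any match are reported as NaN.
[Rx, Cx] = find(M > N);
R2 = accumarray(Cx, Rx, [nCol, 1], @min, NaN);

% Method three, collecting the results in a cell array so that an
% empty result from FIND does not raise an error.
F   = @(v) find(v > N, 1, 'first');
C3  = cellfun(F, num2cell(M, 1), 'UniformOutput', false);
R3  = nan(nCol, 1);
hit = ~cellfun(@isempty, C3);   % columns that do have a match
R3(hit) = [C3{hit}];
```

Both R2 and R3 are column vectors with one entry per column of M, and NaN marks the columns where no element exceeds N.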
5 Comments
  Stephen23
      
      
 on 11 Jun 2022
				" except if they need fast code"
Which is only some of the time. In most cases, getting the correct result has a higher priority.
  Jan
      
      
 on 11 Jun 2022
				@Stephen23: Of course the methods shown above produce the same result. The dull loop/loop method is 305 times faster than the cellfun approach and does not fail on missing values. I am impressed by this speed, although I actually use Matlab to avoid such low-level programming.
I assume it is much more difficult for a beginner to write a working accumarray approach than to implement the nested loop approach correctly. So speed and valid code are arguments for the low-level method. Unfortunately.
More Answers (4)
  Image Analyst
      
      
 on 10 Jun 2022
        Try this.  It's a loop but only 300 iterations, so it's fast, completing in 0.001 seconds:
% Prepare sample image because user forgot to upload the data.
grayImage = 200 * rand(300);
subplot(1, 2, 1);
imshow(grayImage, []);
[rows, columns] = size(grayImage);
% Now scan across columns finding the first row where the value exceeds 50.
tic
topRows = nan(1, columns);
for col = 1 : columns
    t = find(grayImage(:, col) > 50, 1, "first");
    if ~isempty(t)
        topRows(col) = t;
    end
end
toc % Takes 0.001 seconds for me
% Plot the location of the top rows.
subplot(1, 2, 2);
plot(topRows, 'r.', 'MarkerSize', 15);
grid on;
xlabel('Column');
ylabel('First row where value > 50')

3 Comments
  Image Analyst
      
      
 on 10 Jun 2022
				It runs in 0.0046 here in the forum using MATLAB online.  I've brought this up before.  You'd think the Answers server would use a very powerful computer so why is it like 4 or 5 times slower than my laptop?  They basically said that they expect that, and it was not surprising, and that's just the way it is.
  Jan
      
      
 on 11 Jun 2022
				Thanks, @Image Analyst, I've found the source of the different speeds: The online interpreter in the forum seems to let the JIT work more effectively in functions than in scripts:
s1 = 300;
s2 = 300;
M  = randi([0, 200], s1, s2);
N  = 50;
tic
for rep = 1:1e4
    R = nan(s2, 1);
    for k = 1:s2
        v = find(M(:, k) > N, 1);
        if ~isempty(v)
            R(k) = v;
        end
    end
end
toc  % Code in the main script:
tic;
R = fcn(M, s1, s2, N);
toc  % Identical code inside a function:
function R = fcn(M, s1, s2, N)
for rep = 1:1e4
    R = nan(s2, 1);
    for k = 1:s2
        v = find(M(:, k) > N, 1);
        if ~isempty(v)
            R(k) = v;
        end
    end
end
end
I'm surprised that this nested loop is 35 times faster (even inside a script):
R = nan(s2, 1);
for k = 1:s2
    for kk = 1:s1
        if M(kk, k) > N
            R(k) = kk;
            break;
        end
    end  
end
  Mathan
 on 10 Jun 2022
        
      Edited: Mathan
 on 10 Jun 2022
  
      [no_row,no_col] = size(c);
Index_final = [];
for ii = 1:no_col
    Index =[];
    for jj=1:no_row
        if c(jj,ii)>=50
            Index = [Index jj];  % Index collects the row numbers of all elements >= 50 in this column
        end
    end
    if (sum(Index)>0)
        Index_final = [Index_final Index(1)]; % Index_final collects the first such row number of each column that has one
    end
end
5 Comments
  Jan
      
      
 on 10 Jun 2022
				@Mathan: Avoid growing arrays iteratively, because this is very expensive. It is hardly a problem for an array as small as a [300x300] matrix, but it is worth using methods that also work for large problems.
x = [];
for k = 1:1e6
    x = [x, k];
end
This creates a new array in each iteration and copies the former values. Although the final result uses only 8 MB (8 bytes per element), Matlab has to allocate and copy sum(1:1e6)*8 bytes = 4 terabytes! This is a massive waste of processing time. The solution is easy:
x = zeros(1, 1e6);   % Pre-allocate!!!
for k = 1:1e6
    x(k) = k;
end
Now the new value is inserted in the existing array.
Instead of proceeding to compare the elements, you can use break after the first match.
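The 4 terabyte figure in this comment can be checked with a short calculation. This is only a sketch of the arithmetic, assuming 8 bytes per double element and that iteration k allocates an array of k elements; the variable names are chosen for illustration:

```matlab
n = 1e6;                                   % number of loop iterations
bytesPerElement = 8;                       % one double
bytesCopied = sum(1:n) * bytesPerElement;  % total allocated over all iterations
bytesFinal  = n * bytesPerElement;         % size of the final array
fprintf('allocated in total: %.3f TB, final array: %.0f MB\n', ...
    bytesCopied / 1e12, bytesFinal / 1e6);
```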
  Jan
      
      
 on 10 Jun 2022
        
      Edited: Jan
      
      
 on 10 Jun 2022
  
      s1 = 300;
s2 = 300;
C  = randi([0, 200], s1, s2);   % Test data
N  = 50;                        % Search
R = nan(s2, 1);        % Pre-allocate the output array
for i2 = 1:s2          % Loop over columns
  for i1 = 1:s1        % Loop over rows
    if C(i1, i2) >= N  % Compare the element
       R(i2) = i1;     % Store in output
       break;          % Stop "for i1" loop
    end
  end  
end
This is a loop approach equivalent to C or even Basic. Matlab actually offers more elegant commands; see Stephen23's answer. But these "vectorized" methods are slower in modern Matlab versions, because they have to create temporary arrays. The CPU is much faster than the RAM (especially if the data sizes exceed the 1st and 2nd level caches). So this ugly code is much faster than the other suggestions.
This is the case for larger arrays also: I've tested it with 30'000x30'000 matrices.
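For readers who want to try the comparison themselves, here is a minimal sketch that times the nested loop against one vectorized alternative and checks that both give the same result. The vectorized variant is not taken from this thread: it relies on max returning the index of the first maximum, and any is used to detect columns without a match. The names X, hasHit, Rv, tLoop and tVec are illustrative, and the measured times will depend on the machine and Matlab version:

```matlab
s1 = 300;
s2 = 300;
C  = randi([0, 200], s1, s2);   % test data as in the answer
N  = 50;

% Nested loop with early exit, as in the answer above.
tic
R = nan(s2, 1);
for i2 = 1:s2
    for i1 = 1:s1
        if C(i1, i2) >= N
            R(i2) = i1;
            break;
        end
    end
end
tLoop = toc;

% Vectorized alternative: creates a temporary logical matrix X.
tic
X = C >= N;
hasHit  = any(X, 1);            % columns that contain a match
[~, Rv] = max(X, [], 1);        % index of the first true in each column
Rv = Rv(:);
Rv(~hasHit) = NaN;              % no match: NaN, like the loop version
tVec = toc;

fprintf('loop: %.6f s, vectorized: %.6f s\n', tLoop, tVec);
```

A single tic/toc pair is a rough measurement; repeating each variant many times, as done in the comments above, gives more stable numbers.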
0 Comments
  DARSHAN KUMAR BISWAS
 on 11 Jun 2022
        2 Comments
  Image Analyst
      
      
 on 11 Jun 2022
				Yeah, lots of room for improvement in that code.
- I'd split A onto separate lines for readability, breaking after the semicolons.
- To be more general and robust, I'd get the number of rows and columns of A using size, and use those sizes in the "for" lines and in the call to zeros().
- You need "hold on" after plot(), plus you need an (x,y) pair.
- No need to use a for loop and break when you can simply use find
See my Answer above to see better coding practice.