Find row in matrix, fast and with special conditions
2 views (last 30 days)
Show older comments
I have a matrix (S) and one row out of that matrix (s). I want to get the rownumber of (s) in (S).
(S) has unique rows. The row (s) is allways member of (S)-rows.
So far I calculate the row number with:
I = find(all(bsxfun(@eq,S,s),2));
I am looking for a faster method. I call this line alot, it costs me hours of calculation time.
Here's a test example:
% create example, N and n could change
n = 4; % n < 10
N = 25; % range(N) ~ [20 100]
S = nchoosek(1:N,n);
for i = 1:10
s = S(randi(size(S,1)),:);
tic
I = find(all(bsxfun(@eq,S,s),2));
toc
end
[Edit2], poorly expressed "(S) has unique rows". It should express that I is allways scalar.
[Edit], dimensional details.
My example shows the general direction of the sizes.
[d1 d2] = size(S); % its [12650x4] in my example
First dimension will be much larger then the second one. (d1 >> d2)
Dimension two (d2=n) will be smaller then 10. Numel(S) will be limited by Memory.
1 Comment
Jan
on 25 May 2013
What is the usual dimension of S? It matters if you are talking about 10x4, 10'000x20 or 20x10'000 matrices.
Accepted Answer
Jan
on 25 May 2013
Edited: Jan
on 25 May 2013
If the first dimension of S is large, the BSXFUN approach checks of a lot of numbers although they are rejected in the former columns already. A small Matlab function can avoid testing already rejected rows:
function Index = FindRow(S, s)
nCol = size(S, 2);
match = S(:, 1) == s(1);
for col = 2:nCol
match(match) = S(match, col) == s(col);
end
Index = find(match);
For Image Analyst's example under Matlab 2009a/64/Win7:
S = randi(3, 200, 3);
s = [2 3 1];
tic, for k=1:1000; m = find(all(bsxfun(@eq,S,s),2)); end; toc
tic; for k=1:1000; m = find(ismember(S, s, 'rows')); end; toc
tic; for k=1:1000; m = FindRow(S, s); end; toc
% 0.032688 sec
% 0.433527 sec
% 0.021643 sec
But for S = randi(30, 2000, 20), s=S(1000, :):
% 0.124566 sec
% 3.175940 sec
% 0.194101 sec
[EDITED] It is easier in C to stop the search after the first matching row is found. This saves 50% processing time in the average. In addition no temporary arrays are required:
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
double *S, *s;
mwSize n1, n2, i1, i2, k;
S = mxGetPr(prhs[0]);
n1 = mxGetM(prhs[0]);
n2 = mxGetN(prhs[0]);
s = mxGetPr(prhs[1]);
for (i1 = 0; i1 < n1; i1++) {
k = i1;
for (i2 = 0; i2 < n2; i2++) {
if (S[k] != s[i2]) {
break;
}
k += n1;
}
if (i2 == n2) { // Matching row found:
plhs[0] = mxCreateDoubleScalar((double) (i1 + 1));
return;
}
}
// No success:
plhs[0] = mxCreateDoubleScalar(mxGetNaN());
}
For S = randi(30, 2000, 20) I get (compiled by MSVC2008):
BSXFUN: 0.123305 sec
FindRow.m: 0.194914 sec
ISMEMBER: 3.172425 sec
FindRow.mex: 0.011098 sec (s is last row)
0.007880 sec (s is 1000th row)
0.004753 sec (s is first row)
The MEX has an average speedup of factor 15. For a [20000 x 200] matrix we get a factor of 260 already.
3 Comments
Jan
on 27 Dec 2013
Dear Jacob L,
The request you've sent me by email would have been a perfect question for this forum. Therefore I answer here, how to modify this function to work for UINT8 arrays:
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
uint8_T *M, *v;
mwSize m, n, i1, i2, k;
if (nrhs != 2 || !mxIsUint8(prhs[0]) || !mxIsUint8(prhs[1])) {
mexErrMsgIdAndTxt("MEX:FindRow:BadInput",
"2 UINT8 arrays required as input.");
}
M = (uint8_T *) mxGetData(prhs[0]);
m = mxGetM(prhs[0]);
n = mxGetN(prhs[0]);
v = (uint8_T *) mxGetData(prhs[1]);
if (mxGetNumberOfElements(prhs[1]) != n) {
mexErrMsgIdAndTxt("MEX:FindRow:BadSize",
"Sizes of inputs do not match.");
}
for (i1 = 0; i1 < m; i1++) {
k = i1;
for (i2 = 0; i2 < n; i2++) {
if (M[k] != v[i2]) {
break;
}
k += m;
}
if (i2 == n) { // Matching row found:
plhs[0] = mxCreateDoubleScalar((double) (i1 + 1));
return;
}
}
// No success:
plhs[0] = mxCreateDoubleMatrix(0, 0, mxREAL);
}
Mark
on 13 Mar 2015
Hi,
I used your your original mex file in matlab and tried to change it. I'm also searching for fast ways to find a row matrix s in a large matrix S. However the row matrix s appears a number of times in large matrix S.
if true
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[], mwSize, lastfound)
{
double *S, *s *lastfound;
mwSize n1, n2, i1, i2, k;
S = mxGetPr(prhs[0]);
n1 = mxGetM(prhs[0]);
// n2 = mxGetN(prhs[0]);
n2 = 3;
s = mxGetPr(prhs[1]);
for (i1 = lastfound; i1 < n1; i1++) {
k = i1;
for (i2 = 0; i2 < n2; i2++) {
if (S[k] != s[i2]) {
break;
}
k += n1;
}
if (i2 == n2) { // Matching row found:
plhs[0] = mxCreateDoubleScalar((double) (i1 + 1));
return;
}
}
// No success:
plhs[0] = mxCreateDoubleScalar(mxGetNaN());
}
end
So i tried to use another input 'lastfound' which would enable the code to start at lastfound + 1 (the +1 is added when the function is called), so that the previous row s is found, and the code start searching the second row s. I do this until the mex file return NaN. It however doesn't work, the code doesn't start the at the lastfound + 1 step. I am a C noob, so i think it is a very small error, but i would immensely appreciate it if somebody could take a look at it.
Thanks in advance :)
More Answers (2)
Image Analyst
on 25 May 2013
Try this:
% Generate some sample data:
m = randi(3, 200, 3)
% Specify what we're looking for:
lookingFor = [2 3 1]; % Let's say we're looking for this
% Find out what rows that pattern occurs in:
rowsItOccursIn = find(ismember(m, lookingFor, 'rows'))
5 Comments
Image Analyst
on 25 May 2013
What are the actual sizes of n, N, and S? For 4 and 25, it DEFINITELY should not take "hours of calculation time" like you said.
Teja Muppirala
on 26 May 2013
Not sure if this is faster, but might be worth a try.
d1 = 12650;
d2 = 4;
S = randn(d1,d2);
s = S(randi(d1),:);
c = 1;
r = 1;
while c <= d2
if S(r,c) == s(c)
c = c+1;
else
r = r+1;
c = 1;
end
end
isequal(S(r,:),s)
If the big S is the same every time, you could make a better algorithm by sorting it in advance.
1 Comment
See Also
Categories
Find more on Loops and Conditional Statements in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!