Indexing after knnsearch with GPU is slow

I am trying to speed up my model using a GPU. The relevant part of the code, running on the CPU, looks like this:
m = 500;
n = 500;
dx = 2;
x = 1 : n; %col
y = 1 : m; %row
[xx yy] = meshgrid(x,y);
%
A = ones(m,n);
A(1,:) = 0;
%
indwatershed = find(A==1); % watershed pixels
indchannel = find(A==0); % channel pixels
%
xchannel = xx(indchannel);
ychannel = yy(indchannel);
PC = [xchannel ychannel];
%
xwatershed = xx(indwatershed);
ywatershed = yy(indwatershed);
PW = [xwatershed ywatershed];
%
tic
[loc, mdxy] = knnsearch(PC,PW); % find the nearest channel pixel to each watershed pixel
toc
%
tic
Y1 = ychannel(loc); % indexing
toc
The time cost for knnsearch and indexing is:
Elapsed time is 0.098330 seconds.
Elapsed time is 0.001529 seconds.
Then I moved this code to the GPU as follows:
m = 500;
n = 500;
dx = 2;
x = 1 : n; %col
y = 1 : m; %row
% all the parameters are transferred to the GPU
m = gpuArray(m);
n = gpuArray(n);
dx = gpuArray(dx);
x = gpuArray(x);
y = gpuArray(y);
%
[xx yy] = meshgrid(x,y);
%
A = ones(m,n,'gpuArray');
A(1,:) = 0;
%
indwatershed = find(A==1); % watershed pixels
indchannel = find(A==0); % channel pixels
%
xchannel = xx(indchannel);
ychannel = yy(indchannel);
PC = [xchannel ychannel];
%
xwatershed = xx(indwatershed);
ywatershed = yy(indwatershed);
PW = [xwatershed ywatershed];
%
tic
[loc, mdxy] = knnsearch(PC,PW); % find the nearest channel pixel to each watershed pixel
toc
%
tic
Y1 = ychannel(loc); % indexing
toc
And the time cost for knnsearch and indexing is:
Elapsed time is 0.005452 seconds.
Elapsed time is 0.145393 seconds.
This seems to mean that knnsearch is much faster on the GPU than on the CPU, but the subsequent indexing is much slower.
Then I added a call to wait() between knnsearch and the indexing:
dev = gpuDevice; % new lines
m = 500;
n = 500;
dx = 2;
x = 1 : n; %col
y = 1 : m; %row
%
m = gpuArray(m);
n = gpuArray(n);
dx = gpuArray(dx);
x = gpuArray(x);
y = gpuArray(y);
%
[xx yy] = meshgrid(x,y);
%
A = ones(m,n,'gpuArray');
A(1,:) = 0;
%
indwatershed = find(A==1); % watershed pixels
indchannel = find(A==0); % channel pixels
%
xchannel = xx(indchannel);
ychannel = yy(indchannel);
PC = [xchannel ychannel];
%
xwatershed = xx(indwatershed);
ywatershed = yy(indwatershed);
PW = [xwatershed ywatershed];
%
tic
[loc, mdxy] = knnsearch(PC,PW); % find the nearest channel pixel to each watershed pixel
toc
%
tic
wait(dev)
toc
%
tic
Y1 = ychannel(loc); % indexing
toc
I get:
Elapsed time is 0.007852 seconds.
Elapsed time is 0.146666 seconds.
Elapsed time is 0.000470 seconds.
The wait() call took a long time, even though all the arrays and parameters are on the GPU. How can this happen? I would really appreciate it if anyone could help resolve this problem.

Accepted Answer

Joss Knight on 3 Jan 2022
wait just asks the GPU to finish executing any pending operations, in this case the call to knnsearch. Your previous timing code was invalid because you did not do this; instead, the cost of the call to knnsearch was bundled in with the indexing call, since the variable loc had to finish being computed before that line of code could be executed.
In general, use gputimeit for accurately timing GPU code.
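As a minimal sketch of both approaches, assuming a supported GPU with Parallel Computing Toolbox and Statistics and Machine Learning Toolbox: the setup mirrors the question, while the timing variable names (tKnn, tIndex, tKnnAvg) are only illustrative. Timings depend on the hardware, so no expected output is shown.

```matlab
dev = gpuDevice;                       % handle to the currently selected GPU

% Rebuild the point sets from the question on the GPU.
m = 500; n = 500;
[xx, yy] = meshgrid(gpuArray(1:n), gpuArray(1:m));
A = ones(m, n, 'gpuArray');
A(1,:) = 0;                            % first row holds the channel pixels
PC = [xx(A==0) yy(A==0)];              % channel pixel coordinates
PW = [xx(A==1) yy(A==1)];              % watershed pixel coordinates
ychannel = yy(A==0);

% Option 1: tic/toc with an explicit wait before each toc.
wait(dev);                             % make sure nothing is pending before timing starts
tic
loc = knnsearch(PC, PW);
wait(dev);                             % block until knnsearch has really finished
tKnn = toc;

tic
Y1 = ychannel(loc);
wait(dev);                             % block until the indexing has finished
tIndex = toc;

% Option 2: gputimeit synchronizes the device and repeats the call internally.
tKnnAvg = gputimeit(@() knnsearch(PC, PW));

fprintf('knnsearch: %.4f s (tic/toc), %.4f s (gputimeit); indexing: %.4f s\n', ...
    tKnn, tKnnAvg, tIndex);
```

With the wait calls in place, the cost of knnsearch should show up in tKnn rather than leaking into tIndex.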
  2 Comments
xu fan on 4 Jan 2022
Hello Joss. Thank you for your answer. I have tested gputimeit:
f = @() knnsearch(PC,PW);
t = gputimeit(f)
and it returns 0.1260s.
I have also checked the explanation of wait(gpudev) in the manual:
"wait(gpudev) blocks execution in MATLAB® until the GPU device identified by the GPUDevice object gpudev completes its calculations. This can be used before calls to toc when timing GPU code that does not gather results back to the workspace. When gathering results from a GPU, MATLAB automatically waits until all GPU calculations are complete, so you do not need to explicitly call wait in that situation."
I don't quite understand how GPU calculation works. When I run the code with a breakpoint at the wait() line, I can see the result "loc" (i.e., the output of knnsearch) in the workspace. But when I subsequently use "loc" for indexing, the program seems to freeze for about 0.1 s, which I think is the wait() time. What does this extra waiting time represent? Is it spent on communication between the GPU and the local workspace, and can it be avoided? Many thanks!
Joss Knight on 4 Jan 2022
GPU operations run asynchronously where possible. This means that the computation runs in the background while MATLAB continues to process the next line of code. If you attempt to display the contents of an array that is pending evaluation (in the debugger or the workspace for instance) then MATLAB will automatically complete execution in order to provide that data - in other words, it will call wait for you. In general this behaviour should be entirely hidden from the user - it's an internal optimisation. However, it does cause potential confusion when you attempt to time your code.
This page of the documentation gives some tips on how to time your code correctly.
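To make the asynchronous behaviour concrete, here is a small sketch that separates the time taken to launch an operation from the time spent waiting for it. It assumes a supported GPU and Parallel Computing Toolbox; the matrix size and variable names are hypothetical and not taken from this thread. How the cost splits between the two measurements depends on the hardware and on whether a given operation synchronizes internally, so no specific output is claimed.

```matlab
dev = gpuDevice;
X = rand(2000, 'gpuArray');            % illustrative size, not from the thread
wait(dev);                             % start from an idle GPU

tic
Y = X * X;                             % may return before the GPU has finished
tLaunch = toc;                         % can under-report the real cost

tic
wait(dev);                             % any remaining work is paid for here
tSync = toc;

% gather copies the result to host memory and waits for it implicitly,
% so timing through gather needs no explicit wait.
tic
Z = X * X;
Zhost = gather(Z);
tWithGather = toc;

fprintf('launch: %.4f s, wait: %.4f s, multiply + gather: %.4f s\n', ...
    tLaunch, tSync, tWithGather);
```

If tSync comes out much larger than tLaunch, that is the same effect seen in the question: the work was queued by one line and paid for by a later one.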

Release

R2021b
