Same gpu operation in loop but two speeds
Show older comments
I am running the code below on Windows 10 64 bits with an intel i5-8300H and a Nvidia GTX 1060:
dev = gpuDevice;
time = zeros(1, length(t)); % preallocation of time
time2 = zeros(1,length(t)); % preallocation of time2
for i = 2:length(t)-1 %looping over time vector
wait(dev); tic;
% Computation of dot product between A(:, i:-1:1) and C(:,1:i,1) + C(:,1:i,2) for each row
approx_conv_pair = sum(reshape(A(txN-i*N+1:txN-N).*(C(1:(i-1)*N) + C(txN+1:txN+(i-1)*N)),[N,i-1]),2);
wait(dev); time(i)=toc;
wait(dev); tic;
% Computation of dot product between B(:, i:-1:1) and C(:,1:i,1) - C(:,1:i,2) for each row
approx_conv_impair = sum(reshape(B(txN-i*N+1:txN-N).*(C(1:(i-1)*N) - C(txN+1:txN+(i-1)*N)),[N,i-1]),2);
wait(dev); time2(i)=toc;
end
A, B and C are gpuArrays of size, respectively, (N, length(t)) (N, length(t)) and (N, length(t), 2) .
My indexing simply accesses A(:, i:-1:2), B(:, i:-1:2), C(:, 1:i-1, 1) and C(:, 1:i-1, 2) as column vectors.
Here's what I obtain when I plot time and time2 with respect to looping iterator i when N=252 and length(t) = 13200:

Does anybody knows why there's such a difference between execution times?
Is it due to my way of coding or something linked to overhead time on the GPU?
FYI I tried to invert the order of approx_conv_pair and approx_conv_impair and I observe the same problem (second operation almost twice as fast).
Thank you in advance!
Accepted Answer
More Answers (0)
Categories
Find more on Matrix Indexing in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!