I am trying to compute the ifft2 of multiple matrices in one call. The simplest code snippet is:
gAs = gpuArray.rand(999, 519, 20);
gBs = gpuArray.rand(999, 519);
ifft2(gAs .* gBs, "symmetric");
Error using gpuArray/ifft2
An invalid array was used on the GPU.
I thought that I was using up all the GPU memory. I tried using single-precision GPU arrays, but that did not help. However, I then tried the following code (bigger matrices) and it worked just fine.
gAs = gpuArray.rand(1000, 519, 2);
gBs = gpuArray.rand(1000, 519);
ifft2(gAs .* gBs, "symmetric");
I know that I can also loop over the slices of gAs, and that works, but I want to get some speedup by doing it in one call to ifft2.
I would like to understand why this is happening and whether there is a way to pad the matrices so that I can still get the ifft2 of the original matrices.
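For readers who want the per-slice fallback mentioned above, here is a minimal sketch. It assumes a supported GPU is available; the variable names `out` and `k` are only illustrative and are not part of the original post.

```matlab
% Per-slice fallback: one ifft2 call per page of gAs.
gAs = gpuArray.rand(999, 519, 20);
gBs = gpuArray.rand(999, 519);

% Preallocate the result on the GPU with the same class as gAs.
out = zeros(size(gAs), "like", gAs);

for k = 1:size(gAs, 3)
    % Each page is an ordinary 2-D array, so this is a plain 2-D call.
    out(:, :, k) = ifft2(gAs(:, :, k) .* gBs, "symmetric");
end
```

This trades the single batched call for 20 smaller ones, so it is likely slower than a working batched call would be.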
For reference:
>> gpuDevice()
ans =
CUDADevice with properties:
Name: 'Tesla V100-SXM2-32GB'
Index: 1
ComputeCapability: '7.0'
SupportsDouble: 1
DriverVersion: 11.2000
ToolkitVersion: 11
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 3.4090e+10
AvailableMemory: 3.3167e+10
MultiprocessorCount: 80
ClockRateKHz: 1530000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 0
CanMapHostMemory: 1
DeviceSupported: 1
DeviceAvailable: 1
DeviceSelected: 1

3 Comments

gAs .* gBs looks to me to be highly unlikely to be symmetric.
Right, thank you. Shouldn't this code throw the same error, since the matrices are not symmetric? (In my case it runs fine.)
gAs = gpuArray.rand(999, 519);
gBs = gpuArray.rand(999, 519);
ifft2(gAs .* gBs, "symmetric");
Thanks again
Sorry, I would have to boot into a different operating system to test (GPU computing is not supported on my macOS system).


 Accepted Answer

Matt J on 4 Jan 2022
Edited: Matt J on 4 Jan 2022

1 vote

I think you should probably just omit the 'symmetric' flag. On the GPU (mine at least), it doesn't seem to make a big difference in performance:
A = gpuArray.rand(512,512,512);
gputimeit(@() ifft2(A,'symmetric') ) % 0.0706 seconds
gputimeit(@() ifft2(A) ) % 0.0753 seconds
Whether this is an indication of sub-optimal software design on MathWorks' part, I'm not sure. On the CPU, the 'symmetric' flag means the software does fewer flops, but on a parallel system like the GPU, the number of flops is not what matters.
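As a rough sketch of what dropping the flag looks like in practice: the example below builds an input that is conjugate symmetric by construction (the `fft2` of real data), calls `ifft2` without the flag, and discards the round-off-level imaginary part with `real`. The names `x`, `X`, `y`, and `relErr` are illustrative assumptions, and the code needs a supported GPU. If the input is not actually conjugate symmetric, `real(ifft2(X))` and `ifft2(X,'symmetric')` need not agree, so this is only a substitute when the symmetry genuinely holds.

```matlab
% Real test data on the GPU, same size as the failing case in the question.
x = gpuArray.rand(999, 519, 20);

% fft2 of real data is conjugate symmetric page by page.
X = fft2(x);

% Inverse transform without the 'symmetric' flag, then drop the
% residual imaginary part left by floating-point round-off.
y = real(ifft2(X));

% Relative round-trip error, brought back to the CPU as a plain scalar.
relErr = gather(max(abs(y(:) - x(:))) / max(abs(x(:))));
```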

More Answers (1)

Matt J on 3 Jan 2022
Edited: Matt J on 3 Jan 2022
I think it's a bug, but one solution might be:
fn=@(z,d) ifft(z,[],d,'symmetric');
out = fn( fn(gAs .* gBs,1) ,2);

2 Comments

Thanks for the answer. The code you provided is correct.
I have noticed, though, that there is very often a discrepancy between the results of the function handle fn and ifft2, even for two-dimensional matrices, when their sizes are greater than ~4. I created the following code snippet. If it is run multiple times, it sometimes displays "not equal".
clear all;
close all;
fn=@(z,d) ifft(z, [], d, "symmetric");
m = 5;
n = 4;
a = gpuArray.rand(m, n);
b = gpuArray.rand(m, n);
c = ifft2(a .* b, "symmetric");
d = fn(fn(a .* b, 1), 2);
% Flag a mismatch if any element differs by more than eps at that magnitude
if any(abs(c - d) > eps(max(abs(c), abs(d))), "all")
disp("not equal")
end
IIUC, are you suggesting that there is a bug in ifft2 when the symmetric flag is provided?
Matt J on 4 Jan 2022
Edited: Matt J on 4 Jan 2022
It seems I made a conceptual error. ifft(ifft(X,[],1,'symmetric'),[],2,'symmetric') is not a valid replacement for ifft2(X,'symmetric') unless X is symmetric about both the x and y axes.
However, it does seem like a bug that only certain array sizes work for gpuArray.ifft2(). The CPU version of ifft2() doesn't have that problem.
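To make the point about the separable version concrete, here is a small CPU sketch (the same calls also accept gpuArray inputs). It assumes the input really is conjugate symmetric in the 2-D sense, which is enforced here by taking the `fft2` of real data. Under that assumption, a full complex `ifft` along one dimension followed by a 'symmetric' `ifft` along the other reproduces `ifft2(X,'symmetric')` up to round-off, whereas applying 'symmetric' at both stages in general does not. The names `sep`, `both`, `errSep`, and `errBoth` are illustrative only.

```matlab
rng(0);                          % reproducible test data
x = rand(5, 4);                  % real 2-D array
X = fft2(x);                     % conjugate symmetric in the 2-D sense

% Reference result from the built-in 2-D call.
ref = ifft2(X, "symmetric");

% Complex inverse along dim 2 first. The columns of the intermediate
% result are then conjugate symmetric along dim 1, so the flag is valid there.
sep = ifft(ifft(X, [], 2), [], 1, "symmetric");

% 'symmetric' applied at both stages, as in the earlier suggestion.
fn = @(z, d) ifft(z, [], d, "symmetric");
both = fn(fn(X, 1), 2);

% Largest absolute deviation of each variant from the reference.
errSep  = max(abs(sep(:)  - ref(:)));
errBoth = max(abs(both(:) - ref(:)));
fprintf("complex-then-symmetric: %g, symmetric twice: %g\n", errSep, errBoth);
```

Whether this two-step form avoids the size-dependent gpuArray error is not something verified here; it is only meant to show which separable form is mathematically equivalent.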


Release: R2021b
Asked: 3 Jan 2022
Edited: 4 Jan 2022
