Is anyone able to run GPU-based conv2 on a GTX 1080 or other Pascal hardware?

I repeatedly get unspecified CUDA launch errors, despite the function calculating correctly on older Maxwell hardware using the same datasets. Can anyone else reproduce this?
Thanks, Nick

12 Comments

For example:
A = gpuArray.ones(100, 'single');
B = gpuArray.ones(50, 'single');
A(1:10,1:10) = conv2(A(1:10,1:10), B, 'same');
Error using gpuArray/gather
An unexpected error occurred during CUDA execution. The CUDA error was:
CUDA_ERROR_LAUNCH_FAILED
Error in dispInternal>iTransferPortionDense (line 36)
data = gather( subsref( x, s ) );
Error in parallel.internal.shared.buildDisplayHelper>iFirstNNumericDisplayHelper (line 72)
maybeTruncatedValue = transferDenseFcn( x, rangeStruct );
Error in parallel.internal.shared.buildDisplayHelper>iBuildDisplayHelper (line 33)
dh = iFirstNNumericDisplayHelper( ...
Error in parallel.internal.shared.buildDisplayHelper (line 24)
dh = iBuildDisplayHelper( x, transferDenseFcn, transferSparseFcn, xClassName, xName, N );
Error in dispInternal (line 13)
dh = parallel.internal.shared.buildDisplayHelper( ...
Error in gpuArray/display (line 21)
dh = dispInternal( obj, thisClassName, objName );
This works for me on the GTX 1080.
Your error appears to be in display, not in the convolution. Did you definitely enter the command with a semicolon at the end to suppress display? What happens when you call
Ac = gather(A);
? And what version of MATLAB are you using?
Hi Joss, thanks for your response.
I see this behaviour on R2015b, R2016a, and R2016b.
The error is not always reproducible. After a fresh MATLAB restart, it's often possible to run a few convolutions like the above without error. However, once I start running my program, it crashes.
I'm not sure it's the display that's causing it, per se, but rather that the kernel is only invoked once the data it calculates is requested. I can also generate the error by using the kernel result in another calculation.
I also want to point out that this is the same error I see in MATLAB when invoking a self-written kernel that has an indexing error. But if that were the case, I can't see why the same program would run fine on Maxwell and error on Pascal.
Anyway, I'll keep trying to find a code snippet that guarantees an error and post it here. In the meantime I've rewritten the conv2 as an FFT and worked around the issue :)
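For reference, the workaround is roughly the following (a sketch, not my exact code; conv2fft is just an illustrative name). It zero-pads both arrays to the full linear-convolution size, multiplies in the frequency domain, and crops the central 'same'-sized region; fft2/ifft2 accept gpuArray inputs directly, so the whole thing stays on the GPU:

```matlab
% Sketch of an FFT-based replacement for conv2(A, B, 'same').
% Works for real-valued A and B; fft2/ifft2 run on the GPU for gpuArray inputs.
function C = conv2fft(A, B)
    [ma, na] = size(A);
    [mb, nb] = size(B);
    m = ma + mb - 1;                  % full linear-convolution size;
    n = na + nb - 1;                  % padding to this size avoids circular wrap-around
    Cfull = real(ifft2(fft2(A, m, n) .* fft2(B, m, n)));
    r0 = floor(mb/2);                 % offset of the central 'same' region,
    c0 = floor(nb/2);                 % matching conv2's cropping convention
    C = Cfull(r0+1 : r0+ma, c0+1 : c0+na);
end
```

For single-precision inputs the result agrees with conv2 to within floating-point round-off.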
Are you using your GPU for display? Are these kernels long-running? It's possible Windows is timing-out your display. You might need to disable Timeout Detection and Recovery: https://msdn.microsoft.com/en-us/library/windows/hardware/ff570087(v=vs.85).aspx
I don't know what you mean by kernels only being invoked when the data is requested. This kind of lazy evaluation doesn't happen with conv. However, not all runtime errors can be picked up by MATLAB and may be reported as launch failures on the next line of GPU code.
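For what it's worth, TDR can be relaxed rather than disabled outright. A minimal sketch, assuming the standard WDDM registry location documented in that link (the value is in seconds, 0x3c = 60, and a reboot is required):

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers]
; TdrDelay: seconds a GPU kernel may run before Windows resets the driver
"TdrDelay"=dword:0000003c
```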
Apologies, you may be right about the execution for conv, I was thinking about how my other parallel.gpu.CUDAKernels behave. However, it's definitely not the WDDM timeout. This is set to more than 20s on my machine, and the Maxwell card takes << 1s to calculate these.
Here, this code snippet produces "CUDA_ERROR_ILLEGAL_ADDRESS" during gather every time, with a GTX 1080 selected as the current GPU. Configuration is:
R2016b (Student), Intel 6950, 32 GB RAM, 2x GTX 1080, 1x (Maxwell) TITAN X, Windows 10
A = gpuArray.ones(100, 'single');
B = gpuArray.ones(10, 'single');
for c = 1:100
    C = conv2(A, B, 'same')
    C(1)
end
I ran this a number of times on my GTX 1080 with no errors. It could be a Windows 10 display issue. A couple of questions:
  1. Are you definitely running on a compute card that isn't driving the display? You seem to be saying you're running on your Titan X, which, being lower-performing than the GTX 1080, I'm guessing is attached to the display? What is the output of gpuDevice?
  2. Does line 5 have to have no semicolon at the end? What if you put a semicolon at the end to suppress display? If it doesn't error any more, what happens if you put C = gather(C); after the convolution?
1) Yes. One 1080 runs the display, but it fails on both. gpuDevice returns normally for all cards.
2) It fails with or without suppressing the output.
Still, it's great to know that in principle there are platforms that don't have this issue. I am going to reinstall my graphics drivers and CUDA toolkit and then try to minimize the number of background processes and services to see if that helps.
Update: Re-installed CUDA Toolkit and Driver, shut down all non-essential background programs and still having problems.
The CUDA toolkit only affects your own MEX functions; it has nothing to do with MATLAB's own kernels or kernels you run using the CUDAKernel class. The driver it comes with could be an issue, though - you may fare better installing the latest driver for your device straight from NVIDIA's driver downloads page. I, for instance, am running 367.44 on my GTX 1080s (but that's under Linux; the version number will be different for Windows).
What I mean by the output of gpuDevice is that it would help to see what MATLAB displays when you call gpuDevice.
Can you confirm again that the above code is the first thing you run after starting MATLAB, and that you haven't run any of your own CUDA kernels?
Name: 'GeForce GTX 1080'
Index: 1
ComputeCapability: '6.1'
SupportsDouble: 1
DriverVersion: 8
ToolkitVersion: 7.5000
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 8.5899e+09
AvailableMemory: 7.0585e+09
MultiprocessorCount: 20
ClockRateKHz: 1733500
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
Yes. Above code is first thing run.
Well, there is no obvious problem there, except that KernelExecutionTimeout is 1. But that is also true on my machine, and I have no issues.
I'm going to get someone with Windows and a GTX 1080 to test your code, and then I may have to move you over to tech support. Meanwhile you should try some different legacy drivers listed at http://www.nvidia.com/Download/Find.aspx?lang=en-us.
Your issue reproduces on a GTX 1080 on Windows. Thanks for reporting. It will take a little while to investigate this.
No problem; like I said above, the ifft(fft*fft) approach works and the performance hit isn't that severe for my application. I'm mostly glad I can stop pulling my hair out trying to figure out what I've misconfigured. Thanks for being so responsive.


Answers (0)


Asked: 29 Sep 2016
Commented: 6 Oct 2016
