I run the following code:
T = randn(10000,64);
data = randn(1000,64,10);
Tg = gpuArray(T);
datag = gpuArray(data);
res = zeros(10000,1000);
resg = gpuArray(res);
% accumulate on the CPU
for i = 1:10
    res = res + T*data(:,:,i)';
end
% accumulate on the GPU
for i = 1:10
    resg = resg + Tg*datag(:,:,i)';
end
resg = gather(resg);
norm(res-resg,'fro')/norm(res,'fro')
I would expect "res" (computed on the CPU) and "resg" (computed on the GPU) to be identical, but they are not.
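To get a sense of scale, here is a CPU-only check that can be run on top of the script above (res2 is just a scratch variable introduced for this test): floating-point addition is not associative, so accumulating the same products in the opposite order will typically give a small nonzero relative difference, which the GPU discrepancy can be compared against.
% CPU-only sanity check: sum the same products in reverse order
res2 = zeros(10000,1000);
for i = 10:-1:1
    res2 = res2 + T*data(:,:,i)';
end
norm(res-res2,'fro')/norm(res,'fro')   % difference caused by reordering alone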
I am running this on a Tesla card:
gpuDevice
ans =
parallel.gpu.CUDADevice handle
Package: parallel.gpu
Properties:
Name: 'Tesla C1060'
Index: 1
ComputeCapability: '1.3'
SupportsDouble: 1
DriverVersion: 3.2000
MaxThreadsPerBlock: 512
MaxShmemPerBlock: 16384
MaxThreadBlockSize: [512 512 64]
MaxGridSize: [65535 65535]
SIMDWidth: 32
TotalMemory: 4.2948e+09
FreeMemory: 4.0671e+09
MultiprocessorCount: 30
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 0
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
Methods, Events, Superclasses
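(For completeness, the card supports double precision and the gpuArrays really hold doubles; a quick check using the same properties shown above, assuming classUnderlying is available in this release:)
d = gpuDevice;            % the currently selected device
d.SupportsDouble          % 1, so double-precision arithmetic runs on the card
classUnderlying(Tg)       % 'double', i.e. the gpuArray was not demoted to single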