Efficiently run a matrix- or vector-valued function element-wise on the GPU?

I have a function like
function [a b] = fun(x)
that takes in a scalar x and outputs two Tx1 vectors, a and b.
I'd like to call this function on every element of a large MxN matrix X, and get two TxMxN multi-dimensional arrays A and B, such that [A(:,m,n), B(:,m,n)] = fun(X(m,n)), for m=1,...,M and n=1,...,N.
arrayfun won't work here directly, since fun is not scalar-valued (and 'UniformOutput', false is not supported for gpuArray inputs).
What are my options? I can also rewrite fun to return a single vector [a;b] or a Tx2 matrix [a b] instead, then rejigger the results myself, if that makes the problem any easier. Anyway the basic problem here is achieving something like arrayfun on GPU for a matrix-valued function (that takes scalar inputs). If there's no MATLAB functionality for this, I'd appreciate advice on how to manually code this in CUDA.
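For clarity, here is the plain CPU loop that expresses the semantics I want (illustrative only; it assumes T is known and X is in memory):

```matlab
% Reference semantics: call fun on every element of X and stack
% the two T-by-1 outputs into T-by-M-by-N arrays A and B.
[M, N] = size(X);
A = zeros(T, M, N);
B = zeros(T, M, N);
for n = 1:N
    for m = 1:M
        [A(:, m, n), B(:, m, n)] = fun(X(m, n));
    end
end
```

The question is how to get this effect efficiently on the GPU.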

Answers (1)

It's difficult to say what the best solution is without seeing what fun does. Often you can address this kind of problem with a series of calls to pagefun. If T is a manageable size, you can write out the vector operations as scalar arithmetic and create a series of T outputs, each of size M-by-N. For instance:
function [a1, a2, a3] = polyvals(x)
% Every output is a scalar, so gpuArray/arrayfun can evaluate this
% element-wise (here T = 3 outputs per input element).
a1 = x;
a2 = x * x;
a3 = a2 * x;
end

[A1, A2, A3] = arrayfun(@polyvals, X);  % each Ak is M-by-N
A = cat(3, A1, A2, A3);                 % M-by-N-by-3
Then you could permute the result if you really wanted it to be 3-by-M-by-N rather than M-by-N-by-3. However, this may not be feasible for your problem.
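That reordering is a one-liner with permute, e.g.:

```matlab
% Move the "T" dimension to the front: M-by-N-by-3 -> 3-by-M-by-N
A = permute(cat(3, A1, A2, A3), [3 1 2]);
```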

4 Comments

Thanks a lot for your answer!
I simplified my problem a bit in the original question; the function actually takes T as a parameter, like
function [a, b] = fun(T, x, y)
so the length of a and b is not known in advance. I understand that to use arrayfun, I essentially need to spell out all the return values, unpacking the elements of a and b, something like:
function [a1, a2, ..., aT, b1, b2, ..., bT] = fun(T, x, y)
then collect all the 2T results.
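For a fixed T, that unpacking could look like the following sketch (fun3 is a hypothetical wrapper for T = 3; note that on the GPU, fun's own body would additionally have to be rewritten to use only scalar operations, since gpuArray/arrayfun does not allow creating or indexing vectors):

```matlab
% Hypothetical wrapper: every element of a and b becomes its own
% scalar output so arrayfun can collect them as separate arrays.
function [a1, a2, a3, b1, b2, b3] = fun3(x, y)
    [a, b] = fun(3, x, y);          % original function, T fixed at 3
    a1 = a(1); a2 = a(2); a3 = a(3);
    b1 = b(1); b2 = b(2); b3 = b(3);
end

[A1, A2, A3, B1, B2, B3] = arrayfun(@fun3, X, Y);
A = cat(3, A1, A2, A3);             % then permute to T-by-M-by-N if desired
```

This works with CPU arrayfun as written; the GPU restrictions are what make it painful in practice.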
I tried declaring varargout as the output (filling it manually with the elements of [a;b]); this works with arrayfun on the CPU, but on the GPU I get
Error using gpuArray/arrayfun
Variable number of outputs is not supported.
The function I'm actually using can be found here: gaussjac. It computes Gauss-Jacobi quadrature nodes and weights (what I called a and b), given the number of quadrature points (what I called T) and two scalar parameters (what I called x and y). I'd like to compute the same number of quadrature nodes and weights for many parameter pairs in parallel, along the lines of [A, B] = arrayfun(@gaussjac, T*ones(size(X)), X, Y).
It doesn't look like what you are doing is conducive to GPU arrayfun. GPU arrayfun compiles in the sizes of the inputs and outputs, so it can't adapt flexibly at runtime. You'd need to pick a T and write a version for each T. Even then you might find that gaussjac contains too many unsupported constructs: I can see it does indexing, which is only possible if you are indexing into an up-level variable, and you cannot do indexed assignment in arrayfun at all.
You might be better off going a different route here, trying your best to vectorize the code so that it is operating on as many elements at once as possible.
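One common vectorization pattern for this shape of problem (a hypothetical sketch, with a placeholder recurrence standing in for gaussjac's real one): keep a loop over the T steps, but let each step operate element-wise on the full M-by-N grid of parameters at once.

```matlab
% Hypothetical sketch: iterate T recurrence steps, updating every
% (x, y) parameter pair simultaneously with elementwise gpuArray ops.
T = 5;
X = rand(100, 200, 'gpuArray');        % parameter grids (example sizes)
Y = rand(100, 200, 'gpuArray');
A = zeros(T, numel(X), 'gpuArray');
a = X + Y;                             % placeholder first recurrence value
A(1, :) = a(:).';
for t = 2:T
    a = a .* X + Y;                    % placeholder recurrence, elementwise
    A(t, :) = a(:).';
end
A = reshape(A, T, size(X, 1), size(X, 2));   % T-by-M-by-N
```

This way the loop length is T (small), while each iteration keeps the GPU busy with M*N elementwise operations.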
Thanks again for your insight Joss!
Now I better understand the restrictions when using gpuArray with arrayfun. Since you likely have more experience with CUDA than I do, do you think writing a custom CUDA kernel and calling it from MATLAB would work? It'll be a pain to write a different function for each T (it'd be great if MATLAB had something like C++ templates and could generate/compile a different gaussjac for each T).
Otherwise, as you suggested, I probably should look into vectorization or a different algorithm.
Anything you can do in MATLAB, you can do in C++. The question, of course, is: can you do it easily? On the face of it, your gaussjac function doesn't look too hard to convert to C++, because it's mostly loops and scalar arithmetic, but you'd have to give it a try to assess that for sure. And only you know your comfort level with CUDA C++.
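For reference, the MATLAB side of calling a hand-written kernel is fairly light via parallel.gpu.CUDAKernel. Suppose a kernel has been compiled with nvcc -ptx from a hypothetical gaussjac_kernel.cu whose entry point is __global__ void gaussjac(double *A, double *B, const double *x, const double *y, int T, int n) (file name, kernel name, and signature are all illustrative assumptions, not the actual gaussjac):

```matlab
% Load the compiled kernel (the .cu file supplies the prototype).
k = parallel.gpu.CUDAKernel('gaussjac_kernel.ptx', 'gaussjac_kernel.cu');

MN = numel(X);                       % one thread per (x, y) parameter pair
k.ThreadBlockSize = 256;
k.GridSize = ceil(MN / 256);

A = zeros(T, MN, 'gpuArray');        % preallocated outputs
B = zeros(T, MN, 'gpuArray');
[A, B] = feval(k, A, B, X(:), Y(:), T, MN);   % outputs = non-const pointer args

A = reshape(A, T, size(X, 1), size(X, 2));    % T-by-M-by-N (likewise for B)
```

Each thread would run the full T-length recurrence for its own parameter pair, so T can stay a runtime argument instead of being compiled in.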


Asked: 7 Nov 2018
Commented: 8 Nov 2018
