Converting a for loop to GPU-friendly code / cross-validation with GPU

Hi, does anyone have an idea how to convert the following for loop to GPU-friendly code? I have tried to do so with the arrayfun, crossval and fitrlinear functions, but to no avail. I have converted the objects to gpuArrays, but I'm not sure how to carry out cross-validation using the GPU, specifically without invoking the for loop. Thanks.
function [output] = cross_val(featMatGPU,labelMatGPU,foldGrouping)
G1 = unique(foldGrouping(:));
G = gpuArray(G1);
estimate = gpuArray([]);
pooledTargets = gpuArray([]);
testingOrder = gpuArray([]);
for i = 1:length(G)
    fprintf('\t\tFold %d of %d\n',i,length(G));
    idx1 = find(foldGrouping~=G(i)); % Training set
    idx2 = find(foldGrouping==G(i)); % Test set
    testingOrder = [testingOrder, idx2(:)];
    % train
    w = featMatGPU(idx1,:) \ labelMatGPU(idx1);
    % test
    estimateTemp = featMatGPU(idx2,:)*w;
    pooledTargets = [pooledTargets, labelMatGPU(idx2)];
    estimate = [estimate, estimateTemp(:)];
end
% Collect the accumulated results (output was never assigned before)
output.estimate = estimate;
output.pooledTargets = pooledTargets;
output.testingOrder = testingOrder;
end
Or rather, the question can be simplified to: how do you perform leave-one-out cross-validation for the code below in a GPU-friendly manner?
featMatGPU = gpuArray(featMat);
labelMatGPU = gpuArray(labelMat);
function [results] = test_script_crossval(x,y)
    results = x \ y(:,1);
end
results = test_script_crossval(featMatGPU,labelMatGPU);

10 Comments

Your loop looks extremely vectorizable. For a start you should be able to get your train and test indices out of unique, and use pagefun to do your mldivide operation with multiple system matrices. But it's hard to tell exactly what to do because I don't know what the input data is. What are the sizes and datatypes of the input arguments? Is foldGrouping a vector or a matrix?
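For context, here is one hedged sketch of what "get your train and test indices out of unique" could look like (the variable `i` and the mask names are illustrative, not part of the original post): the third output of unique maps every sample to its fold number, so logical masks fall out without find.

```matlab
% Third output of unique gives, for each element of foldGrouping,
% the index of its fold in G -- no find() needed per fold.
[G, ~, foldIdx] = unique(foldGrouping(:));   % foldIdx is 260253x1
testMask  = (foldIdx == i);   % rows belonging to fold i
trainMask = ~testMask;        % all other rows form the training set
```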
Hi Joss,
Thanks for the reply. How do I get the train/test indices out of unique? foldGrouping is a 1x260253 vector; each 119 elements form one fold, so there are 2187 folds in total. featMatGPU and labelMatGPU are both gpuArrays with 260253 rows each; labelMatGPU is a single column and featMatGPU has 101 columns. Every sequential 119 rows correspond to one fold.
The question I am facing is: how should I vectorise the cross-validation in terms of the train and test indices, and create pages for my mldivide arrays? An attempt to phrase the answer in code would really help me visualise what I need to do. Thanks!
Regards,
Joel
I have reduced the code to this, by the way:
function [output] = cross_val(training,targets,foldGrouping)
G1 = unique(foldGrouping(:));
G = gpuArray(G1);
for i = 1:length(G)
    fprintf('\t\tFold %d of %d\n',i,length(G));
    idx1 = foldGrouping~=G(i); % Training set
    idx2 = foldGrouping==G(i); % Test set
    C = pagefun(@mldivide,training(idx1,:),targets(idx1));
end
output.C = C;
end
Do note that training, targets and foldGrouping are all GPU arrays.
Indexing a gpuArray is not efficient. It would be more efficient to test G1(i) than G(i).
Your new code is even more confusing. Now you're indexing training in pagefun, whereas the whole point of pagefun is to operate on multiple matrices at once.
So, please confirm: training and target are 260252x260252 matrices? And foldGrouping is a 119x2187 matrix? Or is it a 1x260253 vector as you said?
I see. How should I use pagefun to operate on the 2187 folds then, without using the index in G? training is a 260253x11 matrix: each of the 11 columns represents a regression variable, and the 260253 rows are the training examples, which make up 2187 cross-validation folds of 119 examples each. targets is a 260253x1 column vector storing the regression target for each training example. foldGrouping is a 1x260253 row vector storing the fold number of each training example. I just checked, so this should be an accurate representation of my matrices. Thanks!
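Given the layout just described (every consecutive 119 rows form one fold), one hedged sketch of exposing the folds as pages for pagefun would be a reshape/permute, with no indexing loop at all. Note this only produces the per-fold test pages; for leave-one-fold-out training, each page would still need the complementary rows, which this alone does not provide.

```matlab
% Sketch assuming the stated sizes: 2187 folds x 119 rows, 11 columns.
foldSize = 119; numFolds = 2187; numVars = 11;
% training: 260253x11 -> 119x2187x11 -> 119x11x2187 (one fold per page)
trainPages  = permute(reshape(training, foldSize, numFolds, numVars), [1 3 2]);
% targets: 260253x1 -> 119x1x2187 (matching page per fold)
targetPages = reshape(targets, foldSize, 1, numFolds);
```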
idx1 = foldGrouping~=G(i); % Training set
idx2 = foldGrouping==G(i); % Test set
C = pagefun(@mldivide,training(idx1,:),targets(idx1));
You do not use idx2 after you build it. Should that third line be targets(idx2) ?
Hi Walter,
Thanks for the observation, but I am not sure how to fit the test set into pagefun(@mldivide). Any ideas?
I do not think it should be involved in this step. If I were to involve it, I would take the results of the mldivide and multiply them by training(idx2,:) and compare against targets(idx2) in order to determine how well the training did.
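A hedged, single-fold sketch of that evaluation step (idx1/idx2 are the logical masks from the thread; `relErr` is an illustrative name, and a batched version would use pagefun(@mtimes,...) with one page per fold):

```matlab
% Train on everything outside the fold, then score on the held-out rows.
w = training(idx1,:) \ targets(idx1);   % per-fold regression weights
estimate = training(idx2,:) * w;        % predictions on the test fold
% Relative error of the fold's predictions against its targets
relErr = norm(estimate - targets(idx2)) / norm(targets(idx2));
```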
So, it looks like each matrix sent to mldivide is the same rows of training but with a different row removed each time, and similarly for targets.
[~, IG] = unique(foldGrouping);
A_all = training(IG,:);
B_all = targets(IG);
% Create a version of A and B replicated along dim 3, but
% with a different row removed from each page
numSolves = numel(IG);
numCols = size(A_all,2);
A = repmat(A_all,1,1,numSolves);
B = repmat(B_all,1,1,numSolves);
selectorB = reshape(~eye(numSolves),numSolves,1,[]);
selectorA = repmat(selectorB,1,numCols);
A = reshape(A(selectorA),numSolves-1,numCols,[]);
B = reshape(B(selectorB),numSolves-1,1,[]);
% Solve the multiple systems
C = pagefun(@mldivide,A,B);
However, the problem is that pagefun mldivide doesn't support rectangular matrices yet. You can get round this for now by solving the normal equations (a pseudo-inverse solution). For a well-conditioned problem this shouldn't be too bad.
function X = pagefunMldivide(A,B)
% Least-squares solve of AX = B in batch for over-determined (tall)
% pages, via the normal equations:
% X = (A'A) \ (A'B)
At = pagefun(@transpose,A);
AtA = pagefun(@mtimes,At,A);
AtB = pagefun(@mtimes,At,B);
X = pagefun(@mldivide,AtA,AtB);
end
I haven't checked any of this so no doubt there are bugs.


Answers (0)

Asked: on 3 Nov 2019
Edited: on 6 Nov 2019
