Optimizing several Gaussian Process models in parallel

Dear community,
I have 80 datasets with 5000 data points each, and I want to fit a GP to each of them. Is there a good way to parallelize this (HPC user here)?
My rough first attempt looks like this (it will run inside a batch script):
% p is an existing parallel pool, e.g. p = gcp();
f(1:80) = parallel.FevalFuture; % preallocate the futures
for i = 1:80
    f(i) = parfeval(p, @fitrgp, 1, X{i}, Y{i}, ...
        'OptimizeHyperparameters', 'all', ...
        'HyperparameterOptimizationOptions', ...
        struct(...
            'MaxObjectiveEvaluations', 500, ...
            'Optimizer', 'bayesopt', ...
            'Verbose', 0, ...
            'MaxTime', 60*60, ...
            'Repartition', true, ...
            'UseParallel', true, ...
            'Kfold', 15));
end
% read results after it's finished
How does the 'UseParallel' option scale? Does it take effect at all if I let it run on a single worker? Is there any way to have multiple workers working on one fitrgp evaluation?
Best regards and thank you,
Robert
PS: I have up to ~500 cores available and I have up to 4 predictors.
  1 Comment
Robert on 25 Apr 2022
Which submit arguments for a batch job would make sense and get the most out of our computing resources?
--ntasks=81 --cpus-per-task=5?


Answers (1)

Ayush Anand on 20 Oct 2023
Hi Robert,
I understand you are trying to run several Gaussian Process models in parallel and want to know more about the “UseParallel” argument, and whether it is possible to have several workers working on one fitrgp evaluation.
You can use parallel computing to speed up the process of fitting a Gaussian Process (GP) to multiple datasets. The UseParallel option in MATLAB's fitrgp function parallelizes the cross-validation process when estimating the hyperparameters of the GP model; however, it doesn't parallelize the fitting of a single GP model.
Here's how it works:
  • When you set UseParallel to true, MATLAB uses parallel computing to perform multiple cross-validation folds simultaneously. Each worker is responsible for one or more folds.
  • The UseParallel option doesn't have an effect when you're running the function on a single worker. It's specifically designed to take advantage of multiple workers.
  • You can't spread the fitting of a single GP model across multiple workers. Within one fitrgp call, the UseParallel option only parallelizes the cross-validation process, not the model fit itself.
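As a rough illustration of the points above, one way to use the pool is to give each dataset its own worker and leave UseParallel off inside each call, since it has no effect on a single worker. The sketch below is only an assumption about how this could look: the data is synthetic, the dataset count, sizes, and option values (nSets, 'auto', 10 evaluations, 5 folds) are placeholders kept small so it runs quickly, and a Parallel Computing Toolbox pool is assumed to be available.

```matlab
% Sketch: one fitrgp call per worker. All data here is synthetic.
rng(0);                                  % reproducible synthetic data
nSets = 4;                               % placeholder; the question has 80 datasets
X = cell(nSets, 1);
Y = cell(nSets, 1);
for i = 1:nSets
    X{i} = rand(200, 4);                 % 4 predictors, as in the question
    Y{i} = sin(2*pi*X{i}(:, 1)) + 0.1*randn(200, 1);
end

% Optimization options (placeholder values). UseParallel stays false
% because each fit already runs on a single worker of the pool.
opts = struct( ...
    'Optimizer', 'bayesopt', ...
    'MaxObjectiveEvaluations', 10, ...
    'Kfold', 5, ...
    'ShowPlots', false, ...
    'Verbose', 0, ...
    'UseParallel', false);

p = gcp();                               % current pool, started if needed
futures(1:nSets) = parallel.FevalFuture; % preallocate the future array
for i = 1:nSets
    futures(i) = parfeval(p, @fitrgp, 1, X{i}, Y{i}, ...
        'OptimizeHyperparameters', 'auto', ...
        'HyperparameterOptimizationOptions', opts);
end

% Collect the models in completion order; idx maps back to the dataset.
models = cell(nSets, 1);
for k = 1:nSets
    [idx, mdl] = fetchNext(futures);
    models{idx} = mdl;
end
```

With this layout the number of fits running at once is bounded by the pool size, so for 80 datasets a pool of up to 80 workers can be kept busy.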
For more information on the “fitrgp” function and the “UseParallel” argument, you can refer to the fitrgp documentation page.
I hope this helps!
