Using parfor eliminates use of multithreading

I am using fmincon to fit a simulation to some experimental data. I am running my program on a Linux machine that has 20 cores. Within my objective function I make 8 calls to a function that runs my Forward-Time Centered-Space (FTCS) finite difference simulation. This gives me a set of 8 simulation data points for the current set of design variables within fmincon. My cost function is then just the mean squared error between the experimental and simulation data points.
The function that runs my FTCS simulation is somewhat computationally expensive, so to speed things up I am using parfor in a pool with 8 workers to run all 8 simulations at once. However, using the "htop" command in Linux, I can see that parfor limits each simulation to a single core, with each core running at 100%.
Alternatively, if I just run the simulation outside of fmincon, running six different MATLAB jobs by entering the following into the Linux command line:
(
module load matlab; ulimit -u 8192; nohup matlab -nodisplay -nosplash -r "datestr(now), P = 20, V = 50, run('/folder/PolymerCode/FunctionSim_run_PIdiff.m');" > slot1.txt &
nohup matlab -nodisplay -nosplash -r "datestr(now), P = 20, V = 80, run('/folder/PolymerCode/FunctionSim_run_PIdiff.m');" > slot2.txt &
nohup matlab -nodisplay -nosplash -r "datestr(now), P = 20, V = 100, run('/folder/PolymerCode/FunctionSim_run_PIdiff.m');" > slot3.txt &
nohup matlab -nodisplay -nosplash -r "datestr(now), P = 20, V = 150, run('/folder/PolymerCode/FunctionSim_run_PIdiff.m');" > slot4.txt &
nohup matlab -nodisplay -nosplash -r "datestr(now), P = 20, V = 200, run('/folder/PolymerCode/FunctionSim_run_PIdiff.m');" > slot5.txt &
nohup matlab -nodisplay -nosplash -r "datestr(now), P = 20, V = 250, run('/folder/PolymerCode/FunctionSim_run_PIdiff.m');" > slot6.txt &
)
where FunctionSim_run_PIdiff.m is the same simulation I am calling 8 times within fmincon, I can see that all 20 cores of the machine are being used somewhat uniformly at around 80%.
Is there a way I can get these 8 calls to my simulation function within fmincon to use more multithreading while still using parfor? I have tried manually setting the max number of computational threads using
maxNumCompThreads(20)
but this had no effect.
  6 Comments
Matt J on 11 Oct 2021
And with batch() and parfeval()? Same thing?
Jason Johnson on 11 Oct 2021
parfeval() results in the same thing. My use and knowledge of batch() is limited, but if I am using it correctly it also results in the same thing. In all of these cases, it seems that each run of the simulation is allocated to one worker and each worker uses a single thread.

Accepted Answer

Raymond Norris on 11 Oct 2021
Check the cluster object's NumThreads property. For instance:
local = parcluster('local');
local.NumThreads = <set-number-of-threads-for-each-worker-to-use>;
pool = local.parpool( <set-pool-size> );
Setting NumThreads will automatically set maxNumCompThreads on the workers.
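To make the template above concrete, here is a minimal sketch with the placeholders filled in. The values of 8 workers and 2 threads per worker are assumptions chosen for the 20-core machine described in the question, not recommended settings, and the check at the end is only one possible way to confirm that the workers picked up the new thread count. The profile name 'local' is the one used in this thread; newer releases may list the equivalent profile under a different name.

```matlab
% Sketch only: worker and thread counts are assumptions for a 20-core machine.
numWorkers       = 8;   % one worker per simulation call
threadsPerWorker = 2;   % 8 workers x 2 threads = 16 threads in total

delete(gcp('nocreate'));            % close any pool that is already open

c = parcluster('local');            % profile name as used in this thread
c.NumThreads = threadsPerWorker;    % set before the pool is started
pool = parpool(c, numWorkers);

% Ask every worker how many computational threads it is allowed to use.
f = parfevalOnAll(pool, @maxNumCompThreads, 1);
threadsSeen = fetchOutputs(f);      % one value per worker
disp(threadsSeen.');

delete(pool);
```

The best split between workers and threads per worker depends on how well the simulation itself scales with threads, so it is worth timing a few combinations.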
  2 Comments
Jason Johnson on 11 Oct 2021
This did the trick! So simple... Sometimes the documentation for the Parallel Computing Toolbox can be so convoluted. Thank you for your help.
Raymond Norris on 12 Oct 2021
Thanks for the feedback, Jason. I've passed your comment on to our Documentation team.

More Answers (1)

Matt J on 11 Oct 2021
Edited: Matt J on 11 Oct 2021
If you are getting the best performance from the Linux command line, perhaps a solution would be to invoke the Linux command line from within MATLAB. You could do that with the system() command in combination with parfeval.
