How are iterations assigned to workers in parfor?
I am currently using parfor to process multiple raw data files. Inside the loop, it first checks whether the raw file has already been processed, and only processes it if there is no existing output, like this:
RawDatalist = dir(fullfile(RawDataFolder, '*.txt'));
NumRawData = length(RawDatalist);
parfor i = 1:NumRawData
    if false % placeholder: true when the output for RawDatalist(i).name already exists
        Execute = false;
    else
        Execute = true;
    end
    if Execute
        % Process RawDatalist(i).name
    end
end
Obviously some iterations will take less time than others because there is no calculation involved. I am just wondering whether iterations are 1) divided among workers at the start of parfor, or 2) handed out one by one as each worker becomes available. If it's the first case, then some workers will just sit idle while others are busy, and I need to move the existence check out of the parfor loop.
2 Comments
Mohammad Sami
on 8 Jan 2020
Edited: Mohammad Sami
on 8 Jan 2020
Please see here. Your code block in parfor should be independent of other iterations. This will ensure correct parallel execution. https://www.mathworks.com/help/releases/R2019b/parallel-computing/decide-when-to-use-parfor.html
Each execution of the body of a parfor-loop is an iteration. MATLAB workers evaluate iterations in no particular order and independently of each other. Because each iteration is independent, there is no guarantee that the iterations are synchronized in any way, nor is there any need for this. If the number of workers is equal to the number of loop iterations, each worker performs one iteration of the loop. If there are more iterations than workers, some workers perform more than one loop iteration; in this case, a worker might receive multiple iterations at once to reduce communication time.
Accepted Answer
Edric Ellis
on 8 Jan 2020
As @Mohammad already commented, the parfor implementation automatically divides up the iterations of the loop onto the workers. Since R2019a, you can have some control over this division using parforOptions. The default division works well in most situations, even when the loop iterations do not take equal amounts of time. However, if there is a large imbalance, the division might not work well, and it may indeed be worth pre-computing which iterations need real work to be done.
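Both ideas in this answer can be sketched together: work out beforehand which files still need processing, then run parfor over only those, optionally handing out one iteration at a time via parforOptions. This is only a rough sketch under assumptions that are not from the thread: the demo folders, the `.out` naming of output files, and the `copyfile` call standing in for the real processing are all hypothetical. The `parforOptions` call needs R2019a or later; on older releases, drop `opts` and use a plain `parfor k = 1:numel(todo)`.

```matlab
% Demo setup (hypothetical paths and file naming; adapt to the real layout).
demoRoot      = tempname;
RawDataFolder = fullfile(demoRoot, 'raw');
OutputFolder  = fullfile(demoRoot, 'out');
mkdir(RawDataFolder);
mkdir(OutputFolder);
for n = 1:4
    fid = fopen(fullfile(RawDataFolder, sprintf('run%d.txt', n)), 'w');
    fprintf(fid, '%d\n', n);
    fclose(fid);
end
% Pretend run1 was already processed in an earlier session.
fclose(fopen(fullfile(OutputFolder, 'run1.out'), 'w'));

% Existence check done serially, outside parfor: it is cheap, and it leaves
% only iterations with real work for the parallel loop.
RawDatalist = dir(fullfile(RawDataFolder, '*.txt'));
needsWork = false(numel(RawDatalist), 1);
for i = 1:numel(RawDatalist)
    [~, baseName] = fileparts(RawDatalist(i).name);
    needsWork(i) = exist(fullfile(OutputFolder, [baseName '.out']), 'file') ~= 2;
end
todo = RawDatalist(needsWork);

% R2019a+: fixed subranges of size 1, so each worker is given one iteration
% at a time instead of a larger block.
opts = parforOptions(gcp, 'RangePartitionMethod', 'fixed', 'SubrangeSize', 1);
parfor (k = 1:numel(todo), opts)
    [~, baseName] = fileparts(todo(k).name);
    % Stand-in for the real processing step (assumption): copy raw to output.
    copyfile(fullfile(RawDataFolder, todo(k).name), ...
             fullfile(OutputFolder, [baseName '.out']));
end
```

A subrange size of 1 trades extra communication for better balance, so it is probably only worth it when individual iterations are long compared with the scheduling overhead.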
4 Comments
Isidro Losada López
on 31 Jul 2020
Hi,
One question related to that: I'm trying to run a parfor loop with more iterations than there are workers available in my cluster. The thing is, some of the iterations are much more expensive than others, so it would be better to compute the heavy iterations before the lighter ones. As you said above, there is a way to control this since MATLAB R2019a, parforOptions. The problem is that I'm using MATLAB R2014b, hahaha. So, is the order of the iterations totally random, or does MATLAB follow some pattern to decide the order?
For example, I have a set of different molecules and their information in a cell array. Every element is a structure with atomic positions, atomic numbers, etc. Each iteration computes the energy and its derivatives for one molecule and stores them back in the same cell array. But because some molecules are bigger than others, it would be better to compute the biggest molecules first. Here is some dummy code to illustrate what I mean:
nw = 20; % number of cores
nm = 40; % number of molecules
myCluster = parcluster('local');
myParpool = parpool(myCluster,nw); % start parallel pool
mol = cell(1,nm);
parfor im = 1:nm
    mol{im} = energyFunction(mol{im});
end
What could I do to force MATLAB to follow a specific order when calculating iterations? Would changing the order of the molecules in the cell array "mol" change anything? Or is the order totally random in this version?
Thanks in advance!
Edric Ellis
on 31 Jul 2020
If you know which computations are likely to take longest, you can force the ordering by using parfeval and initiating those computations first.
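A minimal sketch of that suggestion, assuming the atom count is a reasonable proxy for cost. The demo molecules, the field name `atomicNumbers`, and the stand-in `energyFunction` below are hypothetical and only there to make the example runnable; the real `energyFunction` would take their place. Requests are queued in the order they are submitted, so the largest molecules are picked up first as workers become free.

```matlab
% Demo data (hypothetical): each molecule is a struct with an atomicNumbers field.
mol = {struct('atomicNumbers', [1 1 8]), ...
       struct('atomicNumbers', [6 1 1 1 1]), ...
       struct('atomicNumbers', 2)};
nm = numel(mol);

% Stand-in for the real energy routine (assumption): adds an 'energy' field.
energyFunction = @(m) setfield(m, 'energy', -sum(m.atomicNumbers));

% Estimate cost per molecule and sort so the most expensive come first.
cost = cellfun(@(m) numel(m.atomicNumbers), mol);
[~, order] = sort(cost, 'descend');

pool = gcp;  % current pool (starts one if none is open)

% Submit the requests heaviest first.
for k = 1:nm
    futures(k) = parfeval(pool, energyFunction, 1, mol{order(k)});
end

% Collect results as they finish; idx is the position in 'futures',
% so order(idx) maps back to the original molecule index.
for k = 1:nm
    [idx, result] = fetchNext(futures);
    mol{order(idx)} = result;
end
```

Unlike parfor, this sends one request per molecule, so it probably suits cases like this one where each evaluation is expensive compared with the per-request overhead.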