No parallel computing when using parfor
Hello,
I have some code that uses parfor for parallel computing. The code runs without errors and produces the right output. However, running it in parallel is important because the calculation takes a long time (currently 2 minutes per iteration) and needs to be run thousands of times.
In terms of context:
- temperature_v9 is the function that does the actual analysis. It takes temperature measurements and targets as inputs and produces some predictions. The inputs have tens of millions of rows.
- temperature_wrapper_v9 gets the same inputs, but split into tranches (the split is done by another function), and then runs temperature_v9 on each tranche. The idea is to run the tranches in parallel to save time: for example, split the data into 10 tranches, run 10 instances of temperature_v9 in parallel, and concatenate the results of the 10 instances at the end.
- Both approaches give the same results
- In practice the second approach does not parallelize, and its processing time is slightly higher than with the first approach
- In both approaches only one core is at 100% and the other 11 cores are at very low load
- In both approaches there is plenty of RAM memory available
- I have used the profiler and the time is spread across many different tasks; none of them takes more than 5%-10%. The two biggest items are two calls to the std function, which I cannot avoid. So I want to focus on solving the problem through parallel computing if possible.
Can anyone shed some light on how to parallelize this calculation? (Code pasted below.) I must be missing something here.
Thanks in advance,
Joe
function [statisticTemp, totalTemp, tempFunction, AccTempFunction, PF, increasePercent, avgInstantTemp, instantsNumber, decreaseFunction, increaseFunction, PercentTempFunction] = temperature_wrapper_v9(lowTempSizeMatrixTranches, lowTempMatrixTranches, highTempSizeMatrixTranches, highTempMatrixTranches, trueMidPointsTranches, trueLowsTranches, trueHighsTranches, trueSpreadsTranches, window, refreshRate, expectedIncrease, depthOfMeasure, numDevMaxEntry, numDevMinEntry, numDevExit, changeMin, changeMax, numDevMinSpread, maxSpread, alpha, SLT, print, graph)
sizeData=size(lowTempMatrixTranches);
tranches=sizeData(1,1);
statisticTempTranches= zeros(tranches, 1);
totalTempTranches= zeros(tranches, 1);
tempFunctionTranches= zeros(tranches, sizeData(1,2));
AccTempFunctionTranches= zeros(tranches, sizeData(1,2));
PFTranches= zeros(tranches, 1);
increasePercentTranches= zeros(tranches, 1);
avgInstantTempTranches= zeros(tranches, 1);
instantsNumberTranches= zeros(tranches, 1);
decreaseFunctionTranches= zeros(tranches, sizeData(1,2));
increaseFunctionTranches= zeros(tranches, sizeData(1,2));
PercentTempFunctionTranches= zeros(tranches, sizeData(1,2));
statisticTempAux= 0;
totalTempAux= 0;
tempFunctionAux= zeros(1, sizeData(1,2));
AccTempFunctionAux= zeros(1, sizeData(1,2));
PFAux= 0;
increasePercentAux= 0;
avgInstantTempAux= 0;
instantsNumberAux= 0;
decreaseFunctionAux= zeros(1, sizeData(1,2));
increaseFunctionAux= zeros(1, sizeData(1,2));
PercentTempFunctionAux= zeros(1, sizeData(1,2));
parfor i=1:tranches
[statisticTempAux, totalTempAux, tempFunctionAux, AccTempFunctionAux, PFAux, increasePercentAux, avgInstantTempAux, instantsNumberAux, decreaseFunctionAux, increaseFunctionAux, PercentTempFunctionAux] = temperature_v9(squeeze(lowTempSizeMatrixTranches(i,:,:)), squeeze(lowTempMatrixTranches(i,:,:)), squeeze(highTempSizeMatrixTranches(i,:,:)), squeeze(highTempMatrixTranches(i,:,:)), squeeze(trueMidPointsTranches(:,i)), squeeze(trueLowsTranches(:,i)), squeeze(trueHighsTranches(:,i)), squeeze(trueSpreadsTranches(:,i)), window, refreshRate, expectedIncrease, depthOfMeasure, numDevMaxEntry, numDevMinEntry, numDevExit, changeMin, changeMax, numDevMinSpread, maxSpread, alpha, SLT, 0, 0);
statisticTempTranches (i,:)=statisticTempAux;
totalTempTranches (i,:)=totalTempAux;
tempFunctionTranches (i,:)=tempFunctionAux;
AccTempFunctionTranches (i,:)=AccTempFunctionAux;
PFTranches (i,:)=PFAux;
increasePercentTranches (i,:)=increasePercentAux;
avgInstantTempTranches (i,:)=avgInstantTempAux;
instantsNumberTranches (i,:)=instantsNumberAux;
decreaseFunctionTranches (i,:)=decreaseFunctionAux;
increaseFunctionTranches (i,:)=increaseFunctionAux;
PercentTempFunctionTranches (i,:)=PercentTempFunctionAux;
end
%Postprocessing: I reassemble all the outputs of the different tranches in a single one
2 Comments
Walter Roberson
on 10 Nov 2013
Could you rearrange the order of the dimensions of lowTempSizeMatrixTranches, perhaps at an outer level, and likewise for your other variables? For example:
permute(lowTempSizeMatrixTranches, [2 3 1])
That would set things up so that you index by the third dimension, which makes each slice contiguous in memory and removes the need for squeeze().
When practical, index by the last dimension instead of the first.
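As a minimal sketch of the suggestion above, the snippet below uses a small random array as a hypothetical stand-in for lowTempSizeMatrixTranches (the sizes and the names A, B, sliceOld and sliceNew are assumptions for illustration only). It checks that slicing the permuted array by its last dimension gives the same matrix as squeeze() on the original layout. A plain for loop is used so that it runs without a parallel pool; the same B(:,:,i) indexing should also work as a sliced variable inside parfor.

```matlab
% Hypothetical small array standing in for lowTempSizeMatrixTranches
tranches = 4; rows = 5; cols = 3;
A = rand(tranches, rows, cols);      % original layout: tranche index first

% Done once, outside the loop: move the tranche index to the last dimension
B = permute(A, [2 3 1]);             % size is now rows x cols x tranches

for i = 1:tranches
    sliceOld = squeeze(A(i,:,:));    % strided slice, needs squeeze
    sliceNew = B(:,:,i);             % contiguous slice, no squeeze needed
    assert(isequal(sliceOld, sliceNew));
end
```

The permute call is done once before the loop, so its cost is paid a single time rather than on every iteration.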
Answers (1)