# Parallel calculations in MATLAB

emar on 31 Jul 2017
Commented: emar on 1 Aug 2017
Hello,
As discussed in other posts, I understand that some MATLAB functions are already multithreaded internally, so they are fast even in serial code (for example normxcorr2, which uses FFT functions and conv2).
I used this code:
parfor (i = 1:1000, 20) % use at most 20 workers
    y1 = function_1(image1, template,....);
    . % other lines
    .
    .
end
Inside function_1 I call FFT-based functions (conv2, fft2, etc.), and I allocate the remaining cores to that calculation:
function y = function_1(image1, template,...)
    parfor (i = 1:41, 24) % I allocated 24 workers to the child function
        t = template{i};
        correlation{i} = normxcorr2(t, image1); % normxcorr2 takes the template first, then the image
        %etc
    end
    %etc
end
Why is it that when I open the Task Manager, the "Performance" tab shows only 40% of memory in use?
I have 44 cores, so I thought that allocating 20 to the outer calculation and 24 to the functions inside it would bring memory usage to 99%.
I don't know much about how cores are allocated in a parfor, but with nested loops it seems you are telling it to use 24 cores for each of the inner loops. Since up to 20 of those inner loops could be running concurrently, that would be equivalent to expecting it to use 480 cores.

Alan Weiss on 31 Jul 2017
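As documented, nested parfor loops do not run in parallel: only the outermost parfor is distributed across workers, and a parfor inside a function called from it runs as an ordinary serial loop on each worker. There is a graphical depiction of which loop runs, in the context of optimization solvers, here.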
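A minimal sketch of the usual restructuring, assuming toy sizes and a stand-in computation (`nOuter`, `nTemplates` and the `i * k` arithmetic are hypothetical, not taken from the question): keep one parfor at the outer level and make the inner loop a plain for, since it would run serially on the worker anyway.

```matlab
% Hypothetical sizes, for illustration only.
nOuter = 8;        % stands in for the 1000 outer iterations
nTemplates = 4;    % stands in for the 41 templates

results = cell(nOuter, 1);
parfor i = 1:nOuter
    inner = zeros(1, nTemplates);
    for k = 1:nTemplates       % plain for: this part runs on one worker
        inner(k) = i * k;      % stand-in for the real per-template work
    end
    results{i} = inner;        % sliced output, one cell per outer iteration
end
```

Without Parallel Computing Toolbox the parfor simply runs as a serial loop, so the sketch should still execute.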
Alan Weiss
MATLAB mathematical toolbox documentation
emar on 1 Aug 2017

Cong Ba on 31 Jul 2017
Edited: Cong Ba on 31 Jul 2017
Memory usage is irrelevant here; what you want to look at is CPU usage. This question also has more to do with computer architecture than with MATLAB itself: the operating system's scheduler decides how work is assigned to the cores.
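To see what MATLAB itself has been given before looking at the Task Manager's CPU graph, something like the following may help. It is only a sketch: `gcp` needs Parallel Computing Toolbox, and as far as I know pool workers run with a single computational thread each by default, so built-in multithreading inside a worker should not be counted on.

```matlab
% Threads the MATLAB client may use for implicitly multithreaded functions.
nThreads = maxNumCompThreads;
fprintf('Client computational threads: %d\n', nThreads);

% Query the current parallel pool without starting a new one.
pool = gcp('nocreate');
if isempty(pool)
    disp('No parallel pool is running.');
else
    fprintf('Pool workers: %d\n', pool.NumWorkers);
end
```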
As for your specific question, what happens if you assign 20 cores to the outer parfor and 24 to the inner parfor? CPU usage is still managed by your OS, and it seems very likely that the CPU won't run at 100% or 99%.
One possibility: suppose one of the 20 outer workers reaches the inner parfor first and starts using the 24 inner cores. The other 19 outer workers, even once they have finished their own work, cannot hand work to one another and must wait for the first one to finish, sitting idle in the meantime.
Not sure if this makes sense to you.
Also, if possible, try using all the cores in the outer parfor, removing the inner parfor, and comparing the efficiency. I'm very interested to see the results.
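A rough way to run that comparison, as a sketch only: the data below is synthetic, the sizes are hypothetical, and conv2 stands in for normxcorr2 so that no Image Processing Toolbox is needed. It times a plain for loop against a single-level parfor doing the same work.

```matlab
% Synthetic stand-ins for the real image and templates (hypothetical sizes).
image1 = rand(200);
templates = arrayfun(@(k) rand(15), 1:8, 'UniformOutput', false);
nTemplates = numel(templates);
nOuter = 16;

% Serial baseline.
peaksSerial = zeros(nOuter, nTemplates);
tic;
for i = 1:nOuter
    for k = 1:nTemplates
        c = conv2(image1, templates{k}, 'same');   % stand-in for normxcorr2
        peaksSerial(i, k) = max(c(:));
    end
end
tSerial = toc;

% Single-level parfor: all workers go to the outer loop, inner loop is serial.
peaks = zeros(nOuter, nTemplates);
tic;
parfor i = 1:nOuter
    row = zeros(1, nTemplates);
    for k = 1:nTemplates
        c = conv2(image1, templates{k}, 'same');
        row(k) = max(c(:));
    end
    peaks(i, :) = row;
end
tParallel = toc;

fprintf('Serial: %.2f s, parfor: %.2f s\n', tSerial, tParallel);
```

The first parfor call may include pool start-up time, so it is worth starting the pool beforehand or timing a second run.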
##### 3 Comments
Memory usage can be very hard to keep on top of with that many cores. As Cong Ba says, it is CPU usage that shows the utilisation, but data gets copied to every worker, and depending on how much data you have, that duplication can easily push you past the available memory.
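A back-of-the-envelope check of that duplication, as a sketch: the image size is hypothetical, the worker count of 20 is taken from the question, and the estimate assumes one full copy of the variable per worker, which ignores any other overhead.

```matlab
% Hypothetical broadcast variable: a 2000x2000 double image.
image1 = rand(2000);
nWorkers = 20;                       % assumed pool size, from the question

info = whos('image1');               % size of one copy in bytes
perCopyMB = info.bytes / 2^20;
totalMB = perCopyMB * nWorkers;      % rough total if every worker holds a copy

fprintf('One copy: %.1f MB, %d worker copies: about %.1f MB\n', ...
    perCopyMB, nWorkers, totalMB);
```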
emar on 1 Aug 2017