Unexpected high memory usage in MATLAB.

Hello,
I am a PhD student, and I am running a MATLAB script on a remote computer with 400 GB of RAM. The script contains a for loop of 200 iterations, and heavy calculations are made in each iteration.
Before starting each iteration I clear the largest variables used in the previous iteration, and the result is stored in a cell of a cell array that is preallocated outside the loop.
At the end of the script, the cell array occupies about 5 GB of RAM.
However, when I run the command "top" on the Ubuntu command line, the total memory usage of the MATLAB process is 280 GB.
In fact, if I try to use the same computer for other work, it easily runs out of memory.
How is such a huge amount of memory usage possible?
Any kind of help would be really appreciated, thanks!!
*EDIT*
I tried to reduce the problem and check the memory usage, which still grows in an unexpected way.
In Ubuntu: I used the command "top" to check the RAM and CPU usage, and I found a huge memory footprint: VIRT = 260 GB, RES = 230 GB, CPU = 2001%.
In MATLAB I used the command "whos" to check the memory usage, and the total allocated memory was about 5 GB. I used the commands "clear all", "close all", and "clc" to clear the workspace. When I checked again in Ubuntu, the memory usage had only been reduced by those 5 GB, which means VIRT > 250 GB, RES > 220 GB.
How can this be possible? Given that the calculations are made mainly in a function called from the main script, is it possible that the local memory allocated by the called function keeps increasing the memory usage even after the function has returned?
I generate a DATASET that is a cell array. Each cell is a struct where I save the results of some "heavy" calculations made by a function called inside the for loop. Below is a summary of the code.
---------------------------------------------------------------------------------------------------------------------------------
MATLAB code summary:
DATASET = cell(Nsamp, 1);
for isamp = 1:Nsamp
    DATASET{isamp}.a1 = ...
    DATASET{isamp}.a2 = ...
    DATASET{isamp}.a3 = ...
    .
    .
    DATASET{isamp}.aN = ...
    %%%% heavy calculations made in a called function
    %%%% ("function" is a reserved word in MATLAB; a placeholder name is used here)
    Result_tmp = heavy_calculation(DATASET{isamp});
    DATASET{isamp}.Result = Result_tmp;
end
save('DATASET');
---------------------------------------------------------------------------------------------------------------------------------
  1 Comment
Bruno Luong on 20 Jan 2023
Maybe the virtual memory is needed while your function performs the "very complicated" calculation. And I'm not sure that MATLAB, or any software, can reclaim the virtual memory with CLEAR. Maybe the OS controls it and frees it only when it is actually needed?


Accepted Answer

Jan on 20 Jan 2023 (edited)
Avoid clear all, because it has no benefits: it removes all loaded functions from RAM, and they must be reloaded from the slow disk the next time they are called. This wastes a lot of time.
The brute-force clearing header clear all; close all; clc; might be standard in beginners' quick-and-dirty scripts, but it should not occur in production code.
Clearing variables is rarely useful in Matlab. Prefer to perform the calculations in functions, not in scripts. This lets Matlab clean up unused memory automatically, as sketched below.
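For example, a minimal sketch of this pattern (the function name and the operations inside it are invented for illustration):
function result = processSample(sample)
    % All temporaries created here are local and are released
    % automatically when the function returns -- no clear needed.
    tmp = sample * sample.';   % hypothetical large intermediate
    result = sum(tmp(:));      % only the small result survives
end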
What exactly does top show? If Matlab frees temporarily allocated memory, it can be cheaper for the system not to reclaim it immediately. The memory only becomes available again after the OS has overwritten it with zeros; this takes time, so it is done on demand only. Maybe the large amount of memory is usable by other applications, but as long as it is not requested, the OS is lazy and leaves it marked as "used by Matlab".
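If you want to see what the OS attributes to the MATLAB process from within MATLAB, here is a sketch for Linux (note that feature('getpid') is an undocumented call, so treat this as an assumption):
pid = feature('getpid');                                % undocumented: id of the MATLAB process
[status, rss] = system(sprintf('ps -o rss= -p %d', pid));
if status == 0
    fprintf('Resident set size: %.1f GB\n', str2double(rss) / 2^20);  % ps reports kB
end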
Take care to pre-allocate properly. The posted code seems fine, but maybe the problem is hidden in the parts that are not shown. A standard example:
x = [];
for k = 1:1e6
    x(k) = fcn(k);
end
This creates a new array in each iteration and copies the former contents. In total this requests sum(1:1e6)*8 bytes from the OS, which is more than 4 TB, although the final array needs only 8 MB. The solution is easy (sorry, I assume you are familiar with this, but maybe other readers can profit):
x = zeros(1, 1e6);
for k = 1:1e6
    x(k) = fcn(k);
end
Now Matlab reserves 8 MB and does not use more memory for x.
A missing pre-allocation can cause a huge amount of occupied but unused RAM, because it depends on the memory management of Matlab and the OS when that memory becomes available again.
  1 Comment
Andrea Mazzolani on 20 Jan 2023 (edited)
Thank you for your answer!
I preallocate memory for each array. There is only one array that I grow without preallocation, but it is very small and there are only 28 iterations in total.
Furthermore, shouldn't the allocated memory be released after I use "clear all"?
Regarding the "top" command and the time MATLAB takes to release the memory: I cleared all variables last night and checked the memory usage with the "top" command this morning, and it was still more than 250 GB.
I attach a screenshot of what top shows at the 10th iteration, where VIRT = 91.9 GB, RES = 60.7 GB.


More Answers (1)

dpb on 18 Jan 2023 (moved)
There is absolutely no way anybody here can diagnose this without seeing the code -- it's quite possible your code contains an inadvertent array reference with a very big index that accidentally allocates such an array. Note that
>> clear a
>> ix=resultOfSomeOperationResultingIn10;
>> a(ix)=pi
a =
0 0 0 0 0 0 0 0 0 3.1416
>>
created the full array... If the resulting ix were in reality a very big number, then a could be taking up a large chunk of memory.
Also, remember that builtin functions like zeros(N) build 2D arrays by default; omitting an intended "1" in the row or column position will build a 2D array instead of the intended vector. Similarly, if N is large, that is a big array.
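For instance (sizes chosen only to make the difference obvious):
z1 = zeros(1e4);      % 1e4-by-1e4 matrix: 1e8 doubles = 800 MB
z2 = zeros(1e4, 1);   % 1e4-by-1 column vector: only 80 kB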
Your algorithm may be doing anything; we have no way even to guess how it might be creating additional demands on memory. If it is also plotting results during some of these iterations of "heavy" calculations, you may inadvertently be creating huge arrays of graphics object handles.
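A sketched example of how plotting inside a loop can silently accumulate memory (the loop body is invented):
for k = 1:200
    figure;               % opens a new figure every iteration, never closed
    plot(rand(1, 1e6));   % each line object keeps its 1e6 data points alive
end
% close all (or reusing a single figure and axes) would release these objects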
"Before to start each iteration I clear the largest variables employed in the previous iteration..."
In general, clear is not needed and will not return memory to the OS during execution anyway. If your application redefines these arrays in their entirety each iteration, it is far more efficient to simply reuse them than to recreate them, as in the sketch below.
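A minimal sketch of reusing a buffer instead of clearing and recreating it (the size and the fill operation are invented):
buf = zeros(5000);    % allocate the large working array once
for k = 1:200
    buf(:) = k;       % overwrite in place -- no clear, no new allocation
    % ... use buf for this iteration's calculations ...
end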
But, there's nothing anybody here can do but guess; would have to see a sample problem that can reproduce the problem to have any hope of debugging other than such generalities as above.
  2 Comments
Andrea Mazzolani on 18 Jan 2023
Hi, thank you for the answer. I will check the code more carefully.
I will double-check whether I made a mistake in the array declarations, thanks.
I didn't share the code because it is about 500 lines, and there are nested functions with the same amount of code. Do you mean it might be helpful if I share a folder with all the scripts needed to run it?
dpb on 18 Jan 2023
Well, completely debugging a large code base is beyond the scope of Answers, but you have to realize that we cannot debug what we cannot see, either.
The usual debugging and code-development advice holds: start with very small problems whose inputs and outputs you can see on one code screen or less -- 10x10 is a convenient size and should be sufficient to verify the algorithms. Then you can see where things start to get out of hand, without being so overwhelmed by the size that looking at the details is simply beyond comprehension.
Start at the lowest level of your user-written functions and do unit testing on them to ensure that, individually, they aren't the culprit.
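For instance, a sketch for localizing the growth while the loop runs (assuming the loop from the question, shrunk to a few iterations):
for isamp = 1:10
    % ... one iteration of the real work ...
    w = whos;             % snapshot of all variables in this workspace
    fprintf('iteration %d: workspace holds %.2f GB\n', isamp, sum([w.bytes]) / 2^30);
end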

