Confusion about performance improvement by memory preallocation

I have the complex task of extracting relevant measurement data from a stack of MAT-files (each has 237 variables). There are 30 MAT-files; files 9 to 18 are relevant (I cannot delete the others, let's just accept this here). From each relevant file, only a few data points shall be extracted (for each variable); their indices are given by a separate file containing the measurement times.
The meas_times.mat looks like this: each column corresponds to a file (9:18) and the rows contain start and stop times in alternating order.
0 0 0 0 0 0 0 0 315 0
0 0 0 0 0 0 0 0 352 0
0 0 0 0 56 0 26 0 373 0
0 0 0 0 115 0 45 0 394 0
0 0 0 0 141 121 65 515 476 1
0 0 0 0 200 180 104 574 511 8
0 0 0 0 471 201 132 609 837 44
0 0 0 0 530 260 191 611 860 95
0 0 0 0 610 721 443 664 881 109
0 0 0 0 669 780 502 720 910 154
0 461 0 0 671 796 521 737 961 174
0 520 0 0 730 821 580 796 990 187
0 591 0 171 1216 1072 711 1126 1071 580
0 650 0 230 1275 1105 770 1185 1130 610
0 981 21 276 1311 1197 773 1241 1171 667
0 1040 80 335 1336 1256 832 1300 1200 695
508 1061 121 721 1629 1376 1390 1324 1311 783
567 1120 142 780 1662 1435 1449 1383 1340 842
601 1721 151 816 1677 1451 1482 1411 1721 1151
660 1780 188 875 1736 1510 1541 1470 1780 1210
Example:
From file 9 (1st column), I only want the data at indices 508-567 and 601-660 of each variable; from file 10, indices 21-80, and so on.
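As a sketch (assuming B is the matrix loaded from meas_times.mat, as in the scripts below), the start/stop pairs for one file can be read off like this:

```matlab
% Extract the start/stop index pairs for file 9 (1st column of B).
v      = nonzeros(B(:,1));   % drop the zero padding -> [508; 567; 601; 660]
starts = v(1:2:end);         % odd entries are start indices -> [508; 601]
stops  = v(2:2:end);         % even entries are stop indices -> [567; 660]
```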
My first solution with consecutive for-loops was:
tic
%% initialize
clear
clc
load('meas_times.mat')
times = B;
%% define path
path = 'C:\Users\felix\Documents\HeatFlow\Messkampagne Itterbeck\Messdaten\';
list = dir(path);
%% create empty struct t to collect relevant data in
t = struct(load(sprintf('%s%s',path,list(9).name)));
fn = fieldnames(t);
for a = 1:numel(fn)
t.(fn{a})=[];
end
%% filter relevant data
for b = 9:18
% process files 9 to 18; b is their position in the folder listing
Dateiname = sprintf('%s%s',path,list(b).name);
s = struct(load(Dateiname)); % this is the b-th file as struct
fn = fieldnames(s);
if fn{1} == "Abtastrate_1_HzZeit_1_Hz_" % removes a duplicate variable that some files have
s = rmfield(s,'Abtastrate_1_HzZeit_1_Hz_');
fn = fn(2:end);
end
v = nonzeros(times(:,b-8)); % extract the measurement times that correspond to the b-th file
for z = 1:115 % 1-Hz variables from row 1 to row 115
temp = []; % empty dummy array
for i = 1:2:length(v)
temp = vertcat(temp,s.(fn{z})(v(i):v(i+1)));
end
t.(fn{z}) = vertcat(t.(fn{z}),temp); % write extracted values into the new struct to the z-th variable
end
for z = 116:154 % 2-Hz variables up to row 154
% preprocessing: mean of 2Hz
for i = 1:2:numel(s.(fn{z}))-1
s.(fn{z})(i:i+1) = mean(s.(fn{z})(i:i+1));
end
s.(fn{z})(2:2:end) = [];
temp = []; % empty dummy array
for i = 1:2:length(v)
temp = vertcat(temp,s.(fn{z})(v(i):v(i+1)));
end
t.(fn{z}) = vertcat(t.(fn{z}),temp); % write extracted values into the new struct to the z-th variable
end
end
toc
However, this ran very slowly, because the last variables in each file contain a few million values (damn 1 kHz loggers). MATLAB gave me the hint that the temp array was changing size with every loop iteration and that this would hurt performance. So I completely rewrote the whole script, introduced functions, and preallocated memory. However, now the whole operation takes three times as long. Here is the "improved" script. Have I made any obvious mistake?
tic
%% initialize
clear
clc
load('meas_times.mat')
times = B;
%% define path
path = 'C:\Users\felix\Documents\HeatFlow\Messkampagne Itterbeck\Messdaten\';
list = dir(path);
%% create empty struct t to collect relevant data in
t = struct(load(sprintf('%s%s',path,list(9).name)));
fn = fieldnames(t);
for a = 1:numel(fn)
t.(fn{a})=[];
end
%% filter relevant data
for b = 9:18
% process files 9 to 18; b is their position in the folder listing
filename = sprintf('%s%s',path,list(b).name);
s = struct(load(filename)); % this is the b-th file as struct
fn = fieldnames(s);
if fn{1} == "Abtastrate_1_HzZeit_1_Hz_" % removes a duplicate variable that some files have
s = rmfield(s,'Abtastrate_1_HzZeit_1_Hz_');
fn = fn(2:end);
end
v = nonzeros(times(:,b-8)); % extract the measurement times that correspond to the b-th file
w = [[0,0]';diff(v)+1]; % important for indexing later on
for z = 1:154 % 154 variables in the struct s
switch z
case num2cell(1:115)
hz = 1;
t.(fn{z}) = vertcat(t.(fn{z}),selectdata(hz,z,v,w,s,fn));
case num2cell(116:154)
hz = 2;
t.(fn{z}) = vertcat(t.(fn{z}),selectdata(hz,z,v,w,s,fn));
end
end
end
toc
function temp = selectdata(hz,z,v,w,s,fn)
% preprocessing
for i = 1:numel(s.(fn{z}))/hz
s.(fn{z})(i:i+(hz-1)) = mean(s.(fn{z})(i:i+(hz-1)));
s.(fn{z})(i+1:i+(hz-1)) = [];
end
% empty dummy array
temp = zeros(sum(w(1:2:end)),1);
% extract data into temp
for i = 1:2:length(v)
temp(sum(w(i:-2:1))+1:sum(w(i+2:-2:1)),1) = s.(fn{z})(v(i):v(i+1));
end
end
Feel free to ask questions. I do not want help on my project; I want to understand why the code got SLOWER when I followed MATLAB's instructions and changed the code so that it preallocates memory.
The 1st code took 9.7 seconds, the 2nd code approximately 31 seconds.
  9 Comments
Felix Schönig
Felix Schönig on 11 Dec 2020
Edited: Felix Schönig on 11 Dec 2020
Your input and Jan's are what I needed. I have understood your use of arrayfun and the anonymous function handle (I didn't know that before); it's beautiful! :-) I also reworked the outer loop to use the fields feature of structs. Is this what you had in mind?
path = 'somepath';
list = dir(path);
for b = 9:18 % bad hard-coded indexing, I know :>
filename = sprintf('%s%s',path,list(b).name);
tempS = load(filename);
fn = fieldnames(tempS);
if fn{1} == "Abtastrate_1_HzZeit_1_Hz_" % removes a duplicate variable that some files have
tempS = rmfield(tempS,'Abtastrate_1_HzZeit_1_Hz_');
fn = fn(2:end);
end
u(b-8) = tempS;
end
This gives me a 1x10 struct which I can now work on using indexing as you recommended. For your information:
This field concatenation alone still takes 4.97 seconds; it's really a big bunch of data. I will try to improve speed with the indexing method in a function and report later.
E: I would mark your help with "Accept this answer", but it is only a comment. Because of that I will accept Jan's answer; still, a big thank you to you! You really helped me.
Bruno Luong
Bruno Luong on 11 Dec 2020
Edited: Bruno Luong on 11 Dec 2020
Yes, that's what I had in mind. There might be something about the allocation of u you could improve, but it's not a big deal and it's a detail you can work on later.
Now, what you could do outside this loop to combine u into a single struct goes something like this (you need to look up MATLAB's comma-separated list syntax to understand the code):
% test data, replace with your u array of structure
u(1) = struct('a', [0;1], 'b', [2;3])
u(2) = struct('a', [4], 'b', [5;6;7])
f = fieldnames(u);
data = cellfun(@(f) vertcat(u.(f)), f, 'unif', 0);
sarg = [f,data].';
sall = struct(sarg{:})
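As a reading aid (my annotation, not part of the answer above): with the test data, the intermediate values expand like this:

```matlab
% f    = {'a'; 'b'}
% data = {[0;1;4]; [2;3;5;6;7]}   % each field concatenated across u(1), u(2)
% sarg = {'a', 'b'; [0;1;4], [2;3;5;6;7]}
% sarg{:} expands column-wise into the comma-separated list
%   'a', [0;1;4], 'b', [2;3;5;6;7]
% so the last line is equivalent to:
sall = struct('a', [0;1;4], 'b', [2;3;5;6;7]);
```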


Accepted Answer

Jan
Jan on 11 Dec 2020
Try to replace this part of the first solution:
for z = 1:115 % 1-Hz variables from row 1 to row 115
temp = []; % empty dummy array
for i = 1:2:length(v)
temp = vertcat(temp,s.(fn{z})(v(i):v(i+1)));
end
t.(fn{z}) = vertcat(t.(fn{z}),temp); % write extracted values into the new struct to the z-th variable
end
by:
for z = 1:115 % 1-Hz variables from row 1 to row 115
lenv = numel(v);
tmpC = cell(lenv / 2, 1); % pre-allocated cell array (cell(n) alone would create an n-by-n cell)
sfnz = s.(fn{z}); % Cheap shared data copy instead of repeated indexing
for k = 1:lenv / 2
idx = 2 * k - 1;
tmpC{k} = sfnz(v(idx):v(idx + 1));
end
t.(fn{z}) = vertcat(t.(fn{z}), tmpC{:}); % write extracted values into the new struct to the z-th variable
end
The iterative growing or shrinking of arrays is extremely expensive. So avoid things like this:
s.(fn{z})(i+1:i+(hz-1)) = [];
A tiny example:
x = [];
for k = 1:1e6
x(k) = k;
end
This does not request 8 MB (8 bytes per double), but sum(1:1e6)*8 bytes, because in each iteration a new array is created and the old contents are copied. This means more than 4 TB moved through RAM! Of course this is slow. With pre-allocation and without a growing array, Matlab requests the expected 8 MB only once: x = zeros(1, 1e6). The same effect applies to iterative shrinking.
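For contrast, the pre-allocated version of the same loop requests the 8 MB once:

```matlab
x = zeros(1, 1e6);   % one allocation of 8 MB up front
for k = 1:1e6
    x(k) = k;        % in-place write, no reallocation, no copying
end
```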
In Matlab R2018b, vertcat still has some potential for improvement. It might then be an idea to test the speed with https://www.mathworks.com/matlabcentral/fileexchange/28916-cell2vec .
  1 Comment
Felix Schönig
Felix Schönig on 11 Dec 2020
Hello Jan,
great example, I am beginning to understand! I just tested and compared your code in a copy; now, however, I am confused again: it is approximately half a second slower than my "bad first solution" (5.3 s vs 5.7.. s). Any idea why?
However, you made a clear point about the growing/shrinking of arrays. I will rework my code to avoid this. I have a lot to digest right now, still busy with Bruno's hint.
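One possible way to avoid the in-loop shrinking for the 2-Hz averaging (a sketch, not code from the thread) is a vectorized reshape/mean:

```matlab
% Pairwise 2-Hz -> 1-Hz averaging without growing/shrinking arrays.
% Assumes y is a column vector with an even number of samples.
y  = (1:8)';                    % example 2-Hz signal
y1 = mean(reshape(y, 2, []))';  % pairwise means -> [1.5; 3.5; 5.5; 7.5]
```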
Thank you!


More Answers (0)

Release

R2020a
