Why is subscripted assignment so inefficient?

5 views (last 30 days)
Kenneth Johnson on 5 Feb 2022
Commented: Walter Roberson on 6 Feb 2022
Why is MATLAB's subscripted assignment so abysmally inefficient with more than 2 subscripts? e.g.,
>> clear, a = []; tic, for j = 1e3:-1:1, for k = 1e3:-1:1, a(j,k,1) = 1; end, end, toc
Elapsed time is 0.193743 seconds.
>> clear, a = []; tic, for j = 1e3:-1:1, for k = 1e3:-1:1, a(j,k) = 1; end, end, toc
Elapsed time is 0.004241 seconds.
Walter Roberson on 5 Feb 2022
Interesting. I thought for a moment that it might be due to the extra multiplication and addition needed each time, but when I rewrite the code in terms of linear indexing as-if a 3D array, then I do not see nearly that much performance drop.

Matt J on 5 Feb 2022
Edited: Matt J on 5 Feb 2022
I would guess it's only because you are using 3 subscripts on an array that only has 2 dimensions. This forces Matlab to to do extra work, namely running a check to make sure that in a(j,k,m) the 3rd subscript is m=1 and simplifying the subscripts from (j,k,1) to (j,k). When a actualy does have 3 dimensions, the times are much more comparable:
a=zeros(1e3,1e3,2);
tic,
for j = 1e3:-1:1
for k = 1e3:-1:1
a(j,k,1) = 1;
end
end
toc
Elapsed time is 0.019444 seconds.
a=zeros(1e3,1e3);
tic,
for j = 1e3:-1:1
for k = 1e3:-1:1
a(j,k) = 1;
end
end
toc
Elapsed time is 0.016890 seconds.
Walter Roberson on 6 Feb 2022
Unfortunately I am running into time limits a lot in exploring the timing of allocations. With linear increasing allocations, if we assume approximately linear time, then we run into the 1t+2t+3t+4t+...n*t = t*n*(n+1)/2 problem, that the time needed to explore the suspected linear relation in a linear manner, requires squared time. I think I might need to switch to exponential increases in memory sizes to probe more closely.
N = 4;
G = 6;
start = datetime('now');
%timing pushed into a function to allow for potential JIT
t = run_timing(N, G);
stop = datetime('now');
elapsed = seconds(stop - start)
elapsed = 51.7211
plot(t);
xlabel('try #'); ylabel('seconds')
legend((1:G) + "G")
title('absolute time')
tmean = mean(t,1);
plot(tmean)
xlabel(' gigabytes'); ylabel('mean seconds')
p = polyfit(1:G, tmean, 1);
fprintf('roughly %g seconds + %g seconds per gigabyte\n', p(2), p(1))
roughly -0.00468463 seconds + 0.616974 seconds per gigabyte
function t = run_timing(N,G)
t = zeros(N,G);
for g = 1 : G
for n = 1 : N
tic();
initgig(g);
t(n,g) = toc;
end
end
end
%gigabyte 1073741824
%megabyte 1048576
function initgig(M)
a = ones(1,1073741824*M,'uint8');
end
Walter Roberson on 6 Feb 2022
There is a lot of noise in these measurements, and it is difficult to tell what is meaningful or not.
I skip the first few runs in the plot because in nearly all runs there is a bad peak early on.
The second plot here is the more interesting one, showing the log of how the time to allocate increases with amount requested. Noise affects it a lot, so I can only describe trends that I encountered a fair bit:
• time to allocate 1 byte starts off a bit higher
• time to allocate a quite small number of bytes but more than 1, falls a bit, then rises a bit.
• there does seem to be a peak at 256 bytes
• and then it does seem to fall a bit
• until at 2^14 there seems to be a distinct rise. This would correspond to 16384 bytes. I suspect that either 4096 or 16384 bytes is the size of an entry in the "small block pool"
• in some of the plots, there was a leveling from 2^14 to 2^16; the more fine-grained I draw the less I see this, so it might have been a statistical artificat
• from 2^16 onward, the log grows pretty much linearly as log2() of number of bytes increases. To phrase that a different way: from 2^16 onward, the allocation time grows linearly with number of bytes allocated. The actual boundary might possibly be 2^14
• so, below 2^14, there appear to be at least two different allocation strategies, but the boundaries between them are a bit difficult to find. If there were only one allocation strategy for that range, using a free block pool with entries of size 2^14, then you would expect the time to be constant up to that point, but there do seem to be peaks.
• I seem to recall that 256 bytes is the limit below which for some constant allocations and some colon expressions, that MATLAB keeps a copy around as an optimization, to be handed out when encountering another request with the same spelling (spacing even being important!). Perhaps that is why we see a blip at 256??
If someone cares to do a study of how small block allocation is done in MATLAB, I suggest 2^14 as the upper limit to study (and remember to take into account those new hidden copies that MATLAB recently starting taking for small constants.)
N = 50;
G = 2^24;
start = datetime('now');
%timing pushed into a function to allow for potential JIT
[t, gvals] = run_timing(N, G);
stop = datetime('now');
elapsed = seconds(stop - start);
display(elapsed)
elapsed = 0.5941
figure(1)
%tfilt = filloutliers(t, 'center','movmedian', 3, 1);
startat = 6;
semilogy(startat:N, t(startat:end,:));
xlabel('try #'); ylabel('seconds')
title('absolute time')
%legend("2^{"+string(gvals)+"}", 'NumColumns',4);
figure(2)
tmean = mean(t(startat:end,:),1);
semilogy(gvals, tmean)
xt = 1:ceil(max(gvals));
xticks(xt)
xticklabels("2^{"+string(xt)+"}");
xtickangle(45)
xlabel('bytes')
ylabel('mean seconds')
plin = polyfit(2.^gvals, tmean, 1);
%plog = polyfit(log2(gvals), log2(tmean),1);
fprintf('linear: roughly %g seconds + %g seconds per gigabyte\n', plin(2), plin(1)*2^30);
linear: roughly -4.90643e-08 seconds + 0.0490081 seconds per gigabyte
%fprintf('log: roughly %g seconds + %g seconds per gigabyte\n', 2.^plog(2), plog(1));
function [t,gvals] = run_timing(N,G)
%wb = waitbar(0, '0G');
%cleanMe = onCleanup(@()delete(wb));
gvals = 0:.1:ceil(log2(G));
ng = length(gvals);
t = zeros(N,ng);
for g = 1 : ng
gp = gvals(g);
gv = round(2.^gp);
%waitbar(0, wb, "2^{"+string(gp)+"}");
for n = 1 : N
tic();
initmem(gv);
t(n,g) = toc;
%waitbar(n/N, wb);
end
end
end
%gigabyte 1073741824
%megabyte 1048576
function initmem(M)
ma = ones(1,M,'uint8');
end

Categories

Find more on Performance and Memory in Help Center and File Exchange

R2021b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!