Surprising speed differences when selecting from arrays

I have encountered a surprisingly large variation in speed when selecting from array data stored in different layouts. The most natural way to store the data for my problem is as a three-dimensional array, essentially several pages of matrix data. My code needs to make selections based on the values in columns of this data, with each column's selection depending on the values stored in the same column of every page. The original code is rather long, but this example demonstrates the essence of what I have observed.
function slowIndexCheck(np,nk)
% Time the same column selection for five different storage layouts.
vals = rand(np,nk,3);          % np-by-nk data on 3 pages
thresh = [0.3,0.4,0.5];        % one threshold per page

% Alternative layouts holding exactly the same data
valsStack = [vals(:,:,1),vals(:,:,2),vals(:,:,3)];   % pages side by side
vals1 = squeeze(vals(:,:,1));
vals2 = squeeze(vals(:,:,2));
vals3 = squeeze(vals(:,:,3));
valsCell = cell(3,1);
valsCell{1} = vals1;
valsCell{2} = vals2;
valsCell{3} = vals3;

t = zeros(5,1);                % elapsed time per variant
str = cell(5,1);               % label per variant

% 1) Index the 3-D array directly
tic
for il = 1:nk
    sel = false(np,1);
    for jl = 1:3
        sel = sel | vals(:,il,jl) > thresh(jl);
    end
end
t(1) = toc;
str{1} = 'Paged matrices: ';

% 2) Index the stacked 2-D matrix with a column offset per page
tic
for il = 1:nk
    sel = false(np,1);
    for jl = 1:3
        sel = sel | valsStack(:,il+(jl-1)*nk) > thresh(jl);
    end
end
t(2) = toc;
str{2} = 'Stacked matrices: ';

% 3) Index three separate matrices (inner loop unrolled)
tic
for il = 1:nk
    sel = false(np,1);
    sel = sel | vals1(:,il) > thresh(1);
    sel = sel | vals2(:,il) > thresh(2);
    sel = sel | vals3(:,il) > thresh(3);
end
t(3) = toc;
str{3} = 'Separate matrices: ';

% 4) Index directly into the cell contents
tic
for il = 1:nk
    sel = false(np,1);
    for jl = 1:3
        sel = sel | valsCell{jl}(:,il) > thresh(jl);
    end
end
t(4) = toc;
str{4} = 'Cell stored matrices: ';

% 5) Assign the cell contents to a temporary variable, then index that
tic
for il = 1:nk
    sel = false(np,1);
    for jl = 1:3
        vTmp = valsCell{jl};
        sel = sel | vTmp(:,il) > thresh(jl);
    end
end
t(5) = toc;
str{5} = 'Cell stored matrices + temporary intermediate variable: ';

% Report the timings
disp(' Matlab surprising array selection speed demonstration')
disp('===========================================================');
disp(['np = ',num2str(np),', nk = ',num2str(nk)]);
disp(' ')
for il = 1:5
    disp([str{il},num2str(t(il)),' s']);
end
Running the code for np=1000000, nk=100 on my i9-9940X desktop produces the following results:
>> slowIndexCheck(1000000,100)
Matlab surprising array selection speed demonstration
===========================================================
np = 1000000, nk = 100
Paged matrices: 1.4364 s
Stacked matrices: 0.075995 s
Separate matrices: 0.059846 s
Cell stored matrices: 1.4534 s
Cell stored matrices + temporary intermediate variable: 0.071385 s
As can be seen, there is a factor of more than 20 between the fastest and slowest approaches, and the results fall into two clear groups: roughly 0.06-0.08 s for the stacked, separate, and cell + temporary variable variants, and roughly 1.4-1.5 s for the paged and direct cell variants. So besides the slow performance of the paged matrix approach, I also get slow performance from indexing into a cell array, but, counter-intuitively, far faster performance if I first assign the cell contents to a new variable, despite the apparent overhead this might create.

If I run the code as a script rather than calling it as a function, all variants take about the same time (around 1.4-1.5 s). My understanding is that scripts are interpreted whereas functions are put through the JIT compiler, so my best guess is that this is related to optimizations that the JIT compiler can or cannot make. If so, it still seems surprising that the cell array approach is slow whereas the cell array + temporary variable approach is fast. Of course, this may have nothing to do with the JIT compiler. The main point is that the result is very surprising, and I would like to know whether I have missed something, as it will affect how I write code in future.
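For reference, the selection that every timed loop computes can be written out on a small array. This is only a sketch to make the logic explicit, not a sixth timed variant: the sizes np = 5 and nk = 4, and the names selLoop and selVec, are illustrative and do not appear in my real code, and the vectorised line assumes implicit expansion (R2016b or later). I have not timed the vectorised form.

```matlab
% Small sizes so the check runs instantly; values are illustrative only.
np = 5; nk = 4;
vals = rand(np,nk,3);
thresh = [0.3,0.4,0.5];

% Loop form, as in slowIndexCheck, but keeping one logical column per il.
selLoop = false(np,nk);
for il = 1:nk
    sel = false(np,1);
    for jl = 1:3
        sel = sel | vals(:,il,jl) > thresh(jl);
    end
    selLoop(:,il) = sel;
end

% Vectorised form: compare each page with its own threshold,
% then OR across the page dimension.
selVec = any(vals > reshape(thresh,1,1,3), 3);

% Both forms should select exactly the same elements.
assert(isequal(selLoop, selVec));
```

A row of column il is selected if the value on any of the three pages exceeds that page's threshold, which is the same test in all five layouts.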

Release: R2019a
