- indexing one numeric array vs
- indexing into one struct array and then indexing into a numeric array.
Numeric vector indexing slows down when the vector is a struct field. Why?
5 views (last 30 days)
Show older comments
In the speed comparision below, the only difference between Version 1 and Version 2 is that the vector being updated is the field of a struct. I thought struct fields were supposed to behave essentially the same as individual variables, and that dot indexing was just a way of giving a common prefix to their names, in this case "c.". Why then does Version 2 run 6 times more slowly?
clear, close all
N_range = 1e5:1e5:1e6;
times = zeros(1, length(N_range));
%%Version 1
N_i = 1;
for N = N_range
y=zeros(1,N,'uint32');
tic
for i=2:N
y(i)=y(i-1)*rand;
end
times(N_i) = toc;
N_i = N_i + 1;
end
times0=times;
%%Version 2
N_i = 1;
for N = N_range
c.y=zeros(1,N,'uint32');
tic
for i=2:N
c.y(i)=c.y(i-1)*rand;
end
times(N_i) = toc;
N_i = N_i + 1;
end
times1=times;
plot(1:N_i-1,times0, 1:N_i-1, times1); legend('Version 1', 'Version 2','location','northwest')
2 Comments
Stephen23
on 21 Mar 2025
Edited: Stephen23
on 21 Mar 2025
"...and that dot indexing was just a way of giving a common prefix to their names"
No, dot indexing is not a "prefix to their names". Dot indexing is indexing. Which means your comparison is
In other words, with a structure MATLAB must first find the location of the nested numeric array in memory before it can index into it, and that dereferencing takes a finite non-zero amount of time. Calling SUBSREF/SUBSASGN requires some time regardless of the array type.
Lets also compare against cell indexing:
R = 1e5:1e5:1e6;
T = nan(1,numel(R));
% Version 1
for k = 1:numel(R)
n = R(k);
y = zeros(1,n,'uint32');
tic
for ii = 2:n
y(ii) = y(ii-1)*rand;
end
T(k) = toc;
end
times1 = T;
% Version 2
for k = 1:numel(R)
n = R(k);
s = struct('y',zeros(1,n,'uint32'));
tic
for ii = 2:n
s.y(ii) = s.y(ii-1)*rand;
end
T(k) = toc;
end
times2 = T;
% Version 3
for k = 1:numel(R)
n = R(k);
c = {zeros(1,n,'uint32')};
tic
for ii = 2:n
c{1}(ii) = c{1}(ii-1)*rand;
end
T(k) = toc;
end
times3 = T;
plot(R,times1, R,times2, R,times3);
legend('1 x indexing', '2 x indexing (struct)', '2 x indexing (cell)', 'location','northwest')
See also:
Walter Roberson
on 21 Mar 2025
By the way, you get the same results as @Stephen23 showed if you put the work into functions -- with the idea being that functions would be optimized by the Execution Engine even if scripts are not optimized.
R = 1e5:1e5:1e6;
T = nan(1,numel(R));
% Version 1
for k = 1:numel(R)
n = R(k);
T(k) = ver1(n);
end
times1 = T;
% Version 2
for k = 1:numel(R)
n = R(k);
T(k) = ver2(n);
end
times2 = T;
% Version 3
for k = 1:numel(R)
n = R(k);
T(k) = ver3(n);
end
times3 = T;
plot(R,times1, R,times2, R,times3);
legend('1 x indexing', '2 x indexing (struct)', '2 x indexing (cell)', 'location','northwest')
function time = ver1(n)
y = zeros(1,n,'uint32');
tic
for ii = 2:n
y(ii) = y(ii-1)*rand;
end
time = toc;
end
function time = ver2(n)
s = struct('y',zeros(1,n,'uint32'));
tic
for ii = 2:n
s.y(ii) = s.y(ii-1)*rand;
end
time = toc;
end
function time = ver3(n)
c = {zeros(1,n,'uint32')};
tic
for ii = 2:n
c{1}(ii) = c{1}(ii-1)*rand;
end
time = toc;
end
Accepted Answer
Walter Roberson
on 21 Mar 2025
I thought struct fields were supposed to behave essentially the same as individual variables, and that dot indexing was just a way of giving a common prefix to their names
If we temporarily neglect potential effects of JIT compilation from the execution engine:
Direct variable use requires examining the symbol table to find the variable name, then examining the index given, validating the index, performing the indexing and building a new anonymous scalar variable containing the value, then using the anonymous scalar in the expression.
Structure dot indexing requires examining the symbol table to find the structure name, then examining the field name given, looking up the field name to get an offset into the struct, looking at the offset to find an anonymous variable, then examining the index given, validating performing the indexing and building a new anonymous scalar variable containing the value, then using the anonymous scalar in the expression.
The main difference is in the time taken to look up the field name in the struct, and the slower access because the variable name does not contain the value directly and instead the indexed variable name contains the value.
The effects of JIT compilation in these two cases is unknown. We can suspect that iterative indexing will be optimized. We can guess that maybe structure indexing with fixed field will be well-optimized... but unfortunately we just do not know.
The extra time taken to look up the field name and find the referenced value could account for the difference in timing.
3 Comments
James Tursa
on 21 Mar 2025
It's been this way for a long time, which is why I got in the habit of assigning struct variables into temp variables before loops like this. E.g., cy = c.y, then index into cy. And yes, it would be nice if the JIT could figure this out instead of the user writing workaround code.
More Answers (0)
See Also
Categories
Find more on Matrix Indexing in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!