Why is the DatasetRef 'get' method faster with an index than with a name?

I would like to load some elements from a big dataset (output of a Simulink simulation).
I decided to use the Simulink.SimulationData.DatasetRef() function to avoid loading the entire dataset into my workspace. For example:
ref_data = Simulink.SimulationData.DatasetRef(saving_path, "logout");
Then, I tried to use the get() method of the DatasetRef to load some elements. I noticed that if I pass the element name, the method is slow, whereas if I pass the element index, the method is much faster.
Here is an example:
clear
saving_path = 'dataset.mat';
el_name = 'el_name';
ref_data = Simulink.SimulationData.DatasetRef(saving_path, "logout");
tic
a = ref_data.get(el_name).Values;
disp('Time with name:')
toc
tic
index = find(strcmp(ref_data.getElementNames, el_name));
b = ref_data.get(index).Values;
disp('Time with index:')
toc
if isequal(a,b)
disp('a and b are equal')
end
The result is:
Time with name:
Elapsed time is 4.327172 seconds.
Time with index:
Elapsed time is 0.035908 seconds.
a and b are equal
(tested in MATLAB R2024b and MATLAB R2022b)
Why does the call with the element name take much more time?
The solution with the index is simple and effective, but less readable.
  3 Comments
Marco on 4 Jun 2025
Thank you very much for your help!
I switched the order of the calls and ran each method twice.
The get() calls with the index are still much faster, and repeating the call by name does not make it any faster:
New code:
clear
saving_path = 'database.mat';
el_name = 'el_name';
ref_data = Simulink.SimulationData.DatasetRef(saving_path, "logout");
tic
index = find(strcmp(ref_data.getElementNames, el_name));
b = ref_data.get(index).Values;
disp('1) Time with index:')
toc
tic
index = find(strcmp(ref_data.getElementNames, el_name));
b = ref_data.get(index).Values;
disp('2) Time with index:')
toc
tic
a = ref_data.get(el_name).Values;
disp('3) Time with name:')
toc
tic
a = ref_data.get(el_name).Values;
disp('4) Time with name:')
toc
if isequal(a,b)
disp('a and b are equal')
end
New results:
1) Time with index:
Elapsed time is 0.035707 seconds.
2) Time with index:
Elapsed time is 0.026322 seconds.
3) Time with name:
Elapsed time is 5.260193 seconds.
4) Time with name:
Elapsed time is 5.448796 seconds.
a and b are equal
Walter Roberson on 4 Jun 2025
Interesting. I would expect minor differences, but nowhere near the difference that you see.


Accepted Answer

Ronit on 16 Jul 2025
Edited: Ronit on 16 Jul 2025
Hello @Marco,
This slowdown happens because 'get(name)' does a linear search through all element names each time, which is slow for large datasets. In contrast, 'get(index)' directly accesses the element, making it much faster.
If you need to access elements by name but want better speed, I recommend building your own mapping at the start as a workaround:
names = ref_data.getElementNames;
name2idx = containers.Map(names, 1:numel(names));
idx = name2idx(el_name);
a = ref_data.get(idx).Values;
Setting up the map is also linear time, but you only pay this cost once. The big advantage is that all subsequent lookups by name are extremely fast (constant time), rather than slow linear searches every time. This makes a big difference if you need to access many elements by name repeatedly.
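To illustrate the repeated-lookup case, here is a minimal sketch that builds the map once and then fetches several elements by name through get(index). The file name, the "logout" variable name and the entries of 'wanted' are placeholders, and the sketch assumes the element names in the dataset are unique and non-empty; it is not meant as a definitive implementation.

```matlab
% Sketch only: assumes element names are unique and non-empty.
% 'dataset.mat', "logout" and the names in 'wanted' are placeholders.
ref_data = Simulink.SimulationData.DatasetRef('dataset.mat', "logout");

% Build the name -> index map once (linear in the number of elements).
names = ref_data.getElementNames;
name2idx = containers.Map(names, 1:numel(names));

% Fetch several elements by name; each access goes through get(index).
wanted = {'el_name', 'other_el_name'};   % hypothetical element names
values = cell(size(wanted));
for k = 1:numel(wanted)
    if isKey(name2idx, wanted{k})
        values{k} = ref_data.get(name2idx(wanted{k})).Values;
    else
        % Leave values{k} empty and report the missing name.
        warning('Element "%s" not found in the dataset.', wanted{k});
    end
end
```

The isKey check avoids the error that indexing a containers.Map with a missing key would raise. If names can repeat in your dataset, keep in mind that a map holds only one index per name, so the find(strcmp(...)) approach from the question may be the safer choice in that case.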
Please refer to the documentation page for 'containers.Map' for more details: https://www.mathworks.com/help/matlab/ref/containers.map.html
I hope this helps with your query!

Release

R2024b
