Why is str2num not recommended when it is faster in certain circumstances?
I have a cell array of millions of strings representing dates with the format "yyyyMMddHHmmss".
I need to convert these to datetimes. After many attempts of various kinds I think I have found an optimal solution.
However, my solution requires that I use str2num instead of str2double, which gives a nearly 100x increase in speed. This is despite the fact that MATLAB recommends using str2double for "faster performance" and specifically discourages str2num.
In particular, str2double cannot convert a char array and results in Inf, while str2num converts the char array without issue.
Below is an example script.
(P.S. not directly related to this question, but if there is a faster way to convert cell arrays of strings to datetimes, let me know!)
%Make example input data
t1 = datetime(2000,1,1,0,0,0);
t2 = datetime("now");
t = string(datestr(t1:days(1):t2,'yyyymmddHHMMss'));
t_cell = cellstr(t); %<--- This is the "cell array" example data
%Option 1: datetime cell array of strings
% ---> Very slow (~0.75 s)
tic
tcheck1 = datetime(t_cell,'InputFormat','yyyyMMddHHmmss');
toc
%Option 2: str2double
% ---> DOES NOT WORK (results in Inf)
tic
da = char(t_cell);
year1 = str2double(da(:,1:4));
month1 = str2double(da(:,5:6));
day1 = str2double(da(:,7:8));
hour1 = str2double(da(:,9:10));
min1 = str2double(da(:,11:12));
sec1 = str2double(da(:,13:14));
tcheck2 = datetime(year1,month1,day1,hour1,min1,sec1);
toc
%Option 3: str2double with extra conversion from char to string
% ---> About twice as fast as Option #1 (~0.4 s)
tic
da = char(t_cell);
year1 = str2double(string(da(:,1:4)));
month1 = str2double(string(da(:,5:6)));
day1 = str2double(string(da(:,7:8)));
hour1 = str2double(string(da(:,9:10)));
min1 = str2double(string(da(:,11:12)));
sec1 = str2double(string(da(:,13:14)));
tcheck3 = datetime(year1,month1,day1,hour1,min1,sec1);
toc
%Option 4: str2num
% ---> About 100 times faster than Option #1 and #3 (~0.005 s)
tic
da = char(t_cell);
year1 = str2num(da(:,1:4));
month1 = str2num(da(:,5:6));
day1 = str2num(da(:,7:8));
hour1 = str2num(da(:,9:10));
min1 = str2num(da(:,11:12));
sec1 = str2num(da(:,13:14));
tcheck4 = datetime(year1,month1,day1,hour1,min1,sec1);
toc
Accepted Answer
More Answers (1)
Umar
on 4 Jul 2024
Hi Darcy,
That is a very good catch. You asked: why is str2num not recommended when it is faster in certain circumstances?
While str2num may offer speed advantages in specific scenarios, its drawbacks in terms of error handling, ambiguity, flexibility, readability, and future compatibility outweigh the performance gains.
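One concrete reason behind the recommendation: str2num is documented to use eval, so it evaluates whatever expression the text contains, whereas str2double only parses a single number. That is also why str2num accepts a multi-row char array: each row is evaluated as one row of a matrix. A minimal sketch of the difference (the variable names and the undefined identifier are made up for illustration):

```matlab
% str2num evaluates its input as a MATLAB expression (it is built on eval).
a = str2num('pi');          % evaluates the name pi
b = str2double('pi');       % 'pi' is not a numeric literal, so this is NaN

% A multi-row char array is evaluated row by row, giving a column vector.
rows = ['2000'; '2024'];
c = str2num(rows);          % 2-by-1 numeric column

% The second output reports whether evaluation succeeded.
% no_such_variable_xyz is a hypothetical, undefined name.
[d, ok] = str2num('no_such_variable_xyz');   % d is empty, ok is false
```

Because the text is evaluated, input that happens to match a function or variable name is executed rather than rejected, which is the main risk when the strings come from an external file.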
You also asked (not directly related) whether there is a faster way to convert cell arrays of strings to datetimes.
One efficient approach is to utilize vectorized operations and built-in functions provided by MATLAB.
Preallocating memory for the datetime array before conversion can improve performance. This can be achieved by initializing an empty datetime array with the desired size before populating it with converted values.
For even faster conversion of large datasets, consider leveraging MATLAB's Parallel Computing Toolbox. By parallelizing the conversion process, you can distribute the workload across multiple cores or workers, significantly reducing processing time.
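As one possible vectorized route that avoids eval entirely, the fixed-width digits can be converted with plain arithmetic: subtracting '0' from the char matrix gives the digit values, and a matrix product with powers of ten assembles each field. This is only a sketch. It assumes every entry is exactly 14 digit characters, `t_cell` is a small stand-in for the data in the question, and it has not been timed against the options in the question.

```matlab
% Stand-in input: cell array of 'yyyyMMddHHmmss' character vectors.
t_cell = {'20000101000000'; '20240704153045'};

da = char(t_cell);                 % N-by-14 char matrix
assert(size(da,2) == 14 && all(isstrprop(da(:),'digit')), ...
    'Expected 14 digit characters per entry');

d = double(da) - double('0');      % N-by-14 matrix of digit values 0..9

% Combine the digit columns of each field with powers of ten.
year1  = d(:,1:4)   * [1000; 100; 10; 1];
month1 = d(:,5:6)   * [10; 1];
day1   = d(:,7:8)   * [10; 1];
hour1  = d(:,9:10)  * [10; 1];
min1   = d(:,11:12) * [10; 1];
sec1   = d(:,13:14) * [10; 1];

tcheck5 = datetime(year1, month1, day1, hour1, min1, sec1);
```

Comparing the result against `datetime(t_cell,'InputFormat','yyyyMMddHHmmss')` with `isequal` is a cheap way to confirm that both approaches agree on your data.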
Hope this answers your question.