single vs multiple fprintf() efficiency
Hi,
I want to write some text data (integer and string types) to a .txt file using fprintf(), and I am looking for the most efficient way of doing so. I was under the impression that a single fprintf() call would be more efficient than multiple calls, but that is not the case, as the code below shows.
This first approach takes approx 4 seconds to write.
str1 = "str1";
str2 = "str2str2";
str3 = "str3str3str3";
str4 = "str4str4str4str4";
str5 = "str5str5str5str5str5";
formatSpec = '%d%s%s%s%s%s%04X\n';
fid = fopen("post.txt", 'Wt');
tic
for index = 1:10000
    fprintf(fid, formatSpec, index, str1, str2, str3, str4, str5, index);
end
toc
fclose(fid);
This second approach takes approx 1 sec to write although it has multiple calls to fprintf().
str1 = "str1";
str2 = "str2str2";
str3 = "str3str3str3";
str4 = "str4str4str4str4";
str5 = "str5str5str5str5str5";
str_newline = "\n";
fid = fopen("post.txt", 'Wt');
tic
for index = 1:10000
    fprintf(fid, '%d', index);
    fprintf(fid, str1);
    fprintf(fid, str2);
    fprintf(fid, str3);
    fprintf(fid, str4);
    fprintf(fid, str5);
    fprintf(fid, '%04X', index);
    fprintf(fid, str_newline);
end
toc
fclose(fid);
I am aware that I am not specifying the format in the middle calls. But the file written is the same in both cases.
- What could be the reason for this behavior?
- How can I skip the format specifiers for some of the fields (e.g. the string fields) in formatSpec in approach 1?
- How do I specify formatSpec if I want to pass string arrays instead of single strings, so that all the string data is written in one go? (I get an error when I try.)
- What would be the most efficient way to write data in my use case (integer and string fields)?
Thanks.
2 Comments
Walter Roberson
on 7 Feb 2018
"I am aware that I am not specifying the format in the middle calls. But the file written is the same in both cases."
... but would not be if the strings contained any % or \ characters.
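A small sketch of this caveat. It uses sprintf, which as far as I know interprets formatSpec the same way fprintf does, so the result can be inspected directly. The sample texts are made up for illustration and do not come from the question:

```matlab
% Hypothetical sample texts containing '\' and '%' (not from the question).
p = 'C:\temp\new';
q = '100%%';

asFormat1 = sprintf(p);        % p is parsed as a format: '\t' -> tab, '\n' -> line feed
asData1   = sprintf('%s', p);  % p is passed as data and copied unchanged

asFormat2 = sprintf(q);        % '%%' collapses to a single '%'
asData2   = sprintf('%s', q);  % both '%' characters are kept
```

With strings like "str1" there is nothing to interpret, which is why both approaches in the question produce the same file.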
Answers (2)
Jos (10584)
on 7 Feb 2018
You can skip the format identifier for string inputs, simply because fprintf uses the format identifier to transform its inputs into strings. No need to do that when the inputs are strings already ...
So, use fprintf(fid,'%s',str#) for a fair comparison regarding timings. You'll probably see that the second code will run slower :)
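A minimal sketch of such a comparison, assuming temporary files from tempname are acceptable in place of post.txt. The variable names tSingle, tMulti and sameContent are only for illustration, and the measured times will differ between machines and runs:

```matlab
str1 = "str1";
str2 = "str2str2";
str3 = "str3str3str3";
str4 = "str4str4str4str4";
str5 = "str5str5str5str5str5";
n = 10000;

% Variant 1: one fprintf call per line.
fileA = [tempname, '.txt'];
fid = fopen(fileA, 'Wt');
tic
for index = 1:n
    fprintf(fid, '%d%s%s%s%s%s%04X\n', index, str1, str2, str3, str4, str5, index);
end
tSingle = toc;
fclose(fid);

% Variant 2: several fprintf calls per line, each string passed through '%s'.
fileB = [tempname, '.txt'];
fid = fopen(fileB, 'Wt');
tic
for index = 1:n
    fprintf(fid, '%d', index);
    fprintf(fid, '%s', str1);
    fprintf(fid, '%s', str2);
    fprintf(fid, '%s', str3);
    fprintf(fid, '%s', str4);
    fprintf(fid, '%s', str5);
    fprintf(fid, '%04X\n', index);
end
tMulti = toc;
fclose(fid);

% Both files should contain exactly the same text.
sameContent = isequal(fileread(fileA), fileread(fileB));
fprintf('single call: %.3f s, multiple calls: %.3f s, same content: %d\n', ...
    tSingle, tMulti, sameContent);
delete(fileA);
delete(fileB);
```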
4 Comments
Walter Roberson
on 7 Feb 2018
I wonder how the timing changes if you use character vectors instead of string objects?
Jan
on 7 Feb 2018
Edited: Jan
on 7 Feb 2018
Replace
fprintf(fid, str1);
with
fwrite(fid, str1, 'char');
to prevent fprintf from trying to parse the string.
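A small sketch of the fwrite variant, using a character vector and a temporary file. The sample text and the names txt, fname and readBack are made up for illustration:

```matlab
% Hypothetical sample text containing '%' and the two characters '\' and 'n'.
txt = '100% of C:\new';

fname = [tempname, '.txt'];
fid = fopen(fname, 'w');
fwrite(fid, txt, 'char');    % characters are written as they are, nothing is parsed
fclose(fid);

readBack = fileread(fname);  % should be identical to txt
delete(fname);
```

Because nothing is interpreted, a line break has to be written explicitly, for example with fwrite(fid, char(10), 'char').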
You can omit the format specifiers for static strings by inserting them into the format string once, outside the loop:
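str1 = 'str1';
str2 = 'str2str2';
str3 = 'str3str3str3';
str4 = 'str4str4str4str4';
str5 = 'str5str5str5str5str5';
lineBreak = char([13, 10]); % hard-coded CR+LF; the name avoids shadowing the built-in newline
fid = fopen('post.txt', 'W');
formatSpec2 = '%%d%s%s%s%s%s%%04X'; % '%%d' and '%%04X' survive sprintf as '%d' and '%04X'
% Only valid while the strings themselves contain no '%' or '\' characters
formatSpec = [sprintf(formatSpec2, str1, str2, str3, str4, str5), lineBreak];
index = 1:10000;
fprintf(fid, formatSpec, [index; index]); % one column per line: feeds both %d and %04X
fclose(fid);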
The idea is: Create the fixed parts once only. With your example code, you can even omit the loop.
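Omitting the loop relies on fprintf and sprintf reusing the format for as long as data is left, taking the values in column order. A short sketch with sprintf, so the result can be checked without a file. Since the index appears twice per line, it is passed as a two-row matrix. The range 9:11 is chosen only to make the hexadecimal field visible:

```matlab
index = 9:11;

% Each column of [index; index] fills one line:
% the first row feeds %d, the second row feeds %04X.
txt = sprintf('%dstr1%04X\n', [index; index]);

% Passing only "index" would pair 9 with 10 on the first line instead.
```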
Text mode requires the line breaks to be detected while writing. In binary mode with hard-coded line breaks, the code should be faster and does not depend on the platform.
Remember that the timing of file access also depends on the operating system and the hard disk: there are caches in the OS and on the disk. Creating the same file twice with the same method can take very different amounts of time.
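One way to reduce the influence of such effects is to repeat the measurement and look at the spread. A minimal sketch, assuming five repetitions and a temporary file are acceptable. nRuns, times and fname are illustrative names:

```matlab
nRuns = 5;                    % assumed number of repetitions
times = zeros(1, nRuns);
fname = [tempname, '.txt'];

for k = 1:nRuns
    tic
    fid = fopen(fname, 'W');  % 'W': write without automatic flushing
    fprintf(fid, '%d\n', 1:10000);
    fclose(fid);              % buffered data is written out here, so time it too
    times(k) = toc;
end
delete(fname);

fprintf('median %.4f s, min %.4f s, max %.4f s\n', ...
    median(times), min(times), max(times));
```

The first run is often the slowest, so the median or minimum tends to be more informative than a single measurement.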