readtable of csv file with opts.DataLines =[n1 n2] and n1>2 doesn't work as expected
Hello,
I'm trying to read a CSV file in blocks. According to the documentation, this should work:
dir_load='some_dir';
file='some_file';
filename=fullfile(dir_load,file);
opts = detectImportOptions(filename);
opts.DataLines = [1 10];
T1=readtable(filename,opts);
opts.DataLines = [11 20];
T2=readtable(filename,opts);
opts.DataLines = [1 20];
T=readtable(filename,opts);
So T should equal "[T1;T2]", but what I get is that T1 contains rows 1 to 10 of T while T2 contains rows 6 to 15. What am I doing wrong? You can find the file here.
T =
20x4 table
Var1 Var2 Var3 Var4
_____ _____ ____ _____
21.57 22.65 514 0.104
21.57 22.65 502 0.106
21.57 22.65 498 0.114
21.57 22.65 491 0.121
21.57 22.65 486 0.118
21.57 22.65 487 0.121
21.57 22.65 483 0.127
21.57 22.65 486 0.125
21.57 22.65 487 0.125
21.63 22.65 485 0.131
21.63 22.65 491 0.13
21.63 22.65 489 0.127
21.63 22.65 493 0.134
21.63 22.65 497 0.135
21.63 22.65 496 0.131
21.63 22.63 502 0.135
21.63 22.63 503 0.139
21.63 22.63 503 0.134
21.63 22.63 508 0.136
21.63 22.63 505 0.142
T1 =
10x4 table
Var1 Var2 Var3 Var4
_____ _____ ____ _____
21.57 22.65 514 0.104
21.57 22.65 502 0.106
21.57 22.65 498 0.114
21.57 22.65 491 0.121
21.57 22.65 486 0.118
21.57 22.65 487 0.121
21.57 22.65 483 0.127
21.57 22.65 486 0.125
21.57 22.65 487 0.125
21.63 22.65 485 0.131
T2 =
10x4 table
Var1 Var2 Var3 Var4
_____ _____ ____ _____
21.57 22.65 487 0.121
21.57 22.65 483 0.127
21.57 22.65 486 0.125
21.57 22.65 487 0.125
21.63 22.65 485 0.131
21.63 22.65 491 0.13
21.63 22.65 489 0.127
21.63 22.65 493 0.134
21.63 22.65 497 0.135
21.63 22.65 496 0.131
edited by Guillaume to attach the file to the question. Please don't use external file sharing sites
Accepted Answer
Guillaume
on 25 Jun 2019
If you look at the actual content of the file, you see that it has a blank line between each line of data. Although blank lines are ignored by default during reading, they still count for the purpose of line counting, so it's normal that line 11 is only the 6th line of data (because of the 5 blank lines ignored).
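To make the counting concrete, here is a minimal sketch that does not touch the real file. It builds a hypothetical stand-in (the variable names and the 10-row size are my assumptions, not taken from the attachment) with data on every odd line and a blank line after each, then maps physical line numbers to data rows:

```matlab
% Hypothetical stand-in for the file: 10 data rows, each followed by a blank line.
nRows = 10;
fileLines = strings(2*nRows, 1);            % physical lines, all blank to start
fileLines(1:2:end) = "row " + (1:nRows)';   % data goes on the odd lines

lineNumber   = (1:2*nRows)';                       % physical line numbers
isData       = strlength(strtrim(fileLines)) > 0;  % true where a line holds data
dataRowIndex = cumsum(double(isData));             % running count of data rows

% Data rows found inside physical lines 1..10 and 11..20
rowsInFirstBlock  = dataRowIndex(isData & (lineNumber <= 10));
rowsInSecondBlock = dataRowIndex(isData & (lineNumber >= 11));
```

With this layout, physical lines 1 to 10 hold only data rows 1 to 5, and physical line 11 is data row 6, which is why a block starting at line 11 begins with the 6th row of the table.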
Now, there is indeed a bug with the end point of DataLines. For me (R2019a), I get
>> opts.DataLines = [1 10];
>> readtable('ex.txt', opts)
ans =
5×4 table
Var1 Var2 Var3 Var4
_____ _____ ____ _____
21.57 22.65 514 0.104
21.57 22.65 502 0.106
21.57 22.65 498 0.114
21.57 22.65 491 0.121
21.57 22.65 486 0.118
>> opts.DataLines = [11 20];
>> readtable('ex.txt', opts)
ans =
10×4 table
Var1 Var2 Var3 Var4
_____ _____ ____ _____
21.57 22.65 487 0.121
21.57 22.65 483 0.127
21.57 22.65 486 0.125
21.57 22.65 487 0.125
21.63 22.65 485 0.131
21.63 22.65 491 0.13
21.63 22.65 489 0.127
21.63 22.65 493 0.134
21.63 22.65 497 0.135
21.63 22.65 496 0.131
The result with DataLines = [1 10] is what I expected. The result with DataLines = [11 20] has too many rows: lines 11 to 20 hold only 5 rows of data, yet 10 are returned.
I will investigate a bit more, then report it to MathWorks.
2 Comments
Guillaume
on 25 Jun 2019
Yes, the problem only shows if there are blank lines (or any skipped lines under the EmptyLineRule of the importoptions).
I've reported the bug.
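In the meantime, one possible workaround, assuming the file fits in memory, is to read everything with an open-ended DataLines range and split the table by row index afterwards. This is only a sketch: the temporary file, its contents and blockSize are hypothetical stand-ins, and I have not checked it against every release.

```matlab
% Hypothetical stand-in file: 20 numeric rows, each followed by a blank line.
filename = [tempname '.csv'];
fid = fopen(filename, 'w');
for k = 1:20
    fprintf(fid, '%g,%g,%d,%g\n\n', 21.5, 22.6, 500 + k, 0.1);
end
fclose(fid);

opts = detectImportOptions(filename);
opts.DataLines = [1 Inf];            % open-ended range, so no end line is specified
T = readtable(filename, opts);       % blank lines are skipped (default EmptyLineRule)

blockSize = 10;                          % rows per block (assumed value)
T1 = T(1:blockSize, :);                  % data rows 1..10
T2 = T(blockSize+1:2*blockSize, :);      % data rows 11..20

delete(filename);                    % remove the temporary file
```

Because the blocks are cut from the table rather than from the file, they are indexed by data row and not by physical line, so the blank lines no longer matter.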