readtable is ignoring import options to get variable names

Am I being stupid or is this function not logical?
I want to import a csv file. There are 3 header lines. The actual variable names are on line 3. The units are on line 2. Line 1 is to be ignored.
If I just call opts = detectImportOptions(file) and then set opts.VariableNamesLine = 3 and opts.VariableUnitsLine = 2, it picks up the latter, completely ignores the former, and just uses the variable names it originally picked up on line 1.
If I call detectImportOptions(file,'NumHeaderLines',1), it then picks up the units line as the names.
If I do it again and tell it to skip 2 lines, it picks the right names. I can then set opts.VariableUnitsLine to 2 and it does go back and pick the units correctly.
So I do get what I want in the end, but the function doesn't seem to work as expected, i.e. create the options and then modify the line options. It seems that whatever it first picks up as the names gets set in stone and you can't do anything about it (except the sorta hacky way I just worked out).
  5 Comments
Adam Danz on 16 Aug 2018
You could anonymize the data or create a working example that has the same structure but fake data that produces the same behavior as your current file. Providing the relevant code would also be helpful.
jonas on 16 Aug 2018
I have not been able to solve it using readtable, but I was able to reproduce the problem easily using the attached textfile. So if anyone else wants to give it a try...
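The attachment itself is not reproduced in this thread. A minimal sketch of a stand-in file with the layout described in the question (line 1 ignored, units on line 2, names on line 3) might look like this; the file name and all values are made up for illustration:

```matlab
% Hypothetical test file: the name and contents are invented for illustration.
filename = fullfile(tempdir, 'header_test.csv');
fid = fopen(filename, 'w');
assert(fid ~= -1, 'Could not open %s for writing.', filename);
fprintf(fid, 'junk,to,ignore\n');      % line 1: to be ignored
fprintf(fid, 's,V,degC\n');            % line 2: units
fprintf(fid, 'Time,Voltage,Temp\n');   % line 3: variable names
fprintf(fid, '0.0,1.5,20.1\n');        % line 4 onward: data
fprintf(fid, '0.1,1.6,20.3\n');
fprintf(fid, '0.2,1.7,20.2\n');
fclose(fid);

% Show the raw file so the layout can be checked by eye.
type(filename)
```

Any of the detectImportOptions calls discussed here can then be pointed at filename.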


Accepted Answer

Adam Danz on 16 Aug 2018
Edited: Adam Danz on 16 Aug 2018
This works with Jonas's testfile.txt. You need to specify that there are 3 header lines when you call detectImportOptions().
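opts = detectImportOptions(filename, 'NumHeaderLines', 3);  % tell detection there are 3 header lines
opts.VariableNamesLine = 3;  % variable names are on line 3
opts.VariableUnitsLine = 2;  % units are on line 2
opts.VariableNames           % no semicolon: display the names stored in opts
c = readtable(filename, opts);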
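A quick way to check whether the names and units ended up where expected is to inspect the table properties after the read. This is only a sketch on a made-up three-column file (the file name and contents are hypothetical), and the result may differ between releases:

```matlab
% Hypothetical stand-in file: line 1 ignored, line 2 units, line 3 names.
filename = fullfile(tempdir, 'names_check.csv');
fid = fopen(filename, 'w');
assert(fid ~= -1, 'Could not open %s for writing.', filename);
fprintf(fid, '%s\n', 'junk,to,ignore', 's,V,degC', 'Time,Voltage,Temp', ...
    '0.0,1.5,20.1', '0.1,1.6,20.3', '0.2,1.7,20.2');
fclose(fid);

% Same sequence of calls as in the answer above.
opts = detectImportOptions(filename, 'NumHeaderLines', 3);
opts.VariableNamesLine = 3;
opts.VariableUnitsLine = 2;
c = readtable(filename, opts);

% Compare these against lines 3 and 2 of the file.
disp(c.Properties.VariableNames)
disp(c.Properties.VariableUnits)
```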
  6 Comments
Adam Danz on 16 Aug 2018
That sounds kludgy. If your data are formatted as you described and match the format of the testfile.txt, the solution above should work. Maybe you're using the 'TreatAsEmpty' or 'ReadVariableNames' parameters in readtable() which may interfere with the opts input. Just be vigilant in applying this to other files.
Alex Mason on 16 Aug 2018
Yes, possibly because I notice that in the first line, the first two columns are actually empty. Well, they have [], [], but no actual text or numbers. Perhaps this is the reason.
Either way, all files are the same and I don't need to use it on a different type of file.


More Answers (1)

Jacob Hootman on 8 Oct 2018
I had the same issue. I stepped through it in the debugger several times; I believe this is a bug. Here is what I found:
Open TextImportOptions.m and go to line 211. It reads:
% Read Names
if opts.VariableNamesLine > 0 && rvn
    names = readVariableNames(parser);
else
    names = opts.SelectedVariableNames;
end
% Read Metadata
units = readVariableUnits(parser);
descr = readVariableDescriptions(parser);
The problem is that 'rvn' gets its value from a persistent variable, which means unless that parameter is specified on the first function call, it will always be false.
Change the && in the if statement to 'OR' logic (read the 'NOTES' below before doing so). Now the code will work as intended. This is what it should look like:
% Read Names
if opts.VariableNamesLine > 0 || rvn
    names = readVariableNames(parser);
else
    names = opts.SelectedVariableNames;
end
% Read Metadata
units = readVariableUnits(parser);
descr = readVariableDescriptions(parser);
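To see what the change does to the branch, here is a small sketch with stand-in values. It assumes, as described above, that rvn is false while VariableNamesLine is 3; the variables below are local stand-ins, not the ones inside TextImportOptions.m:

```matlab
variableNamesLine = 3;   % stand-in for opts.VariableNamesLine
rvn = false;             % assumed value, per the description above

readsNamesWithAnd = variableNamesLine > 0 && rvn;   % original condition
readsNamesWithOr  = variableNamesLine > 0 || rvn;   % modified condition

% Print each result as 0 (names not read from the file) or 1 (names read).
fprintf('&& reads names from the file: %d\n', double(readsNamesWithAnd));
fprintf('|| reads names from the file: %d\n', double(readsNamesWithOr));
```

With these values the && condition is false, so the else branch supplies the names, while the || condition is true and the names line is read.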
Also, I'm not sure why the programmer decided to use an 'if else' statement to decide how to get the variable names, yet only calls a function to get the units and descriptions.
NOTES: (1) Making this change requires administrative access. (2) The .m file must be edited with a non-MATLAB editor (e.g. Notepad++). (3) The change only affects your local machine, so other computers will have difficulties running your code if they do not have the same change installed. (4) Any update that MATLAB installs may revert this change.
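If editing a shipped file is not an option, one alternative (a different technique from the patch above, not something taken from TextImportOptions.m) is to read the header lines yourself and assign the names and units to the table after the read. A minimal sketch, assuming a comma-delimited file with units on line 2 and names on line 3; the file name and contents are hypothetical:

```matlab
% Hypothetical stand-in file: line 1 ignored, line 2 units, line 3 names.
filename = fullfile(tempdir, 'manual_header.csv');
fid = fopen(filename, 'w');
assert(fid ~= -1, 'Could not open %s for writing.', filename);
fprintf(fid, '%s\n', 'junk,to,ignore', 's,V,degC', 'Time,Voltage,Temp', ...
    '0.0,1.5,20.1', '0.1,1.6,20.3', '0.2,1.7,20.2');
fclose(fid);

% Read the three header lines by hand.
fid = fopen(filename, 'r');
fgetl(fid);                          % line 1: discarded
units = strsplit(fgetl(fid), ',');   % line 2: units
names = strsplit(fgetl(fid), ',');   % line 3: variable names
fclose(fid);

% Let detection handle only the data, then attach the metadata afterwards.
opts = detectImportOptions(filename, 'NumHeaderLines', 3);
T = readtable(filename, opts);
T.Properties.VariableNames = matlab.lang.makeValidName(names);
T.Properties.VariableUnits = units;

disp(T.Properties.VariableNames)
disp(T.Properties.VariableUnits)
```

This keeps the installation untouched, at the cost of assuming a plain comma delimiter with no quoted fields in the header lines.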
  7 Comments
jonas on 9 Oct 2018
Edited: jonas on 9 Oct 2018
@Guillaume: That explains it. The fact that there are several different versions is unfortunate as it becomes difficult to write complex importopts for beginners on this forum. Many times people just reply with an error message, and therefore I usually opt for something more reliable such as textscan despite readtable usually being the more practical choice for semi-complex imports.
Sorry for interrupting your discussion, I will be on my way now :)
Jacob Hootman on 28 Oct 2018
@Adam Danz I just kept stepping into every function that resulted in an error. I called the readtable function with arguments for both the fileName and the OPTS.


Release

R2017b