importdata produces strange output when reading a text file with blank cells and text

I want to import the attached file, which is the output of a MATLAB function run. As you can see, there are a couple of blank cells, plus lines of text at the beginning and end of the file. I am only interested in the numerical data, which is a 16x6 matrix for this particular txt file. The empty cells should be NaN.
importdata is able to separate the text from the numerical data. However, it thinks there are only 3 columns, because three cells in the first row are blank.
If I remove the text lines manually and call readtable(), the first three lines of the numerical data are missing for some reason.
How can I get the 16x6 matrix from the text file, with blank cells read as NaN?
Thank you!

Accepted Answer

Cris LaPierre on 1 Feb 2023
You will have to take the formatting of your file into consideration, and may have to specify some additional settings to get what you want.
  • Your file is fixed width, not delimited.
  • Not every row contains delimiters for all 6 columns.
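% Column names; 'f(x)' and 'F-count' are not valid MATLAB identifiers,
% so readtable renames them and issues the warning shown below
VariableNames = {'Index', 'exitFlag', 'f(x)', 'iter', 'F-count', 'optimality'};
% Width of each column in characters
VariableWidths = [ 8, 10, 13, 11, 9, 14 ];
opts = fixedWidthImportOptions('VariableWidths',VariableWidths,'VariableNames',VariableNames);
% The numeric block occupies lines 4 through 19 of the file
opts.DataLines = [4 19];
% Read every column as double so that blank fields become NaN
opts = setvartype(opts,"double");
data = readtable("command_window.txt",opts)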
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
data = 16×6 table
    Index    exitFlag      f_x_       iter    F_count    optimality
    _____    ________    _________    ____    _______    __________
      1        -10             NaN    NaN       14             NaN
      2          3       1.239e-09     39       40       3.484e-06
      3          3       3.028e-22     45       46       2.174e-12
      4        -10             NaN    NaN        4             NaN
      5        -10             NaN    NaN        7             NaN
      6          3       5.749e-13     17       18       5.438e-08
      7        -10             NaN    NaN        3             NaN
      8        -10             NaN    NaN        2             NaN
      9        -10             NaN    NaN        5             NaN
     10        -10             NaN    NaN        1             NaN
     11        -10             NaN    NaN        2             NaN
     12        -10             NaN    NaN        3             NaN
     13        -10             NaN    NaN        5             NaN
     14        -10             NaN    NaN        1             NaN
     15        -10             NaN    NaN        2             NaN
     16        -10             NaN    NaN        2             NaN
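Since the question asks for a numeric matrix rather than a table, here is a minimal, self-contained sketch of the same approach that ends with table2array. It does not use the attached file: it writes a hypothetical three-row sample to a temporary file, assuming right-aligned fields with widths [8 10 13 10 10 14]. The sample rows, the temporary file name, and the simplified variable names (fx, Fcount) are illustrative assumptions, not taken from the attachment.

```matlab
% Hypothetical three-row sample in a fixed-width layout (assumed widths).
% A single space stands in for a blank field.
widths = [8 10 13 10 10 14];
fmt = '%8s%10s%13s%10s%10s%14s\n';
rows = { ...
    '1', '-10', ' ',         ' ',  '14', ' ';
    '2', '3',   '1.239e-09', '39', '40', '3.484e-06';
    '3', '3',   '3.028e-22', '45', '46', '2.174e-12'};

% Write the sample to a temporary text file
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
for k = 1:size(rows, 1)
    fprintf(fid, fmt, rows{k, :});
end
fclose(fid);

% Read it back as fixed-width text, every column as double
names = {'Index', 'exitFlag', 'fx', 'iter', 'Fcount', 'optimality'};
opts = fixedWidthImportOptions('VariableWidths', widths, 'VariableNames', names);
opts = setvartype(opts, 'double');
T = readtable(fname, opts);
delete(fname);

% Convert the table to a plain numeric matrix; blank fields are NaN
M = table2array(T);
```

Using valid identifiers for the variable names also avoids the renaming warning shown above.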
  7 Comments
Cris LaPierre on 2 Feb 2023
I like to inspect text files in a text editor that can show non-visible characters (I use Notepad++).
Double-checking here, the column widths are
VariableWidths = [ 8, 10, 13, 10, 10, 14 ];
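If no such editor is at hand, the widths can also be estimated in MATLAB from one fully populated row. This is only a sketch under the assumption that every field is right-aligned, so that each column ends at the last character of its value. The sample row below is rebuilt with sprintf for illustration and is not read from the attached file.

```matlab
% Hypothetical fully populated row, right-aligned in each column
fullRow = sprintf('%8s%10s%13s%10s%10s%14s', ...
    '2', '3', '1.239e-09', '39', '40', '3.484e-06');

% Position of the last character of each field: a non-space character
% followed by whitespace or by the end of the line
fieldEnds = regexp(fullRow, '\S(?=\s|$)');

% Column widths are the distances between consecutive field ends
widths = diff([0 fieldEnds]);
```

For left-aligned or mixed alignment this estimate would be off, so it is worth comparing the result against several rows.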
dpb on 2 Feb 2023
So do I (inspect with a "real" programmer's editor), but many newbies aren't that familiar with other toolsets, so I used the MATLAB representation. I have a shareware/freeware version of the old Brief editor that works with Windows and is my comfortable tool; having used Brief almost since its introduction, the key mapping is just too ingrained for me to get used to something else.


More Answers (1)

dpb on 1 Feb 2023
>> opt=detectImportOptions('command_window.txt','ExpectedNumVariables',6,'NumHeaderLines',2,"ReadVariableNames",1,'VariableNamingRule','preserve');
>> readtable('command_window.txt',opt)
ans =
18×6 table
Index exitflag f(x) # iter F-count
_____ ________ _________ ___ ____ _________
1 -10 14 NaN NaN NaN
2 3 1.239e-09 39 40 3.484e-06
3 3 3.028e-22 45 46 2.174e-12
4 -10 4 NaN NaN NaN
5 -10 7 NaN NaN NaN
6 3 5.749e-13 17 18 5.438e-08
7 -10 3 NaN NaN NaN
8 -10 2 NaN NaN NaN
9 -10 5 NaN NaN NaN
10 -10 1 NaN NaN NaN
11 -10 2 NaN NaN NaN
12 -10 3 NaN NaN NaN
13 -10 5 NaN NaN NaN
14 -10 1 NaN NaN NaN
15 -10 2 NaN NaN NaN
16 -10 2 NaN NaN NaN
NaN NaN NaN NaN NaN NaN
3 NaN NaN 16 NaN NaN
>>
You will then need to clean up the last couple of records, after the Index sequence stops.
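One way to do that cleanup, sketched on a small hypothetical table rather than the real import: keep rows only up to the first one whose Index no longer equals its row number. The variable names and values here are made up for illustration.

```matlab
% Hypothetical table mimicking the import above: valid rows, then leftovers
Index = [1; 2; 3; NaN; 3];
Fcount = [14; 40; 46; NaN; NaN];
T = table(Index, Fcount);

% First row whose Index is not its own row number.
% NaN ~= k is true, so an all-NaN row also counts as a mismatch.
firstBad = find(T.Index ~= (1:height(T)).', 1);
if isempty(firstBad)
    firstBad = height(T) + 1;   % no leftovers found, keep everything
end

% Keep only the rows before the first mismatch
T = T(1:firstBad-1, :);
```

This relies on Index counting up from 1 without gaps, which holds for the output shown but is an assumption about other files.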
Alternatively, if you know the length of the file a priori,
>> tT=readtable('command_window.txt','Range','3:19','ExpectedNumVariables',6,'VariableNamingRule','preserve')
tT =
16×6 table
Index exitflag f(x) # iter F-count
_____ ________ _________ ___ ____ _________
1 -10 14 NaN NaN NaN
2 3 1.239e-09 39 40 3.484e-06
3 3 3.028e-22 45 46 2.174e-12
4 -10 4 NaN NaN NaN
5 -10 7 NaN NaN NaN
6 3 5.749e-13 17 18 5.438e-08
7 -10 3 NaN NaN NaN
8 -10 2 NaN NaN NaN
9 -10 5 NaN NaN NaN
10 -10 1 NaN NaN NaN
11 -10 2 NaN NaN NaN
12 -10 3 NaN NaN NaN
13 -10 5 NaN NaN NaN
14 -10 1 NaN NaN NaN
15 -10 2 NaN NaN NaN
16 -10 2 NaN NaN NaN
>>
truncates on the way in, but that means knowing the length first, which may not generally be the case.
Avoid importdata. It's intended to be easy to use, but that comes at a price: while it does something, you don't necessarily know what it's going to do with any given file.
As an alternative to the above, you can always fall back to parsing the text file line by line. Machine-generated output files are often inconsistent in their variable names, treatment of missing values, delimiters, and so on, and a generic routine can't handle every possible permutation it might find without some clues.
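A minimal sketch of such a line-by-line parse, assuming fixed-width columns with widths [8 10 13 10 10 14] and treating any line whose first field is not a number as text to skip. The sample lines, including the header and trailing text, are hypothetical stand-ins for the real file contents.

```matlab
% Hypothetical file contents: text, two data rows, a blank line, more text
widths = [8 10 13 10 10 14];
fmt = '%8s%10s%13s%10s%10s%14s';
lines = [ ...
    "Some header text"
    string(sprintf(fmt, '1', '-10', ' ', ' ', '14', ' '))
    string(sprintf(fmt, '2', '3', '1.239e-09', '39', '40', '3.484e-06'))
    ""
    "Finished: 3 of 16 runs converged."];

% Pad short lines so every column position exists on every line
padded = pad(lines, sum(widths));

% Start and stop character positions of each column
stops = cumsum(widths);
starts = stops - widths + 1;

% Slice each column out by position; blank or non-numeric text gives NaN
M = nan(numel(lines), numel(widths));
for c = 1:numel(widths)
    field = strtrim(extractBetween(padded, starts(c), stops(c)));
    M(:, c) = str2double(field);
end

% Keep only lines whose first field (the index) parsed as a number
M = M(~isnan(M(:, 1)), :);
```

Because values are taken by character position rather than by splitting on whitespace, a blank field stays in its own column. With a real file, the lines could be obtained with readlines instead of being typed in.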
  1 Comment
SA-W on 1 Feb 2023
The output produced by your code is wrong. In the first row, for instance, you have
1 -10 14 NaN NaN NaN
Correct output would be
1 -10 NaN NaN 14 NaN
This is exactly the same issue as the one caused by importdata.
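The shift can be reproduced on a single line. In this sketch the row is rebuilt with sprintf under the assumption of right-aligned fields with widths [8 10 13 10 10 14]; it is not read from the attached file.

```matlab
% Hypothetical first data row: columns 3, 4 and 6 are blank
line = sprintf('%8s%10s%13s%10s%10s%14s', '1', '-10', ' ', ' ', '14', ' ');

% Whitespace-delimited parsing collapses the blanks, so only three
% values remain and the 14 lands in third position instead of fifth
vals = sscanf(line, '%f').';
```

Any reader that splits on runs of whitespace sees three values here, which is why the column position has to come from the character offsets instead.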
