Textscan doesn't do what it's told

I have a function that reads large data sets and extracts the data, but every once in a while there is a random string in the file and textscan just stops. I told it to treat those strings as empty, but that didn't work, and when the error was reported, the line it showed didn't even include the 'TreatAsEmpty' part.
This is the error:
Error using textscan
Mismatch between file and format character vector.
Trouble reading 'Numeric' field from file (row number 14535, field number 46) ==> #IND00 13.038326 89.999989 53.000000 66.605282 359.999965
13.013328 90.000023 36.000000 210.258485 359.016463 15.170686 270.000018 48.000000 265.060631 180.778297 14.631801 270.000025 1.#QNAN0 1.#QNAN...
Error in importfile (line 13)
dataArray = textscan(fid, formatSpec, endRow(1)-startRow(1)+1, 'Delimiter', delimiter, 'MultipleDelimsAsOne', true, 'TextType', 'string',
'EmptyValue', NaN, 'HeaderLines', startRow(1)-1, 'ReturnOnError', false, 'EndOfLine', '\r\n');

Accepted Answer

Walter Roberson on 13 Jul 2017
I have made the same mistake myself, thinking that 'EmptyValue' instructed textscan to treat strings in numeric fields as if they were NaN. Instead, 'EmptyValue' tells textscan which value to substitute when it detects an empty field.
The key to this is the 'TreatAsEmpty' option.
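S = '13.013328 90.000023 36.000000 210.258485 359.016463 15.170686 270.000018 48.000000 265.060631 180.778297 14.631801 270.000025 1.#QNAN0 -1.#IND00 7.1234';
fmt = repmat('%f', 1, 15); % assumed format: one numeric field per value, matching the 15 columns in the output below
textscan(S,fmt,'TreatAsEmpty', {'1.#QNAN0', '-1.#IND00'})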
ans =
1×15 cell array
Columns 1 through 9
{[13.013328]} {[90.000023]} {[36]} {[210.258485]} {[359.016463]} {[15.170686]} {[270.000018]} {[48]} {[265.060631]}
Columns 10 through 15
{[180.778297]} {[14.631801]} {[270.000025]} {[NaN]} {[NaN]} {[7.1234]}
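To make the difference between the two options concrete, here is a minimal sketch. The comma-delimited sample text and the placeholder string 'N/A' are made up for illustration and are not taken from the file in the question. 'EmptyValue' supplies the number used for a field that really is empty, while 'TreatAsEmpty' names the text that textscan should regard as an empty field in the first place.

```matlab
% Hypothetical sample: row 1 has a truly empty second field,
% row 2 has the placeholder text 'N/A' (an assumed example string) there.
S = sprintf('1,,3\n4,N/A,6\n');

C = textscan(S, '%f%f%f', 'Delimiter', ',', ...
    'EmptyValue', NaN, ...          % value substituted for empty numeric fields
    'TreatAsEmpty', {'N/A'}, ...    % text to be read as an empty numeric field
    'CollectOutput', true);
data = C{1};                        % 2-by-3 numeric matrix
```

Both entries in the second column should come back as NaN: the first because the field is empty, the second because 'N/A' is listed in 'TreatAsEmpty'. Leaving out 'TreatAsEmpty' here would be expected to reproduce the kind of mismatch error shown in the question.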
You should watch out for 1.#INF00 and -1.#INF00, which stand for +infinity and -infinity. You could add them to the 'TreatAsEmpty' list, but then they would come out as NaN instead of as infinities.
If you need to process the #INF00 as infinities then you are going to need to read the file in as text and do text replacements before you do textscan(). You can pass a string to textscan in place of a file identifier in order to process the content of the string.
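A minimal sketch of that text-replacement approach follows. The sample text stands in for the file contents and is invented for illustration; for a real file you would get the text with something like fileread, using your own file name. It relies on textscan's %f conversion reading the literal text Inf, -Inf and NaN.

```matlab
% Hypothetical sample text standing in for the file contents.
% For a real file: txt = fileread('yourfile.txt');  ('yourfile.txt' is a placeholder)
txt = sprintf('1.5 1.#INF00 -1.#INF00\n1.#QNAN0 -1.#IND00 2.5\n');

% Replace the signed forms first so the unsigned patterns cannot match inside them.
txt = strrep(txt, '-1.#INF00', '-Inf');
txt = strrep(txt, '1.#INF00',  'Inf');
txt = strrep(txt, '-1.#IND00', 'NaN');
txt = strrep(txt, '1.#QNAN0',  'NaN');

% Pass the cleaned text to textscan in place of a file identifier.
C = textscan(txt, '%f%f%f', 'CollectOutput', true);
data = C{1};
```

With this approach the infinities survive as Inf and -Inf, and no 'TreatAsEmpty' list is needed because the odd strings are gone before textscan sees them. If your file contains other variants of these strings, they would need their own strrep calls.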
Walter Roberson on 13 Jul 2017
The error message shows that your file also has 1.#QNAN0 in it, but you are only handling -1.#IND00. My example treats both as empty; I used your actual data and showed the result of testing on it.
michael warshowsky on 13 Jul 2017
Edited: michael warshowsky on 13 Jul 2017
When I try it I still get the same error. Why could this be? When I get the error, it shows me the line and says I'm using 'EmptyValue', NaN when I'm actually using 'TreatAsEmpty', {'1.#QNAN0', '-1.#IND00'} like you said.

More Answers (1)

Star Strider on 13 Jul 2017
I cannot follow your code. However, using frewind repositions the file pointer at the beginning of the file. I would use the fseek function instead.
This is from a previous Answer (that worked) and illustrates the sort of approach I would take with a file such as yours:
fidi = fopen('yourfile.txt', 'rt'); % 'yourfile.txt' is a placeholder name
st = fseek(fidi, 1, 'bof'); % Move the file pointer 1 byte past the beginning of the file
k1 = 1; % Counter
while (st == 0) && (~feof(fidi)) % Stop at end of file or after an unsuccessful fseek
    data{k1} = textscan(fidi, '%f%f', 'Delimiter','\t', 'HeaderLines',4, 'CollectOutput',1);
    st = fseek(fidi, 50, 'cof'); % Skip ahead from where textscan stopped (50 bytes here; adjust for your file)
    k1 = k1 + 1;
end
fclose(fidi);
You will obviously have to experiment with it to get it to work with your file. I would retain the while conditions, since the combination will prevent an infinite loop if the file does not have a valid end-of-file indicator.
Star Strider on 13 Jul 2017
One option might be to add to your textscan arguments:
'TreatAsEmpty',{'-1.#IND00'}
and any other weird strings that exist in your file that you can find and define. Try that first.
If you still have problems, use your textscan format descriptor instead of mine, then see if my code works with your file.
michael warshowsky on 13 Jul 2017
If you look at the code I attached, I did in fact add that, but it never processes it.
