Textscan doesn't do what it's told

I have a function that reads large data sets and extracts the data, but every once in a while there is a random string in the file and textscan just stops. I told it to treat that string as empty, but it didn't work, and when the error showed me the failing line, the 'TreatAsEmpty' part wasn't even shown.
This is the error:
Error using textscan
Mismatch between file and format character vector.
Trouble reading 'Numeric' field from file (row number 14535, field number 46) ==> #IND00 13.038326 89.999989 53.000000 66.605282 359.999965
13.013328 90.000023 36.000000 210.258485 359.016463 15.170686 270.000018 48.000000 265.060631 180.778297 14.631801 270.000025 1.#QNAN0 1.#QNAN...
Error in importfile (line 13)
dataArray = textscan(fid, formatSpec, endRow(1)-startRow(1)+1, 'Delimiter', delimiter, 'MultipleDelimsAsOne', true, 'TextType', 'string',
'EmptyValue', NaN, 'HeaderLines', startRow(1)-1, 'ReturnOnError', false, 'EndOfLine', '\r\n');

 Accepted Answer

I have made the same mistake myself, thinking that 'EmptyValue' instructed textscan to treat strings in numeric fields as if they were NaN. Instead, 'EmptyValue' tells textscan which value to substitute when it detects an empty field.
The key to this is the 'TreatAsEmpty' option.
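S = '13.013328 90.000023 36.000000 210.258485 359.016463 15.170686 270.000018 48.000000 265.060631 180.778297 14.631801 270.000025 1.#QNAN0 -1.#IND00 7.1234';
fmt = repmat('%f', 1, 15); % fmt was not shown in the original; 15 numeric fields is assumed from the 1×15 result
textscan(S, fmt, 'TreatAsEmpty', {'1.#QNAN0', '-1.#IND00'})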
ans =
1×15 cell array
Columns 1 through 9
{[13.013328]} {[90.000023]} {[36]} {[210.258485]} {[359.016463]} {[15.170686]} {[270.000018]} {[48]} {[265.060631]}
Columns 10 through 15
{[180.778297]} {[14.631801]} {[270.000025]} {[NaN]} {[NaN]} {[7.1234]}
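To make the difference between the two options concrete, here is a minimal sketch on made-up one-line inputs. The sample strings and the variable names A and B are mine, not taken from the file in the question.

```matlab
% 'EmptyValue' only fills fields that are genuinely empty.
% Here the second field is empty (two adjacent commas), so it becomes -1.
A = textscan('1,,3', '%f %f %f', 'Delimiter', ',', 'EmptyValue', -1);

% A non-numeric token is not an empty field, so 'EmptyValue' alone would
% not help. 'TreatAsEmpty' names the token so textscan accepts it.
B = textscan('1 1.#QNAN0 3', '%f %f %f', 'TreatAsEmpty', {'1.#QNAN0'});

disp(A{2})   % the value substituted for the empty field
disp(B{2})   % the value substituted for the listed token
```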
You should watch out for 1.#INF00 and -1.#INF00, which stand for +infinity and -infinity. You could add them to the 'TreatAsEmpty' list, but then they would come out as NaN instead of as infinities.
If you need to process the #INF00 as infinities then you are going to need to read the file in as text and do text replacements before you do textscan(). You can pass a string to textscan in place of a file identifier in order to process the content of the string.
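A rough sketch of that text-replacement approach follows. The sample text stands in for the real file (with a real file you would read it with fileread), the three-column format is only for the demo, and the regular expressions are my assumption about how the special values are spelled, so check them against your own data.

```matlab
% Stand-in for the file contents; with a real file use txt = fileread(filename).
txt = sprintf('1.5 1.#INF00 -1.#INF00\n2.5 1.#QNAN0 -1.#IND00\n');

% Rewrite the infinities, keeping any leading minus sign in place.
txt = regexprep(txt, '1\.#INF0*', 'Inf');

% Rewrite the NaN variants, dropping the sign.
txt = regexprep(txt, '-?1\.#(QNAN|IND)0*', 'NaN');

% Parse the cleaned text. The format must match the real column count.
C = textscan(txt, '%f %f %f', 'CollectOutput', true);
M = C{1};   % numeric matrix with Inf, -Inf and NaN where the tokens were
```

For a 44000x80 file this holds the whole text in memory at once, which should be fine for a file of that size but is worth keeping in mind for much larger ones.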

3 Comments

If you look at the code I attached, I did in fact add something like that, but it never processes it.
The error message shows that your file has 1.#QNAN0 in it. You are only processing -1.#IND00. I show treating both as empty, and I used your actual data and showed the result of testing on it.
When I try it I still get the same error. Why could this be? When I get the error, it shows me the line and says I'm using 'EmptyValue', NaN when I'm actually using 'TreatAsEmpty', {'1.#QNAN0', '-1.#IND00'} like you said.

More Answers (1)

I cannot follow your code. However, using frewind repositions the file pointer at the beginning of the file. I would use the fseek function instead.
This is from a previous Answer (that worked) and illustrates the sort of approach I would take with a file such as yours:
st = fseek(fidi, 1, 'bof'); % Move 1 byte past the beginning of the file (fseek offsets are in bytes); st is 0 on success
k1 = 1; % Counter for the blocks read so far
while (st == 0) && (~feof(fidi)) % Stop at end of file or after an unsuccessful fseek
    data{k1} = textscan(fidi, '%f%f', 'Delimiter','\t', 'HeaderLines',4, 'CollectOutput',1); % Read one block of numeric data
    st = fseek(fidi, 50, 'cof'); % Skip 50 bytes past the point where textscan stopped (offset chosen for that earlier file)
    k1 = k1 + 1;
end
You will obviously have to experiment with it to get it to work with your file. I would retain the while conditions, since the combination will prevent an infinite loop if the file does not have a valid end-of-file indicator.

5 Comments

I'm not totally sure what this code does. There are some variables in there that I don't follow.
I don’t have your file, so I can’t experiment with it.
I don't have the file with me on this computer, but it's just a 44000x80 file of numbers with a random string in it, -1#IND00, which can come up anywhere.
One option might be to add to your textscan arguments:
'TreatAsEmpty',{'-1#IND00'}
and any other weird strings that exist in your file that you can find and define. Try that first.
If you still have problems, use your textscan format descriptor instead of mine, then see if my code works with your file.
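If it is not clear which odd strings are in the file, something like the following sketch could list them. It assumes whitespace-delimited data, and the sample text is made up to stand in for fileread on the real file. Note that a literal NaN token in the data would also show up in the list, because str2double returns NaN for it too.

```matlab
% Stand-in for the file contents; with a real file use txt = fileread(filename).
txt = sprintf('13.01 90.00 1.#QNAN0\n-1.#IND00 7.12 1.#QNAN0\n');

% Split the text into whitespace-separated tokens.
tokens = strsplit(strtrim(txt));

% str2double returns NaN for anything it cannot convert to a number.
isBad = isnan(str2double(tokens));

% Distinct tokens that did not convert; candidates for 'TreatAsEmpty'.
bad = unique(tokens(isBad));
disp(bad)
```

The resulting cell array can then be passed as the 'TreatAsEmpty' value, after checking that none of the entries are infinities you want to keep.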
If you look at the code I attached, I did in fact add that, but it never processes it.
