MATLAB Answers

Textscan with '@' as delimiter

AMM on 7 May 2020
Answered: per isakson on 13 May 2020
I'm working with an inherited script that calls TEXTSCAN as follows:
allData = textscan(fid,'%s','Delimiter','@');
What does the at-sign delimiter parameter do, and is this documented anywhere?
I don't see anything in the TEXTSCAN help for this, but when I parse the same text file with and without that parameter specified, I get different results. The input file contains no explicit at-sign characters anywhere. Is TEXTSCAN treating the @ as some special control character?

AMM on 8 May 2020
Thanks, both, for the replies.
Walter, I'm not seeing what you describe—I see effects throughout the input file, not just at the end. If I have a plain-text file that contains no at-signs in it, and I perform the TEXTSCAN call above with and without the 'Delimiter','@' parameter/value arguments, I get significantly different results:
  • with 'Delimiter','@' (trimmed for compactness):
whos allData_withDelim, allData_withDelim(1), allData_withDelim{1},
  Name                   Size    Bytes  Class    Attributes
  allData_withDelim      1x1     34684  cell
ans =
1×1 cell array
{133×1 cell}
ans =
133×1 cell array
{' 3.04 N: GNSS NAV DATA M: Mixed RINEX VERSION / TYPE'}
{'XXXXXXX XXXXX XXXX 20200101 123500 UTC PGM / RUN BY / DATE '}
...
  • without 'Delimiter','@' (similarly trimmed; note the CR/LF linebreaks in the last quoted line):
whos allData_noDelim ; allData_noDelim(1), allData_noDelim{1},
  Name                 Size    Bytes  Class    Attributes
  allData_noDelim      1x1     21488  cell
ans =
1×1 cell array
{1×1 cell}
ans =
1×1 cell array
{' 3.04 N: GNSS NAV DATA M: Mixed RINEX VERSION / TYPE←↵XXXXXXX XXXXX XXXX 20200101 123500 UTC PGM / RUN BY / DATE ←↵ ...'}
It sure seems like calling TEXTSCAN with the name-value pair 'Delimiter','@' affects its handling of line endings. In other words, it seems to treat the at-sign as a special character rather than as a literal one. (As I mentioned, this input file contains no at-signs anywhere.)
But I don't see this anywhere in the documentation, and I have no idea what's going on with TEXTSCAN "under the hood." Sorry to be obtuse, but is this possible?
Walter Roberson on 9 May 2020
Please attach your data file, and also the code you use to reproduce the problem.
The tests I have done find nothing special about using @. The effect I get when I use any character not found in the file is exactly the same as if I use
textscan(fid, '%s', 'Delimiter', '\n', 'Multiple', true)
or
textscan(fid, '%s', 'whitespace', '\n')
and the effect is:
  • each time the %s fires, skip all leading spaces and newlines
  • once the %s starts reading something non-blank, continue until the first newline
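For anyone who wants to check this without a data file, here is a minimal sketch that runs textscan on an in-memory character vector. The sample text and the variable names are made up for illustration, and '|' stands in for "any character not found in the file". If the behavior described above holds, each comparison should display 1.

```matlab
% Made-up sample text: two lines, spaces inside each line, no '@' or '|'.
txt = sprintf('alpha beta\ngamma delta');

% Delimiter characters that never occur in the text.
byAt  = textscan(txt, '%s', 'Delimiter', '@');
byBar = textscan(txt, '%s', 'Delimiter', '|');

% Newline as the delimiter, with repeated delimiters treated as one.
byNewline = textscan(txt, '%s', 'Delimiter', '\n', 'MultipleDelimsAsOne', true);

% Compare the parsed cell arrays; nothing here is specific to '@'.
disp(isequal(byAt{1}, byBar{1}))
disp(isequal(byAt{1}, byNewline{1}))
```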
AMM on 12 May 2020
Hi Walter,
Here you go. Here is what I'm seeing with the attached file:
>> fid=fopen('textscan_test.txt','rt');
>> out1=textscan(fid,'%s'); out1=out1{1}; frewind(fid);
>> out2=textscan(fid,'%s','Delimiter','@'); out2=out2{1}; frewind(fid);
>> out3=textscan(fid,'%s','whitespace','\n'); out3=out3{1}; fclose(fid);
>> whos
  Name         Size     Bytes  Class     Attributes
  ans          1x1          8  double
  fid          1x1          8  double
  out1      2700x1     351730  cell
  out2       538x1     134220  cell
  out3       538x1     134220  cell
As you can see, the attached file contains no at-signs.
Indeed, what seems to be happening is exactly what you describe: if textscan is given a delimiter that doesn't occur in the input, it behaves like the two calls you mention above and returns one line per cell.


Accepted Answer

per isakson on 13 May 2020
I've reproduced your result on R2018b. I think the result is consistent with the textscan documentation.
  • out1 is a cell array of character vectors with one whitespace-separated token per cell
  • out2 is a cell array of character vectors with one row (line) of the file per cell
Case 1. No delimiter is specified, so runs of one or more white-space characters act as the delimiter. That's the default, and it holds regardless of the value of 'MultipleDelimsAsOne'. The doc says: If you do not specify a delimiter, then the delimiter characters are the same as the white-space characters.
Case 2. '@' is the delimiter. Since no '@' occurs in the file, '%s' never hits a delimiter and matches the rest of the row: spaces no longer split fields, and each field ends at the end of the line. (I can't find a sentence in the documentation to quote. The row-oriented behavior seems to be taken for granted.)
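A minimal sketch of the two cases, assuming a small scratch file (the file name and its contents are hypothetical, created only for this example). Both scans read from the same file identifier, so the position is rewound between them.

```matlab
% Hypothetical scratch file with two lines and no '@' characters.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, 'alpha beta\ngamma delta\n');
fclose(fid);

fid = fopen(fname, 'rt');
tokens = textscan(fid, '%s');                    % Case 1: white space delimits
frewind(fid);                                    % back to the start of the file
rows   = textscan(fid, '%s', 'Delimiter', '@');  % Case 2: '@' never occurs
fclose(fid);
delete(fname);                                   % remove the scratch file

% Report how many cells each scan produced.
fprintf('%d tokens, %d rows\n', numel(tokens{1}), numel(rows{1}));
```

With this two-line sample the scan without a delimiter should report 4 tokens and the scan with '@' should report 2 rows, the same pattern as the 2700-versus-538 split in the comments.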

Release

R2020a