Is it possible to use two textscan statements in a row?

Hello,
I am trying to extract some data from a text file.
My text file contains several lines written this way:
I want to extract the parameters after the colons.
Here's my code:
It is supposed to return C1 = "Mars 2022" and C2 = "16:17:50".
I get the following error message:
It seems like I cannot use two textscan statements in a row. In the workspace, C1 = "Mars 2022" but C2 is an empty cell.
Two additional questions:
  • How can I handle the empty string after 'nom'? I would like C3 = "".
  • 'affaire' occurs multiple times in my text file, but I'm only interested in the first occurrence. How can I get just that one?
Any help would be greatly appreciated,
Gwendal
  2 Comments
Stephen23 on 28 Mar 2022
@Gwendal Marrec: please upload a sample file by clicking the paperclip button.
Gwendal Marrec on 28 Mar 2022
I've uploaded a copy of the text file I'm working on as well as the lines of code I'm trying to use.


Accepted Answer

Stephen23 on 28 Mar 2022
Edited: Stephen23 on 28 Mar 2022
"Is it possible to use two textscan statements in a row?"
Yes, you can call TEXTSCAN as many times as you want on one file. I have often used this to read blocks of data separated by intermediate "header" lines. It is quite handy, but probably not the best approach for your data file. A regular expression is simpler here:
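str = fileread('textscan_error.txt'); % read the whole file into one char vector
% Text between the first "INFORMATION {" and its closing brace:
tmp = regexpi(str,'(?<=INFORMATION\s*\{\s*)[^\}]+','once','match');
% Capture each key and its (possibly empty) double-quoted value:
tkn = regexp(tmp,'(\w+)\s*:\s*"([^"]*)','tokens');
tkn = vertcat(tkn{:}).'; % 2xN cell: keys in row 1, values in row 2
out = struct(tkn{:})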
out = struct with fields:
          date: 'Mars 2022'
         heure: '16:17:50'
           nom: ''
       affaire: 'stage'
    protection: '_'
out.heure
ans = '16:17:50'
out.affaire
ans = 'stage'
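To illustrate how this handles the two follow-up questions (the empty value after 'nom' and taking only the first 'affaire'), here is a minimal self-contained sketch. The sample text is made up for the example and is only an assumption about what the real file looks like. The 'once' option makes regexpi return only the first INFORMATION block, and the `[^"]*` token is allowed to match nothing, so the empty string is kept.

```matlab
% Hypothetical file contents, assumed for illustration only.
str = sprintf('%s\n', ...
    'INFORMATION {', ...
    '  date : "Mars 2022"', ...
    '  heure : "16:17:50"', ...
    '  nom : ""', ...
    '  affaire : "stage"', ...
    '}', ...
    'INFORMATION {', ...
    '  affaire : "autre"', ...
    '}');

% 'once' returns only the first match, so the second block is ignored.
tmp = regexpi(str,'(?<=INFORMATION\s*\{\s*)[^\}]+','once','match');

% Each token pair is {key, value}; the value may be empty.
tkn = regexp(tmp,'(\w+)\s*:\s*"([^"]*)','tokens');
tkn = vertcat(tkn{:}).';   % 2xN cell, expands to name,value,name,value,...
out = struct(tkn{:});

disp(out)
```

Because every value here is a character vector, struct(tkn{:}) creates a scalar structure. If a value were itself a cell array, struct would create a structure array instead.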
If you really want to use TEXTSCAN:
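opt = {}; % optional TEXTSCAN name-value arguments (none needed here)
fmt = '%s:%q'; % key, literal colon, then a value that may be double-quoted
[fid,msg] = fopen('textscan_error.txt','rt');
assert(fid>0,msg)
str = 'X';
% Skip lines until the one that starts the INFORMATION block.
% The ISCHAR check stops the loop if FGETL reaches the end of the file.
while ischar(str) && ~startsWith(str,'INFORMATION')
    str = fgetl(fid);
end
tmp = textscan(fid,fmt,opt{:});
fclose(fid);
tmp{:}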
ans = 6×1 cell array
    {'date'      }
    {'heure'     }
    {'nom'       }
    {'affaire'   }
    {'protection'}
    {'}'         }
ans = 5×1 cell array
    {'Mars 2022'}
    {'16:17:50' }
    {0×0 char   }
    {'stage'    }
    {'_'        }
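As for the original question, the following is a minimal sketch of several TEXTSCAN calls in a row on one file identifier, reading numeric blocks separated by header lines. The file name, headers and numbers are invented for the example. It relies on TEXTSCAN resuming at the point where the previous call on the same file identifier stopped.

```matlab
% Write a small two-block sample file (contents made up for this sketch).
fname = [tempname '.txt'];
fid = fopen(fname,'wt');
fprintf(fid,'BLOCK A\n1 2\n3 4\nBLOCK B\n5 6\n');
fclose(fid);

[fid,msg] = fopen(fname,'rt');
assert(fid>=3,msg)
hdrA = textscan(fid,'%s',1,'Delimiter','\n'); % read one header line
numA = textscan(fid,'%f%f');                  % stops at the non-numeric text
hdrB = textscan(fid,'%s',1,'Delimiter','\n'); % resumes at the next header
numB = textscan(fid,'%f%f');                  % reads to the end of the file
fclose(fid);
delete(fname);

A = [numA{:}];   % numeric matrix of the first block
B = [numB{:}];   % numeric matrix of the second block
```

Each numeric call stops when it meets text it cannot convert, and the next call picks up from there, so no explicit repositioning is needed between calls.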
  1 Comment
Gwendal Marrec on 30 Mar 2022
Thank you very much! Your first solution helped me a lot. I've modified it a bit so that it now fits my real data file perfectly.
Have a nice day,
Gwendal

Release

R2020b