Reading a set of numeric values from 100s of .txt files inside a folder

Question

Wander11 on 2 Aug 2022

https://se.mathworks.com/matlabcentral/answers/1772895-reading-a-set-of-numeric-values-from-100s-of-txt-files-inside-a-folder

Edited: dpb on 3 Aug 2022

I have a folder named SimResults containing hundreds of .txt files. File i has a name of the format "val1_x(i)_val2_y(i)_val3_z(i).txt", where the values x, y and z vary from file to file. Somewhere inside file i is the text below:

Frame 98 Finished!

Layer 1: DL n_bits = 823200. DL BER = 1.09e-05

Frame 99 Finished!

Layer 1: DL n_bits = 831600. DL BER = 1.08e-05

Frame 100 Finished!

Layer 1: DL n_bits = 840000. DL BER = 1.07e-05

I want to extract the data from the line after "Frame 100 Finished!" in every .txt file. In effect, for text file i, I should obtain the following set of values:

val1(i) = x(i)

val2(i) = y(i)

val3(i) = z(i)

DL_n_bits(i) = 840000

DL_BER(i) = 1.07e-05

Can someone help me sequentially do this for all the txt files and save that data?

2 Comments

dpb on 2 Aug 2022

Are x,y,z numeric or text?

Wander11 on 3 Aug 2022

Numeric


Answer 1

Walter Roberson on 3 Aug 2022


foldername = 'SimResults';
dinfo = dir( fullfile(foldername, '*.txt'));
filenames = {dinfo.name};
nfiles = length(filenames);
val1 = zeros(nfiles,1);
val2 = zeros(nfiles,1);
val3 = zeros(nfiles,1);
DL_n_bits = zeros(nfiles,1);
DL_BER = zeros(nfiles,1);
for K = 1 : nfiles
    thisfilename = filenames{K};
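    [~, basename] = fileparts(thisfilename);   % drop ".txt" so the last token is purely numeric
    parts = regexp(basename, '_', 'split');    % {'val1', x, 'val2', y, 'val3', z}
    x = str2double(parts{2});
    y = str2double(parts{4});
    z = str2double(parts{6});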
    S = fileread( fullfile(foldername, thisfilename) );
    % named tokens: bits = the digits after "DL n_bits = ", BER = the non-blank text after "BER = "
    info = regexp(S, 'Frame 100 Finished!.*?DL n_bits = (?<bits>\d+).*?BER = (?<BER>\S+)', 'once', 'names');
    bits = str2double(info.bits);
    BER = str2double(info.BER);
    val1(K) = x;
    val2(K) = y;
    val3(K) = z;
    DL_n_bits(K) = bits;
    DL_BER(K) = BER;
end
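The question also asks to save the data, which the loop above does not do. A minimal sketch of one way to do that, assuming the five column vectors from the loop are in the workspace: gather them into a table and write it to a CSV file. The numbers below are made-up stand-ins so the snippet runs on its own, and the output file name is hypothetical.

```matlab
% Stand-in values (made up for illustration); in practice these are the
% vectors filled in by the loop over the files.
val1 = [1; 2];
val2 = [10; 20];
val3 = [0.5; 0.25];
DL_n_bits = [840000; 840000];
DL_BER = [1.07e-05; 2.30e-05];

% One row per file, one column per extracted quantity.
results = table(val1, val2, val3, DL_n_bits, DL_BER);

% Hypothetical output location; any writable path would do.
outfile = fullfile(tempdir, 'SimResults_summary.csv');
writetable(results, outfile);

% Read the file back to confirm it round-trips.
check = readtable(outfile);
disp(check)
```

A table keeps the values from one file together in a row, and the variable names become the CSV header. If only MATLAB will read the results, save('results.mat', 'results') is an alternative.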

4 Comments

Wander11 on 3 Aug 2022

Thank you!

Walter Roberson on 3 Aug 2022

This code presumes that the bit count is an integer and that the period after it is punctuation for human readers, not a decimal point.
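To see that assumption in action, here is a small sketch that runs a pattern of this shape on the sample text from the question. The bits group is closed right after \d+, so the trailing period is left out of the captured number. The isempty guard is an addition for files that never reach frame 100; filling in NaN there is an assumption about what is wanted, not part of the answer.

```matlab
% Sample text copied from the question, joined with newlines as in a log file.
S = sprintf(['Frame 99 Finished!\nLayer 1: DL n_bits = 831600. DL BER = 1.08e-05\n', ...
             'Frame 100 Finished!\nLayer 1: DL n_bits = 840000. DL BER = 1.07e-05\n']);

% \d+ stops at the period, so "840000." is captured as "840000".
% In MATLAB regexp the dot matches newlines by default, so .*? can cross lines.
expr = 'Frame 100 Finished!.*?DL n_bits = (?<bits>\d+).*?BER = (?<BER>\S+)';
info = regexp(S, expr, 'once', 'names');

if isempty(info)
    % No "Frame 100 Finished!" block in this text: mark the values as missing.
    bits = NaN;
    BER = NaN;
else
    bits = str2double(info.bits);
    BER = str2double(info.BER);
end
fprintf('bits = %d, BER = %g\n', bits, BER);
```

If the bit count could ever be fractional, the bits group would need a different pattern, and the period could no longer be treated as punctuation.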


Answer 2

dpb on 3 Aug 2022


Edited: dpb on 3 Aug 2022


Alternatively, just as an experiment, I wonder how it would work using some of the more recently introduced features:

foldername = 'SimResults';
dinfo = dir( fullfile(foldername, '*.txt'));
filenames = {dinfo.name};
nfiles = length(filenames);
% here, since we've got the full list of filenames, I'd be tempted to go
% ahead and scan it now for the vals array --
% with the new-fangled string functions (are they as quick as a regexp expression?)
pat="_"+digitsPattern;                                          % to isolate the x,y,z
vals=str2double(extractAfter(extract(filenames,pat),'_'));      % and convert those to numeric
% alternatively, with the old standby -- although it hasn't been internally vectorized
% (filenames(:) makes the list a column so the result is nfiles-by-3, one row per file)
fmt1='val1_%d_val2_%d_val3_%d.txt';
vals=double(cell2mat(cellfun(@(s) cell2mat(textscan(s,fmt1)),filenames(:),'UniformOutput',0)));
% Can try the above on real dataset; with toy set of 10 or so sample
% filenames here, there was no discernible timing difference.
% allocate for the others that have to read files for...
DL_n_bits = zeros(nfiles,1);
DL_BER = zeros(nfiles,1);
fmt2='Layer 1: DL n_bits = %f DL BER = %f';
for K = 1:nfiles
  S=readlines(fullfile(foldername,filenames{K}));
  ix=find(startsWith(S,'Frame 100 Finished!'))+1;
  data=cell2mat(textscan(S(ix),fmt2));   % separate name so the filename values in vals are not overwritten
  DL_n_bits(K) = data(1);
  DL_BER(K) = data(2);
end

I wonder whether it's any quicker to find the particular line and parse it than to have regexp search the whole file as one really long character string, or how much more overhead the string array introduces instead?
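One way to find out is to time both approaches on the same file with timeit. The sketch below builds a synthetic 100-frame log in a temporary file, so its layout (status line directly followed by the data line) and its numbers are assumptions, and viaRegexp, viaLines and nextLine are hypothetical helper names. It uses sscanf rather than textscan for the line parse, and readlines needs R2020b or later. Timings on real files may well differ.

```matlab
% Build a synthetic log file with 100 frames (assumed layout, made-up numbers).
tmpfile = [tempname '.txt'];
fid = fopen(tmpfile, 'w');
for k = 1:100
    fprintf(fid, 'Frame %d Finished!\n', k);
    fprintf(fid, 'Layer 1: DL n_bits = %d. DL BER = %.2e\n', 8400*k, 1.07e-05);
end
fclose(fid);

% Approach 1: read the whole file as one char vector and regexp it.
expr = 'Frame 100 Finished!.*?DL n_bits = (?<bits>\d+).*?BER = (?<BER>\S+)';
viaRegexp = @() regexp(fileread(tmpfile), expr, 'once', 'names');

% Approach 2: read as a string array, pick the line after the marker, parse it.
nextLine = @(L) L(find(startsWith(L, 'Frame 100 Finished!'), 1) + 1);
viaLines = @() sscanf(nextLine(readlines(tmpfile)), ...
                      'Layer 1: DL n_bits = %f DL BER = %f');

% Run each once to check that they agree, then time them.
r = viaRegexp();
v = viaLines();
tRegexp = timeit(viaRegexp);
tLines = timeit(viaLines);
fprintf('regexp on whole file: %.3g s, line lookup: %.3g s\n', tRegexp, tLines);

delete(tmpfile);   % clean up the temporary file
```

Both timings include the file read, which probably dominates for small files, so the comparison is more telling on logs of realistic size.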
