Read from text file between header and footer

Is there an easy way to read numerical data from a text file, below a 3-line header and above a footer that starts with a sequence of asterisks? The data is in 3 complete columns of unknown length. The delimiter and the asterisks can be modified if necessary.
I tried readtable, but the footer causes problems, and there doesn't appear to be an option for it in detectImportOptions.
See the attached example data file:
ld dim real
ratio ht speed
[] [s] [m/s]
5.0000000e-01 3.4497665e+02 2.1712214e+03
1.0000000e+00 3.8184070e+02 2.3923590e+03
1.5000000e+00 3.9916530e+02 2.4919967e+03
2.0000000e+00 4.0763144e+02 2.5342191e+03
2.5000000e+00 4.1118036e+02 2.5473084e+03
...
**********************************************
2.8682373e+00 4.1181815e+02
2.6080322e+00 2.5476604e+03

Accepted Answer

Star Strider on 16 Oct 2023
Edited: Star Strider on 16 Oct 2023
One option is textscan
type('data.txt')
ld dim real
ratio ht speed
[] [s] [m/s]
5.0000000e-01 3.4497665e+02 2.1712214e+03
1.0000000e+00 3.8184070e+02 2.3923590e+03
1.5000000e+00 3.9916530e+02 2.4919967e+03
2.0000000e+00 4.0763144e+02 2.5342191e+03
2.5000000e+00 4.1118036e+02 2.5473084e+03
3.0000000e+00 4.1174642e+02 2.5434127e+03
3.5000000e+00 4.1034564e+02 2.5274253e+03
4.0000000e+00 4.0753190e+02 2.5019786e+03
4.5000000e+00 4.0361995e+02 2.4695291e+03
5.0000000e+00 3.9881033e+02 2.4323257e+03
5.5000000e+00 3.9323845e+02 2.3920574e+03
6.0000000e+00 3.8701946e+02 2.3499201e+03
6.5000000e+00 3.8029476e+02 2.3068784e+03
7.0000000e+00 3.7327720e+02 2.2638295e+03
7.5000000e+00 3.6629421e+02 2.2215880e+03
8.0000000e+00 3.5957346e+02 2.1807905e+03
8.5000000e+00 3.5315798e+02 2.1418418e+03
9.0000000e+00 3.4706390e+02 2.1049334e+03
9.5000000e+00 3.4130076e+02 2.0701014e+03
1.0000000e+01 3.3586329e+02 2.0372850e+03
1.0500000e+01 3.3073554e+02 2.0063724e+03
1.1000000e+01 3.2589619e+02 1.9772286e+03
**********************************************
2.8682373e+00 4.1181815e+02
2.6080322e+00 2.5476604e+03
fidi = fopen('data.txt','rt')
fidi = 3
C = textscan(fidi, '%f%f%f', 'HeaderLines',3, 'CollectOutput',1)
C = 1×1 cell array
{22×3 double}
fclose(fidi);
A = cell2mat(C)
A = 22×3
   1.0e+03 *
    0.0005    0.3450    2.1712
    0.0010    0.3818    2.3924
    0.0015    0.3992    2.4920
    0.0020    0.4076    2.5342
    0.0025    0.4112    2.5473
    0.0030    0.4117    2.5434
    0.0035    0.4103    2.5274
    0.0040    0.4075    2.5020
    0.0045    0.4036    2.4695
    0.0050    0.3988    2.4323
figure
plot(A(:,1), A(:,[2 3]))
grid
textscan stops automatically at the line of asterisks. If you also want the last two lines, you can restart the read from that point.
EDIT — Forgot fclose call, now added.
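For restarting after the asterisks, here is a minimal line-by-line sketch that does not depend on where textscan leaves the file position. It is an alternative to the textscan call above, not the code from this answer. The file name sample_data.txt and the variable names are made up for illustration, and the sample values are copied from the question.

```matlab
% Write a small sample file in the same layout as the question's file.
fname = fullfile(tempdir, 'sample_data.txt');   % hypothetical file name
fid = fopen(fname, 'wt');
fprintf(fid, 'ld dim real\nratio ht speed\n[] [s] [m/s]\n');
fprintf(fid, '%.7e %.7e %.7e\n', [0.5 344.97665 2171.2214; 1.0 381.84070 2392.3590].');
fprintf(fid, '**********************************************\n');
fprintf(fid, '2.8682373e+00 4.1181815e+02\n2.6080322e+00 2.5476604e+03\n');
fclose(fid);

fid = fopen(fname, 'rt');
for k = 1:3
    fgetl(fid);                          % discard the 3 header lines
end

% Data block: read until the first line that starts with '*' (or end of file).
A = zeros(0, 3);
tline = fgetl(fid);
while ischar(tline) && ~strncmp(strtrim(tline), '*', 1)
    vals = sscanf(tline, '%f');          % numbers found on this line
    if numel(vals) == 3
        A(end+1, :) = vals.';            %#ok<AGROW>
    end
    tline = fgetl(fid);
end

% Footer: keep the remaining lines as text, since a column may be blank.
footer = {};
tline = fgetl(fid);
while ischar(tline)
    footer{end+1} = tline;               %#ok<AGROW>
    tline = fgetl(fid);
end
fclose(fid);
```

The footer is kept as text here because a blank column cannot be recovered once a line is split on whitespace; that needs a fixed-width parse.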
  4 Comments
Brantosaurus on 16 Oct 2023
That was very kind of you.
It would perhaps be better to insert NaNs for blanks.
I can see you are checking for the end of file as things proceed through the 'header' and 'footer', breaking when an empty matrix appears.
One more question, if I may.
In layman's terms, what exactly is fseek(fidi, 0, 0) doing?
Star Strider on 16 Oct 2023
Thank you.
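The fseek(fidi, 0, 0) call seeks 0 bytes relative to the current position (an origin of 0 means 'cof', the current position in the file), so it re-establishes the file position at the point where textscan stopped on the non-numeric text. The next read then continues from that point.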
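To see what the fseek arguments mean, here is a small sketch on a throwaway file (the file name and variable names are hypothetical). In fseek(fid, offset, origin), an origin of 0 (or 'cof') is the current position, so an offset of 0 leaves ftell unchanged, while 'bof' rewinds to the start.

```matlab
% Create a tiny throwaway file to experiment with.
fname = fullfile(tempdir, 'fseek_demo.txt');   % hypothetical file name
fid = fopen(fname, 'wt');
fprintf(fid, 'abc\ndef\n');
fclose(fid);

fid = fopen(fname, 'rt');
fgetl(fid);                        % read the first line, advancing the position
posBefore = ftell(fid);            % byte position after the first line

status = fseek(fid, 0, 0);         % offset 0 from the current position ('cof')
posAfter = ftell(fid);             % same position as before the call

fseek(fid, 0, 'bof');              % offset 0 from the beginning of the file
posStart = ftell(fid);             % back at byte 0
fclose(fid);
```

fseek returns 0 on success and -1 on failure, so status can be checked before continuing to read.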
If you have more of these files, use readtable with fixedWidthImportOptions, as I did here, in the last part of my previous Comment. That includes inserting the NaN value in the blanks.
Reprising that section here —
opts = fixedWidthImportOptions('NumVariables',3, 'VariableWidths',[15 16 16], 'DataLines',[4 25; 27 28]);
T1 = readtable('data.txt', opts)
T1 = 24×3 table
          Var1                 Var2                 Var3
    _________________    _________________    _________________
    {'5.0000000e-01'}    {'3.4497665e+02'}    {'2.1712214e+03'}
    {'1.0000000e+00'}    {'3.8184070e+02'}    {'2.3923590e+03'}
    {'1.5000000e+00'}    {'3.9916530e+02'}    {'2.4919967e+03'}
    {'2.0000000e+00'}    {'4.0763144e+02'}    {'2.5342191e+03'}
    {'2.5000000e+00'}    {'4.1118036e+02'}    {'2.5473084e+03'}
    {'3.0000000e+00'}    {'4.1174642e+02'}    {'2.5434127e+03'}
    {'3.5000000e+00'}    {'4.1034564e+02'}    {'2.5274253e+03'}
    {'4.0000000e+00'}    {'4.0753190e+02'}    {'2.5019786e+03'}
    {'4.5000000e+00'}    {'4.0361995e+02'}    {'2.4695291e+03'}
    {'5.0000000e+00'}    {'3.9881033e+02'}    {'2.4323257e+03'}
    {'5.5000000e+00'}    {'3.9323845e+02'}    {'2.3920574e+03'}
    {'6.0000000e+00'}    {'3.8701946e+02'}    {'2.3499201e+03'}
    {'6.5000000e+00'}    {'3.8029476e+02'}    {'2.3068784e+03'}
    {'7.0000000e+00'}    {'3.7327720e+02'}    {'2.2638295e+03'}
    {'7.5000000e+00'}    {'3.6629421e+02'}    {'2.2215880e+03'}
    {'8.0000000e+00'}    {'3.5957346e+02'}    {'2.1807905e+03'}
T1(end-1:end,:)
ans = 2×3 table
          Var1                 Var2                 Var3
    _________________    _________________    _________________
    {'2.8682373e+00'}    {'4.1181815e+02'}    {0×0 char       }
    {'2.6080322e+00'}    {0×0 char       }    {'2.5476604e+03'}
% missing = varfun(@ismissing, T1);
% missing(end-1:end,:)
T2 = varfun(@(x)fillmissing(x,'constant',{'NaN'}), T1)
T2 = 24×3 table
        Fun_Var1             Fun_Var2             Fun_Var3
    _________________    _________________    _________________
    {'5.0000000e-01'}    {'3.4497665e+02'}    {'2.1712214e+03'}
    {'1.0000000e+00'}    {'3.8184070e+02'}    {'2.3923590e+03'}
    {'1.5000000e+00'}    {'3.9916530e+02'}    {'2.4919967e+03'}
    {'2.0000000e+00'}    {'4.0763144e+02'}    {'2.5342191e+03'}
    {'2.5000000e+00'}    {'4.1118036e+02'}    {'2.5473084e+03'}
    {'3.0000000e+00'}    {'4.1174642e+02'}    {'2.5434127e+03'}
    {'3.5000000e+00'}    {'4.1034564e+02'}    {'2.5274253e+03'}
    {'4.0000000e+00'}    {'4.0753190e+02'}    {'2.5019786e+03'}
    {'4.5000000e+00'}    {'4.0361995e+02'}    {'2.4695291e+03'}
    {'5.0000000e+00'}    {'3.9881033e+02'}    {'2.4323257e+03'}
    {'5.5000000e+00'}    {'3.9323845e+02'}    {'2.3920574e+03'}
    {'6.0000000e+00'}    {'3.8701946e+02'}    {'2.3499201e+03'}
    {'6.5000000e+00'}    {'3.8029476e+02'}    {'2.3068784e+03'}
    {'7.0000000e+00'}    {'3.7327720e+02'}    {'2.2638295e+03'}
    {'7.5000000e+00'}    {'3.6629421e+02'}    {'2.2215880e+03'}
    {'8.0000000e+00'}    {'3.5957346e+02'}    {'2.1807905e+03'}
T2(end-1:end,:)
ans = 2×3 table
        Fun_Var1             Fun_Var2             Fun_Var3
    _________________    _________________    _________________
    {'2.8682373e+00'}    {'4.1181815e+02'}    {'NaN'          }
    {'2.6080322e+00'}    {'NaN'          }    {'2.5476604e+03'}
format longG
A = str2double(table2array(T2))
A = 24×3
   1.0e+00 *
          0.5    344.97665    2171.2214
            1     381.8407     2392.359
          1.5     399.1653    2491.9967
            2    407.63144    2534.2191
          2.5    411.18036    2547.3084
            3    411.74642    2543.4127
          3.5    410.34564    2527.4253
            4     407.5319    2501.9786
          4.5    403.61995    2469.5291
            5    398.81033    2432.3257
A(end-1:end,:)
ans = 2×3
   1.0e+00 *
    2.8682373    411.81815          NaN
    2.6080322          NaN    2547.6604
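A side note on the conversion step, as a small sketch (the variable names are illustrative): str2double returns NaN for any text it cannot parse, including an empty character vector, so blank fields end up as NaN in the numeric array.

```matlab
% Cell array of char vectors with one blank entry, like the footer rows of T1.
row = {'2.8682373e+00', '4.1181815e+02', ''};

v = str2double(row);        % unparseable text, including '', becomes NaN
blankMask = isnan(v);       % logical mask marking the blank column
```

The mask can be used afterwards to tell which column was blank in each footer row.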
You would need to manually keep track of the number of lines after the asterisks line (here the last two), and make appropriate changes to the 'NumVariables', 'VariableWidths', and 'DataLines' name-value pairs in the fixedWidthImportOptions call; however, that should be straightforward.
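If counting the lines by hand becomes tedious, the 'DataLines' ranges could be derived from the position of the asterisks line. This is only a sketch under the assumption that the header always has 3 lines and the footer runs to the end of the file; the file name and variable names are hypothetical, and the sample values are copied from the question.

```matlab
% Build a small sample file in the same layout as the question's file.
fname = fullfile(tempdir, 'sample_footer.txt');   % hypothetical file name
fid = fopen(fname, 'wt');
fprintf(fid, 'ld dim real\nratio ht speed\n[] [s] [m/s]\n');
fprintf(fid, '5.0000000e-01 3.4497665e+02 2.1712214e+03\n');
fprintf(fid, '1.0000000e+00 3.8184070e+02 2.3923590e+03\n');
fprintf(fid, '**********************************************\n');
fprintf(fid, '2.8682373e+00 4.1181815e+02\n');
fprintf(fid, '2.6080322e+00 2.5476604e+03\n');
fclose(fid);

nHeader = 3;                                     % assumed fixed header length
txt = fileread(fname);
fileLines = regexp(txt, '\r?\n', 'split');       % one cell per line
if isempty(fileLines{end})
    fileLines(end) = [];                         % drop the empty piece after the final newline
end

% Index of the first line that starts with '*'.
starIdx = find(strncmp(strtrim(fileLines), '*', 1), 1);

% Row 1: data block. Row 2: footer lines after the asterisks.
dataLines = [nHeader + 1, starIdx - 1; starIdx + 1, numel(fileLines)];
```

For the file in the question (3 header lines, 22 data rows, the asterisks on line 26, and 2 footer lines) the same computation gives [4 25; 27 28], which matches the 'DataLines' value used above.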
I no longer have access to R2017a, and while there used to be online documentation for a few previous releases, that no longer appears to be an option. Everything I have listed here should be available in, and compatible with, R2017a; however, I cannot check to be certain.

Release

R2017a
