reading only a portion of a text file

Jeff
Jeff on 5 Apr 2014
Edited: Jeff on 6 Apr 2014
I would like to read in a text file that contains a header and a footer, where the number of header/footer rows and the number of rows to read can vary.
Here is an example of a row of data I would like to read from the text file.
All rows I want start with ARR++ and are delimited with ':'. I know I most likely need to use fopen / fprintf / fgetl / textscan, but I am hoping someone can help set this up.
One other thing about the rows of data I would like to read in: there is date information like 2013320133. Ideally I would like to read only the first 5 digits of that date and split the year and quarter into separate columns, e.g. 20133 --> 2013 3.
Here is a better example of the type of file I would like to read in. I am only interested in the ARR++ lines, and I would like to keep only the first 5 digits of 2013420134. Thanks a lot.
UNA:+.? '
UNB+UNC:140305:1444++'
UNH+:2:1:E6'
BGM+74'
NA+Z02+'
NAD+M+50'
ND+MS+C2'
STS+3+7'
DM+242:20144:203'
GI+AR3'
GS+1:::-'
ARR++Q:S:C:A:1N:2013420134:708:1234.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:12234.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:133234.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:132234.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:123334.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:1232134.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:123324.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:123234.323:A:N'
UNT+16+'
UNZ+1+I3800'
  5 Comments
Image Analyst
Image Analyst on 5 Apr 2014
In the past 6 hours, have you at least given fgetl() or textscan() a try? Or do you really, really need us to do it 100% for you?
Jeff
Jeff on 5 Apr 2014
Edited: Jeff on 5 Apr 2014
Image, are you OK today? Listen, if you are annoyed by beginner questions, then don't bother posting anything.


Accepted Answer

Jeff
Jeff on 6 Apr 2014
% Pass 1: keep only the lines that contain 'ARR++'
fid = fopen('mydata.txt', 'r');
tline = fgetl(fid);
k = 1;
while ischar(tline)
    disp(tline)
    findrows = strfind(tline, 'ARR++');
    if ~isempty(findrows)
        Data{k,:} = tline(:);
        k = k + 1;
    end
    tline = fgetl(fid);
end
fclose(fid);

filename = 'mydata_1.txt';
% cell2text is not a built-in function; it converts the cell array back to
% a text file (found here: cell array to text file).
cell2text(filename, Data);

% Pass 2: split the kept lines into columns
% Open file
fid = fopen(filename, 'r');
% Figure out how many columns there are
firstline = fgetl(fid);
ncol = 1 + sum(firstline == ':');
% Reset to beginning of file
frewind(fid);
% Read data
data = textscan(fid, repmat('%s',1,ncol), 'Delimiter', ':', 'CollectOutput', 1);
data = data{:,:};
data = data(:, [3:7 9:12 14 16]);
[m, n] = size(data);
% Keep only the first 5 digits of the date in column 9
for i = 1:m
    data{i,9} = data{i,9}(1:5);
end
% Close file
fclose(fid);
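A possible variation, not part of the answer above: textscan also accepts a character vector, so the intermediate file and cell2text might be skipped. This is only a sketch. It assumes the ARR++ lines have already been collected as row character vectors in a cell array (here a hypothetical two-line sample named arrLines), and that the date sits in field 6, as in the sample from the question.

```matlab
% Hypothetical sample: two ARR++ lines as they would be returned by fgetl
arrLines = { ...
    'ARR++Q:S:C:A:1N:2013420134:708:1234.323:A:N''', ...
    'ARR++Q:S:C:A:1N:2013420134:708:12234.323:A:N'''};

% Number of colon-delimited fields, taken from the first line
ncol = 1 + sum(arrLines{1} == ':');

% Join the lines with newlines and let textscan parse the text directly
joined = strjoin(arrLines, sprintf('\n'));
C = textscan(joined, repmat('%s', 1, ncol), 'Delimiter', ':', 'CollectOutput', 1);
rows = C{1};   % one row per line, one column per field

% Keep only the first 5 characters of the 10-digit date (field 6 is an assumption)
rows(:, 6) = cellfun(@(s) s(1:5), rows(:, 6), 'UniformOutput', false);
disp(rows(:, 6))
```

The trailing quote that ends each segment stays attached to the last field; it would need to be stripped separately if that column is used.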

More Answers (2)

Image Analyst
Image Analyst on 5 Apr 2014
OK Jeff I did it for you. It just took a couple of minutes. I copied the data you gave to a test.dat file. Then I wrote code to read it in using fgetl() and search for lines that start with "ARR++Q:S:C:A:1N:" based on code I got in the help for fgetl. Then I extracted the 5 numerical characters from the string and converted it to a double number. Here is the code for you:
fid = fopen('test.dat');
tline = fgetl(fid);
prefix = 'ARR++Q:S:C:A:1N:';
k = 1; % Counter for lines that are valid.
while ischar(tline)
    disp(tline)
    % Keep only lines that start with the prefix.
    if strncmp(tline, prefix, length(prefix))
        % The 5 characters right after the 16-character prefix.
        subString = tline(17:21);
        output(k) = str2double(subString);
        k = k + 1;
    end
    tline = fgetl(fid);
end
fclose(fid);
% Print output to command window:
output
Results in the command window:
output =
20134 20134 20134 20134 20134 20134 20134 20134
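The question also asks for the year and quarter in separate columns. A minimal sketch of that step, assuming the values are 5-digit numbers of the form YYYYQ like the output above. The sample vector and the names yr, qtr and yearQuarter are made up for illustration.

```matlab
% Illustrative values only: year and quarter packed as YYYYQ
output = [20134 20134 20133];

% Integer division by 10 drops the last digit (the quarter) ...
yr = floor(output / 10);
% ... and the remainder modulo 10 keeps only that last digit.
qtr = mod(output, 10);

% One row per record: [year quarter]
yearQuarter = [yr(:) qtr(:)];
disp(yearQuarter)
```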
  4 Comments
Image Analyst
Image Analyst on 5 Apr 2014
Just add a line
outputStrings{k} = tline;
to save the entire line also.
Jeff
Jeff on 6 Apr 2014
Edited: Jeff on 6 Apr 2014
Hi again Image,
I have to admit I wasn't successful using your code; I'm not sure why, but I don't get any output returned.
Anyway, you forced me to keep playing around, and one day later :) I think I figured out a nice solution. My goal (for no particular reason other than learning) was to create a generic function that trims the header and footer from any file, takes the remaining rows, and then splits them "text-to-columns" style, keeping only the columns needed and replacing the 10-digit date with only its first 5 digits.
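A rough sketch of what such a generic function could look like, reading the file once and splitting each kept line with strsplit. The name readTaggedRows and all of its arguments are hypothetical. It assumes every kept row has the same number of fields and that the date column holds at least 5 characters.

```matlab
function rows = readTaggedRows(filename, prefix, delim, dateCol, keepCols)
%READTAGGEDROWS Keep lines starting with PREFIX and split them into columns.
%   Hypothetical helper, for illustration only. Example call:
%   rows = readTaggedRows('mydata.txt', 'ARR++', ':', 6, [2:6 8]);
    fid = fopen(filename, 'r');
    if fid == -1
        error('Could not open %s', filename);
    end
    rows = {};
    tline = fgetl(fid);
    while ischar(tline)
        % Header and footer lines do not start with the prefix and are skipped
        if strncmp(tline, prefix, numel(prefix))
            % Do not collapse delimiters, so empty fields keep their position
            fields = strsplit(tline, delim, 'CollapseDelimiters', false);
            % Shorten the 10-digit date to its first 5 characters
            fields{dateCol} = fields{dateCol}(1:5);
            % Append as a new row (errors if the field count differs between rows)
            rows(end+1, :) = fields;
        end
        tline = fgetl(fid);
    end
    fclose(fid);
    % Keep only the requested columns (indices refer to the split fields)
    rows = rows(:, keepCols);
end
```

Filtering on the prefix means the number of header and footer rows never has to be known in advance.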



