Removing certain lines of text from an ASCII file loaded into MATLAB

I need to remove galaxies whose names start with the three letters 'ESO'. There are 5,980 of these lines that I have to remove. What would I do if I needed to impose this same restriction by removing the 'ESO' entries from the actual ASCII file itself? In the ASCII file, it is the 'ESO' followed by some lines of text.
>> example:
43|ESO293-027|0.00819|-40.48439|3.9|13.27|1.443|0.465|0.375|~|0.26|0.031|~|-19.87|43.069|6.460|0.28|0.29|
  1 Comment
dpb on 12 Jun 2015
The format seems to have been munged by the text-flowing tool. I turned the example into a code line, at which point it appears to be only a single line, but your question says "the 'ESO' followed by some lines of text", with "lines" in the plural.
The question is: is this so, and if so, how many lines follow, and how do you know how many if it is not a fixed number?
The general answer to the question, however, is to do all the "edits" in memory, then simply rewrite the whole file; sequential files are sequential and there's really no supported method to remove records within them. It can be done under special circumstances and with effort, but it is rarely worth the effort, and a file of a few thousand lines isn't all that big...


Accepted Answer

dpb on 12 Jun 2015
Edited: dpb on 13 Jun 2015
dat=textread('gillis.txt','%s','delimiter','\n','whitespace',''); % load in cell array one element per line
dat=dat(cellfun(@isempty,strfind(dat,'ESO'))); % save the non-ESO lines
fid=fopen('newgillis.txt','wt');
for i=1:length(dat),fprintf(fid,'%s\n',dat{i});end
fid=fclose(fid);
The above assumes that the answer to the as-yet-unanswered initial question is that only the single line containing 'ESO' has to go, not some group of N lines following it each time 'ESO' occurs. If the latter is the case, then you need to save the indices in a temporary logical vector and manipulate it to remove those additional locations as well.
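If each 'ESO' line did turn out to be followed by a fixed number of extra lines, the logical-vector idea might be sketched roughly as follows. The sample lines and the value of nExtra are hypothetical, chosen only for illustration; the real count would have to come from the file's actual layout, and the sketch assumes nExtra is smaller than the number of lines.

```matlab
% Hypothetical sample: each ESO line is followed by nExtra lines to drop too.
dat = {'1|NGC0001|a'; '2|ESO293-027|b'; 'extra 1'; 'extra 2'; '3|NGC0002|c'};
nExtra = 2;   % assumed fixed count, not taken from the original file

isESO  = ~cellfun(@isempty, strfind(dat, 'ESO'));  % true where a line contains 'ESO'
remove = isESO;                                    % start with the ESO lines themselves
for k = 1:nExtra
    % shift the ESO flags down by k rows to mark the k-th following line
    shifted = [false(k,1); isESO(1:end-k)];
    remove  = remove | shifted;
end
kept = dat(~remove);                               % lines that survive the filter
```

The kept cell array can then be written out with the same fprintf loop as above. If the number of trailing lines varies, some other marker in the data would be needed to find where each group ends.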
I attached the updated file based on the above...
>> tic,d=dat(cellfun(@isempty,strfind(dat,'ESO')));toc
Elapsed time is 0.001851 seconds.
Don't think Star's concern over runtime will be any problem; this would scale to <20 msec for 10X the number of records.
The result in summary is
>> whos dat d
  Name      Size      Bytes  Class    Attributes
  d         153x1     40666  cell
  dat       173x1     46132  cell
20 records were found/eliminated...

More Answers (1)

Image Analyst on 12 Jun 2015
Try this (untested) code:
inputFileName = 'input data.txt';
fidInput = fopen(inputFileName, 'rt');
if fidInput ~= -1
    % Open output file for writing.
    outputFileName = 'output data.txt';
    fidOutput = fopen(outputFileName, 'wt');
    % Read input lines.
    textLine = fgetl(fidInput);
    while ischar(textLine)
        disp(textLine)
        % See if line starts with ESO (guard against lines shorter than 3 characters).
        startsWithESO = length(textLine) >= 3 && strcmp(textLine(1:3), 'ESO');
        if ~startsWithESO
            % Line does not start with ESO, so keep it.
            % Transfer input line to output file.
            fprintf(fidOutput, '%s\n', textLine);
        end
        % Read the next line only after the current one has been handled;
        % fgetl returns -1 at end of file, which ends the loop.
        textLine = fgetl(fidInput);
    end
    fclose(fidInput);
    fclose(fidOutput);
else
    errorMessage = sprintf('Error opening %s for input', inputFileName);
    fprintf(1, '%s\n', errorMessage);
    uiwait(warndlg(errorMessage));
end
Make changes in the file names, of course, to your names.
  4 Comments
jgillis16 on 12 Jun 2015
For example: 43|ESO293-027|0.00819|-40.48439|3.9|13.27|1.443|0.465|0.375|~|0.26|0.031|~|-19.87|43.069|6.460|0.28|0.29|
65|ESO193-009|0.01479|-47.35683|-0.9|15.03|0.850|0.274|0.221|~|0.26|0.031|~|-19.51|81.236|12.185|0.21|0.22|
The name of each individual ESO is 10 characters long, but the entire line that I need to remove has varying lengths. How would I modify that then?
ESOREMOVAL.m is the code file. My output file is ESO.txt (So I am in the clear?)
Image Analyst on 12 Jun 2015
Can you attach the file to make it easy for me? When I said textLine(1:3) I assumed ESO was at indexes 1, 2, and 3. If it's really in indexes 4, 5, and 6, then you need to use textLine(4:6). Or if it might be anywhere and you just need to find it somewhere but don't care where, then you can use
if ~isempty(strfind(textLine, 'ESO'))
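Since the sample lines are pipe-delimited with the galaxy name in the second field, another option is to test that field specifically rather than searching the whole line. Here is a minimal sketch, assuming the name is always the second '|'-separated field; the truncated sample lines and the non-ESO line are made up for illustration.

```matlab
% Shortened sample lines; the last (non-ESO) line is hypothetical.
txt = {'43|ESO293-027|0.00819|-40.48439|3.9|'; ...
       '65|ESO193-009|0.01479|-47.35683|-0.9|'; ...
       '70|NGC0001|0.01500|-10.00000|1.0|'};

% '^[^|]*\|ESO' : start of line, the first field, a '|', then 'ESO'
% opening the second field. Works regardless of how long the line is.
isESO = ~cellfun(@isempty, regexp(txt, '^[^|]*\|ESO', 'once'));
kept  = txt(~isESO);   % lines whose second field does not start with 'ESO'
```

Inside a line-by-line loop, the same pattern could be used as `if isempty(regexp(textLine, '^[^|]*\|ESO', 'once'))` to decide whether to keep a line. Unlike a plain strfind, this would not match 'ESO' appearing somewhere later in the line.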
