Load a file and sort out bad data
I have a text file, like the one shown below, that I want to load into MATLAB. Each line should have the format [temp rate type], but I want any lines that are not in the correct format to be filtered out. For example, a file could look like this:
25 0.109 1
corrupted line
20 0.096 2
100 0 3
15 0.517 3
15 0.8 5
35 1.086 4
3
40 0.934 2
3 3 3 3 3 3
35 0.109 1
22 0.100 1
21 0.100 2
16 0.600 3
17 0.850 5
32 1.080 4
45 0.950 2
37 0.110 1
Any letters, and any lines that are not in the correct format (3 numbers per line), should be ignored when loading the file. How can I do this? I have tried different combinations of fgetl(), scanf(), and textscan(), but I am not sure how to do it. Please help me if you can.
Adam
Accepted Answer
Jason Nicholson
on 16 Jun 2014
Edited: Jason Nicholson
on 17 Jun 2014
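Use the lower-level function fgetl, which reads one line at a time and returns it as a character vector. You can then read the file in a while loop, parse each line with sscanf, and keep only the lines that yield exactly three numbers.
The code below (also in the attached .m file) parses the data you listed in your post.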
% Preallocate a buffer. Adjust this if you know the maximum number of
% rows in advance.
bufferSize = 1e4;
data = zeros(bufferSize, 3);

% use this for more flexibility
%DELIMITERS = '\t , ;'; % this is tab, space, comma, or semicolon as delimiter

numberOfGoodLinesSaved = 0;

fileID = fopen('data.txt', 'r');
if fileID == -1
    error('Could not open data.txt');
end

while ~feof(fileID)
    currentLine = fgetl(fileID);

    % fgetl returns -1 (not text) when there is nothing left to read,
    % for example when the file is empty
    if ~ischar(currentLine)
        break;
    end

    % parse current line; sscanf returns a column vector of the numbers
    % it could read before hitting anything non-numeric
    currentLineNumerical = sscanf(currentLine, '%f');

    % use this line instead if you need the DELIMITER to vary or you want
    % more flexibility
    % currentLineNumerical = sscanf(currentLine, ['%f%*[' DELIMITERS ']']);

    % the line is valid only if exactly 3 numbers were read
    if numel(currentLineNumerical) == 3
        numberOfGoodLinesSaved = numberOfGoodLinesSaved + 1;
        data(numberOfGoodLinesSaved, :) = currentLineNumerical.';
    end

    % check buffer size is big enough. double the buffer size if needed
    if numberOfGoodLinesSaved == bufferSize
        bufferSize = 2*bufferSize;
        data(numberOfGoodLinesSaved+1:bufferSize, :) = 0;
    end
end % end while loop

fclose(fileID);

% delete extra lines in buffer
data(numberOfGoodLinesSaved+1:end, :) = [];
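One thing to be aware of: the loop above accepts any line from which sscanf extracts exactly three numbers, so a line with three numbers followed by stray text would still be kept. If you want to reject those lines as well, a possible variation is to look at the extra outputs of sscanf. This is only a sketch: it runs on a few lines held in a cell array instead of a file, the variable names (sampleLines, nGood) are made up for illustration, and the last sample line is a hypothetical case that is not in your file. It assumes that the third output of sscanf (the error message) is empty only when the whole line matched the format.

```matlab
% A few lines from the question, plus one hypothetical line with
% three numbers followed by text
sampleLines = {'25 0.109 1', 'corrupted line', '20 0.096 2', '3', ...
               '3 3 3 3 3 3', '40 0.934 2', '15 0.8 5 trailing text'};

data = zeros(numel(sampleLines), 3);   % upper bound: every line is valid
nGood = 0;

for k = 1:numel(sampleLines)
    % count  = how many numbers were read
    % errmsg = non-empty when sscanf stopped on something it could not convert
    [values, count, errmsg] = sscanf(sampleLines{k}, '%f');

    % keep the line only if it holds exactly 3 numbers and nothing else
    if count == 3 && isempty(errmsg)
        nGood = nGood + 1;
        data(nGood, :) = values.';
    end
end

data = data(1:nGood, :);   % trim the unused rows
disp(data);
```

To use the same check in the file-reading loop, request the three outputs from sscanf there and add isempty(errmsg) to the validity test.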