readtable Delimiter option on two similar files gives differing results
Is there any reason why using readtable to open the following two CSV files produces different results?
I'm using readtable because it can auto-detect how many lines to skip, and it generally works well, except for the case above, and I can't see why. My aim is to get the real data into a uitable.
try
[file,folder]=uigetfile({'*.csv';'*.xls'},'Open Image',app.startfolder);
catch
[file,folder]=uigetfile({'*.csv';'*.xls'},'Open Image','C:\');
end
fullpath=fullfile(folder,file);
app.startfolder=folder;
T = readtable(fullpath,MissingRule="omitrow",Delimiter=","); %Delimiter="tab"
app.UITable.Data=table2array(T);
This is what I am seeing:
I have tried omitting the Delimiter option in readtable, but with no luck.
(Note that my file headers can differ, which is why I want to avoid skipping a known number of rows.)
4 Comments
Eric Sofen
on 17 Jun 2024
If the header structure is consistent between files, the NumHeaderLines argument in readtable will help to start parsing the CSV from the right line and not get tripped up by the commas in the date line.
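Since the number of header lines can vary between files, one possible way to apply this suggestion is to locate the column-name line first and derive NumHeaderLines from it. The following is only a sketch under assumptions: the file contents are made up for illustration, and it assumes the column-name line starts with 'Point', as in the accepted answer.

```matlab
% Hypothetical file contents, written to a temporary file so the sketch runs
% on its own. In real use, F would be the path returned by uigetfile.
lines = ["Some report title"; ...
         "Date, 12 Jun 2024, 10:15, UTC"; ...
         "Point, X, Y, Z"; ...
         "1, 0.5, 1.5, 2.5"; ...
         "2, 0.6, 1.6, 2.6"];
F = [tempname '.csv'];
fid = fopen(F,'w');
fprintf(fid,'%s\n',join(lines,newline));
fclose(fid);

% Find the column-name line (assumed to start with 'Point')
S = readlines(F);
idx = find(startsWith(S,'Point'),1);
if isempty(idx)
    error('No line starting with ''Point'' was found in %s',F);
end

% Skip everything before that line and read it as the variable names
T = readtable(F,'NumHeaderLines',idx-1,'Delimiter',',','ReadVariableNames',true);

delete(F); % remove the temporary file
```

Because the header length is computed per file, this does not rely on a fixed number of rows to skip, but it does rely on the 'Point' marker being present.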
Accepted Answer
Voss
on 12 Jun 2024
One problem seems to be that the date/time line in the header has 3 commas in it, which for file B causes readtable to try to treat that line as part of the data section since there are also 3 commas per line there. (The data section in file A has 5 commas per line, so the date/time line is not confused for data in that case.) I couldn't find a way around that using various options in readtable/readmatrix (but I didn't try very hard - there may well be a way to do it).
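To see this kind of mismatch, counting the commas on each line is a quick check. The lines below are hypothetical, made up to mimic the situation described for file B (a date/time header line with 3 commas and data lines with 3 commas). In real use, S would come from readlines on the actual file.

```matlab
% Hypothetical lines standing in for a file like B (not the real contents)
S = ["Some report title"; ...
     "Date, 12 Jun 2024, 10:15, UTC"; ...
     "Point, X, Y, Z"; ...
     "1, 0.5, 1.5, 2.5"; ...
     "2, 0.6, 1.6, 2.6"];

% Number of commas on each line. A header line with the same count as the
% data lines is a candidate for being mistaken for data.
nCommas = count(S,",");
disp(table(S,nCommas))
```

In this made-up example the date/time line has the same comma count as the data lines, which matches the confusion described above for file B.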
One solution is to write your own reading function. I've written one such function (read_this_file), and it's given below.
type('A.txt') % show file contents for reference
MA = read_this_file('A.txt') % get numeric matrix data
type('B.txt') % show file contents for reference
MB = read_this_file('B.txt') % get numeric matrix data
function M = read_this_file(F)
% read lines of file F into string array S
S = readlines(F);
% keep the line after the one that starts with 'Point', and all
% the lines after that, and replace the commas with spaces
S = strrep(S(find(startsWith(S,'Point'),1)+1:end),',',' ');
% run sscanf(_,'%f') on each line, capturing the numbers they contain
C = arrayfun(@(s)sscanf(s,'%f'),S,'UniformOutput',false);
% put those numbers into a matrix with the correct orientation
M = [C{:}].';
end
4 Comments
Voss
on 12 Jun 2024
Edited: Voss
on 12 Jun 2024
Here's a modification to read_this_file that also optionally returns the column names, so you can use them in the uitable.
[MA,HA] = read_this_file('A.txt') % get numeric matrix data and column names
f = figure('Position',[1 1 510 120]);
t = uitable(f,'Position',[10 10 490 100]);
t.Data = MA;
t.ColumnName = HA;
[MB,HB] = read_this_file('B.txt') % get numeric matrix data and column names
f = figure('Position',[1 1 510 120]);
t = uitable(f,'Position',[10 10 490 100]);
t.Data = MB;
t.ColumnName = HB;
function [M,H] = read_this_file(F)
% read lines of file F into string array S
S = readlines(F);
% find the line that starts with 'Point'
idx = find(startsWith(S,'Point'),1);
% if column names were requested, take them from this line
if nargout > 1
H = strtrim(split(S(idx),','));
end
% keep all the lines after that line, and replace the commas with spaces
S = strrep(S(idx+1:end),',',' ');
% run sscanf(_,'%f') on each line, capturing the numbers they contain
C = arrayfun(@(s)sscanf(s,'%f'),S,'UniformOutput',false);
% put those numbers into a matrix with the correct orientation
M = [C{:}].';
end