Observing Error while reading .csv files
I am having trouble extracting data from .csv files. The program reads the files from several subfolders inside a main folder. The contents of each CSV file repeat in blocks like the sample below:
key operation statistics
session time (sec) 1
packet try count 18
packet succeed count 17
rx receptions 1
packet-1 success/failure count 18/0
packet-1 success/failure count 18/0
Signal Strength of packet-1 -52.5
Signal Strength of packet-2 -53.5
key operation statistics
session time (sec) 5
packet try count 32
packet succeed count 31
rx receptions 1
packet-1 success/failure count 28/0
packet-1 success/failure count 28/0
Signal Strength of packet-1 -59.5
Signal Strength of packet-2 -60
% Column 1 holds the text description and column 2 holds the corresponding value in each csv file.
% The global TX power is set to -20 dBm. To override it for a folder, a file with the extension .csv_power is saved there, e.g. 6.csv_power, which the code reads as a TX value of 6 dB.
My main program file is as below:
%Variables declared initially
MeasurementFolder = 'Name of the main folder where the CSV files are'; %matlab files are kept with the main folder
subFolders = GetSubDirsFirstLevelOnly(MeasurementFolder); %quite not sure this line can perfectly read all the files inside a sub folder
if exist([MeasurementFolder, '/allMeasurements.mat'], 'file')
load ([MeasurementFolder, '/allMeasurements.mat'])
else
allMeasurements = struct(); % make a struct to contain everything
for s = 1:length(subFolders)
allMeasurements(s).data = struct();
allMeasurements(s).name = [MeasurementFolder, '/', subFolders{s}];
allMeasurements(s).inputCSVFileNames = char(dir([allMeasurements(s).name, '/*.csv']).name);
allMeasurements(s).TX_power = str2double(extractBefore(string(dir([allMeasurements(s).name, '/*.csv_power']).name), '.csv_power'));
if isnan(allMeasurements(s).TX_power)
allMeasurements(s).TX_power = TX_power; % saving global value (from variables) if no specific one
end
end
for n = 1:size({allMeasurements.name},2)
allMeasurements(n).name;
allMeasurements(n).data = extractInfoFromCSV(allMeasurements(n).data, allMeasurements(n).inputCSVFileNames, allMeasurements(n).name);
end
for n = 1:size({allMeasurements.name},2)
allMeasurements = plotSS(allMeasurements, n, figureDirection, prctileLow, prctileHigh, filteringParameter);
end
%(the code continues)
The function which reads the info from the files:
function [outputStruct] = extractInfoFromCSV(inputStruct, inputCSVFileNames, inputFolder)
outputStruct = inputStruct;
keyline_stats = "key operation statistics";
keyline_session_time = " session time (sec)";
keyline_packet_try_count = " packet try count ";
keyline_packet_succeed_count = " packet succeed count ";
keyline_packet1_sf = " packet-1 success/failure count ";
keyline_packet2_sf = " packet-2 success/failure count ";
keyline_packet1_strength = " Signal Strength of packet-1 ";
keyline_packet2_strength = " Signal Strength of packet-2 ";
varTypes = {'double','double','double','double','double','double','double','double','double'};
varNames = {'session time (sec)','packet try count','packet succeed count','packet-1 success count','packet-1 failure count','packet-2 success count','packet-2 failure count','Signal Strength of packet-1','Signal Strength of packet-2'};
% The packet-1 and packet-2 success/failure counts are separated by "/". They are split and stored in separate columns as success and failure counts. Within each csv file these counts are cumulative, so only the last value in the file is kept and all earlier values are ignored.
for n = 1:size(inputCSVFileNames,1)
lines = readlines([inputFolder, '/',inputCSVFileNames(n,:)]);
T_temp = readtable([inputFolder, '/', inputCSVFileNames(n,:)], 'Delimiter', ',');
for ci =1:numel(lines)
if strcmp(keyline_stats, (lines(ci)))
T_temp.("session time (sec)")(ci+1) = str2double(extractAfter(lines(ci+1),keyline_session_time));
T_temp.("packet try count")(ci+1) = str2double(extractAfter(lines(ci+2),keyline_packet_try_count));
T_temp.("packet succeed count")(ci+1) = str2double(extractAfter(lines(ci+3),keyline_packet_succeed_count));
T_temp.("packet-1 success count")(ci+1) = str2double(extractBetween(lines(ci+5),keyline_packet1_sf, "/"));
T_temp.("packet-1 failure count")(ci+1) = str2double(extractAfter(extractAfter(lines(ci+5),"/"),"/"));
T_temp.("packet-2 success count")(ci+1) = str2double(extractBetween(lines(ci+6),keyline_packet2_sf, "/"));
T_temp.("packet-2 failure count")(ci+1) = str2double(extractAfter(extractAfter(lines(ci+6),"/"),"/"));
T_temp.("Signal Strength of packet-1")(ci+1) = str2double(extractAfter(lines(ci+7),keyline_packet1_strength));
T_temp.("Signal Strength of packet-2")(ci+1) = str2double(extractAfter(lines(ci+8),keyline_packet2_strength));
end
T_filtered = T_temp(T_temp.("session time (sec)")~=0,:);
end
outputStruct(n).table = T_filtered;
outputStruct(n).name = strcat('distance_', extractBefore(string(inputCSVFileNames(n,:)), '.'));
outputStruct(n).distances = str2double(extractBefore(string(inputCSVFileNames(n,:)), 'm'));
outputStruct(n).packet1_success_last = T_filtered.("packet-1 success count")(end);
outputStruct(n).packet1_failure_last = T_filtered.("packet-1 failure count")(end);
outputStruct(n).packet2_success_last = T_filtered.("packet-2 success count")(end);
outputStruct(n).packet2_failure_last = T_filtered.("packet-2 failure count")(end);
end
end
There are several other functions which are not being shared here for the ease of the replier.
Error observed: Unrecognized field name "name".
Error in main_file (line 45)
for n = 1:size({allMeasurements.name},2)
What can I do to eliminate the error?
Answers (2)
Stephen23
on 15 Mar 2024
Edited: Stephen23
on 16 Mar 2024
Your code is fragile and complex. You should aim to simplify it and make it more generalised:
- avoid having lots of hard-coded text values,
- avoid cramming too much onto one line of code,
- avoid CHAR() in lists of filenames,
- use FULLFILE instead of concatenating paths with hard-coded path separators.
For example:
P = '.'; % absolute or relative path to where the file is saved
F = '500m.csv';
C = readcell(fullfile(P,F), 'Delimiter',',')
X = startsWith(C(:,1),'key');
Y = cellfun(@ischar,C(:,2));
T = regexp(C(Y,2),'(\d+)/(\d+)','tokens','once');
C(Y,2) = num2cell(str2double(vertcat(T{:})),2);
T = cell2table(C);
T.Group = cumsum(X);
U = unstack(T,'C2','C1', 'VariableNamingRule','preserve');
U = convertvars(U,@iscell,@(c)vertcat(c{:}))
Note that you can easily split the success/failure matrices into column vectors:
U = splitvars(U)
And also give those columns better names:
V = U.Properties.VariableNames;
H = @(idx,varargin) varargin{str2double(idx)};
V = regexprep(V,'(\w+)/(\w+)\s*(\w+)_(\d+)$','${H($4,$1,$2)} $3');
U.Properties.VariableNames = V
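The points about CHAR() and FULLFILE can be sketched as follows. This is only an illustration under assumptions: the temporary folder, the two file names, and the added fields "data" and "distance" are hypothetical. CHAR() pads shorter file names with trailing spaces so that all rows have the same length, whereas indexing the structure returned by DIR keeps each name intact.

```matlab
% Hypothetical demo folder with two CSV files whose names differ in length.
folder = tempname;
mkdir(folder);
writematrix([1 2], fullfile(folder, '5m.csv'));
writematrix([3 4], fullfile(folder, '500m.csv'));

S = dir(fullfile(folder, '*.csv'));          % struct array, one element per file
for k = 1:numel(S)
    F = fullfile(S(k).folder, S(k).name);    % no hard-coded path separators
    S(k).data = readmatrix(F);               % keep the imported data next to the name
    S(k).distance = sscanf(S(k).name, '%f'); % leading number in the file name
end

rmdir(folder, 's');                          % remove the demo folder again
```

Storing the imported data in the same structure that DIR returns keeps file names and data aligned without any character matrix.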
0 Comments
Pratyush Swain
on 14 Mar 2024
Edited: Pratyush Swain
on 15 Mar 2024
Hi Mohaiminul,
Please check that the "GetSubDirsFirstLevelOnly" function correctly returns the names of the first-level subdirectories. If it returns an empty list, the first loop never runs, so "allMeasurements" remains a struct without any fields, including "name", which produces the error.
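A minimal sketch of how this error can arise, assuming the folder search returns an empty list (the values here are hypothetical, not taken from your data):

```matlab
% Hypothetical reproduction: subFolders is empty, so the loop body never runs.
subFolders = {};                 % assumed result of an unsuccessful folder search
allMeasurements = struct();      % 1x1 struct with no fields
for s = 1:length(subFolders)
    allMeasurements(s).name = subFolders{s};   % never executed
end

hasName = isfield(allMeasurements, 'name');    % false: the field was never created
isEmptyStruct = isempty(allMeasurements);      % false: struct() is 1x1, not empty

msg = '';
try
    n = size({allMeasurements.name}, 2);       % same expression as the failing line
catch err
    msg = err.message;                         % text of the error MATLAB reports
end
```

Displaying "msg" after running this shows the error raised for the missing field.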
You can follow the given implementation:
function [subDirsNames] = GetSubDirsFirstLevelOnly(parentDir)
% Get a list of all files and folders in this folder.
files = dir(parentDir);
names = {files.name};
% Get a logical vector that tells which is a directory.
dirFlags = [files.isdir] & ...
~strcmp(names, '.') & ~strcmp(names, '..');
% Extract only those that are directories.
subDirsNames = names(dirFlags);
end
You can also add a guard so that the loop runs only when "allMeasurements" actually has a "name" field. Note that struct() returns a 1x1 struct, so an ISEMPTY check alone would not catch this case.
if isfield(allMeasurements, 'name')
for n = 1:numel(allMeasurements)
% Your loop implementation
end
else
warning('No measurements found.');
end
Hope this helps.
EDIT: The previous version of this answer contained a "GetSubDirsFirstLevelOnly" implementation that is not a general solution and should be avoided; it has now been corrected.
2 Comments
Stephen23
on 15 Mar 2024
Edited: Stephen23
on 15 Mar 2024
Note that this code:
subDirsNames = {subDirs(3:end).name};
is a buggy attempt to avoid the dot-directories. This topic has been extensively discussed and explained previously on this forum:
https://www.mathworks.com/matlabcentral/answers/1699230-folder-listing-with-dir-on-mac#answer_945260
As Walter Roberson wrote: "That code is wrong on every operating system that MATLAB has ever run on".
The recommended approach is to specify a non-empty DIR search name, or use e.g. ISMEMBER, SETDIFF, or similar:
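A minimal sketch of that approach, assuming a hypothetical temporary folder with two subfolders ("run_A" and "run_B" are made-up names used only for the demonstration):

```matlab
% Hypothetical demo folder containing two subfolders.
parentDir = tempname;
mkdir(fullfile(parentDir, 'run_A'));
mkdir(fullfile(parentDir, 'run_B'));

S = dir(fullfile(parentDir, '*'));        % non-empty DIR search name
S = S([S.isdir]);                         % keep directories only
names = setdiff({S.name}, {'.', '..'});   % drop the dot-directories by name

% Same filter with ISMEMBER, which keeps the order returned by DIR:
keep = ~ismember({S.name}, {'.', '..'});
namesAlt = {S(keep).name};

rmdir(parentDir, 's');                    % remove the demo folder again
```

Both variants remove the dot-directories by name, so they do not depend on where '.' and '..' happen to appear in the DIR output.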
Pratyush Swain
on 15 Mar 2024
Thanks, Stephen, for pointing this out! I missed your comment making the same point in the original thread; I have edited the function now.