Frequency of unique items per month from timetable data
5 views (last 30 days)
Show older comments
I have a timetable which is read into matlab from an excel sheet with 2 columns: Timestamp and Stat. Stat is a fault code from a machine. Example below-
2120×1 timetable
Timestamp Stat
____________________ ____
24-Jan-2024 11:47:31 0
23-Jan-2024 15:32:32 9999
23-Jan-2024 00:03:41 77
23-Jan-2024 00:00:13 294
22-Jan-2024 10:08:14 0
: :
11-Jan-2018 10:37:42 -20
10-Jan-2018 08:40:38 9999
10-Jan-2018 08:40:16 9998
10-Jan-2018 08:37:33 9999
10-Jan-2018 08:37:30 -21
05-Jan-2018 14:57:37 -20
05-Jan-2018 13:05:53 9999
05-Jan-2018 12:09:39 159
Display all 2120 rows.
I would like to find the frequency of each fault code per month however there are over 100 possible fault codes but usually only 5/6 unique codes occuring multiple times each month so it seems wasteful to have a bin for every possible fault code as this would result in a lot of empty bins.
Is there a way to calculate the frequency of each fault code on a monthly basis based on the individual fault codes appearing in each month? I have considered groupcount and histcount functions but struggled to achieve my objective using either of these based on the Matlab documentation.
0 Comments
Accepted Answer
VINAYAK LUHA
on 30 Jan 2024
Hi Paddy,
From what I gather, you possess a timetable containing two columns: 'Timestamp' and 'Stat'. The 'Stat' column contains fault codes that range up to 100 unique values, though typically you only observe around 5 to 6 distinct codes in any given month.
Further, you're looking forward to calculate the monthly frequency for each fault codes in an efficient manner that avoids creating excessive bins for codes that don't appear.
As a workaround, you may use a data structure of nested maps, with the primary map keyed by year-month and the secondary map tracking the counts of fault codes for that specific year and month.
This approach has better space complexity but worse time complexity than your suggested solution.
Here's the code snippet for the above mentioned method for your understanding
% Initialize the map to hold the frequency data
faultFrequencyMap = containers.Map('KeyType', 'char', 'ValueType', 'any');
% Loop through each row of the timetable
for i = 1:height(tt)
% Extract the year and month as a string key (e.g., '2024-01')
yearMonthKey = datestr(tt.Timestamp(i), 'yyyy-mm');
% Check if this year-month key already exists in the map
if isKey(faultFrequencyMap, yearMonthKey)
% If the key exists, retrieve the existing fault code map
monthMap = faultFrequencyMap(yearMonthKey);
else
% If the key does not exist, create a new map for fault codes
monthMap = containers.Map('KeyType', 'int32', 'ValueType', 'int32');
faultFrequencyMap(yearMonthKey) = monthMap;
end
% Get the fault code for this row
faultCode = tt.Stat(i);
% Update the fault code frequency in the monthMap
if isKey(monthMap, faultCode)
% If the fault code exists, increment the count
monthMap(faultCode) = monthMap(faultCode) + 1;
else
% If the fault code does not exist, initialize the count to 1
monthMap(faultCode) = 1;
end
% Update the monthMap in the faultFrequencyMap
faultFrequencyMap(yearMonthKey) = monthMap;
end
% Display the faultFrequencyMap
keys = faultFrequencyMap.keys;
for k = 1:length(keys)
disp(['Year-Month: ', keys{k}]);
monthMap = faultFrequencyMap(keys{k});
faultCodes = monthMap.keys;
for f = 1:length(faultCodes)
disp([' Fault Code: ', num2str(faultCodes{f}), ...
', Frequency: ', num2str(monthMap(faultCodes{f}))]);
end
end
Hope this answers your query
Regards
Vinayak luha
More Answers (1)
Eric Sofen
on 27 Feb 2024
I think groupcounts will also solve your problem (it would be helpful to have some sample data in the question in a form I could use directly rather than recreating it):
groupcounts(t,["Timestamp","Stats"],"month")
0 Comments
See Also
Categories
Find more on Data Preprocessing in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!