Combine multiple varying length double arrays into one timetable

41 views (last 30 days)
I have a .mat file with multiple (a few thousand) individual 2-column double arrays, all named for a different variable.
These arrays are of varying length (from a few to a few million rows), but for each of them the first column always is the timestamp in datenum format, the second column is the data.
I would like to get them into a single timetable where each column is the data from each array, with the column name taken from the variable name.
e.g.
log.mat
> Var1
> Time
> Data
> Var2
> Time
> Data
...
> VarN
> Time
> Data
into
Workspace
> TT
> Time
> Var1
> Var2
...
> VarN
I assume there has to be a better way of doing this than iteratively calling each variable from the .mat file, concatentating horizontally into one huge array, and then following this guide to convert it to a timetable?
  1 Comment
dpb
dpb ungefär en timme ago
Pretty much @Voss gave the way; the problem is you have variable-length data arrays and a table or timetable must have same number of elements for each variable. This means you'll have to pad the shorter to the length of the longer.
And, unless the various variables also have consistent tmestamps, you'll also have to then synchronize the times to put the same variables at the same time in the table.
You can avoid having to write each variable name directly by using the functional form for load and return the variables as a struct, then you can iterate over the fieldnames in the stuct.

Sign in to comment.

Answers (2)

Stephen23
Stephen23 on 20 Dec 2024 at 15:00
Edited: Stephen23 on 20 Dec 2024 at 16:21
"I assume there has to be a better way of doing this than iteratively calling each variable..."
Note that synchronize accepts 1..N input timetables, which means you can synchronize all of them in one go. And that is probably what you should do, using a simple comma-separated list.
Fake data:
F1 = @(n) [datenum(2024,12,20,1,2,fix(600*rand(n,1))/10),randi(9,n,1)];
S.Var1 = F1(13);
S.Var2 = F1(1e2);
S.Var3 = F1(23)
S = struct with fields:
Var1: [13x2 double] Var2: [100x2 double] Var3: [23x2 double]
Of course in your actual code you would LOAD into an output variable (which you should be doing anyway):
S = load(..);
Then synchronize:
F2 = @(a) array2timetable(a(:,2),'RowTimes',datetime(a(:,1),'ConvertFrom','datenum'));
C = struct2cell(structfun(F2,S, 'UniformOutput',false))
C = 3x1 cell array
{ 13x1 timetable} {100x1 timetable} { 23x1 timetable}
T = synchronize(C{:});
T.Properties.VariableNames = fieldnames(S)
T = 118x3 timetable
Time Var1 Var2 Var3 ____________________ ____ ____ ____ 20-Dec-2024 01:02:00 NaN NaN 6 20-Dec-2024 01:02:00 NaN 1 NaN 20-Dec-2024 01:02:01 NaN 5 NaN 20-Dec-2024 01:02:02 NaN 8 NaN 20-Dec-2024 01:02:02 NaN NaN 5 20-Dec-2024 01:02:02 NaN 3 NaN 20-Dec-2024 01:02:03 NaN 8 NaN 20-Dec-2024 01:02:03 NaN 1 NaN 20-Dec-2024 01:02:04 NaN 3 NaN 20-Dec-2024 01:02:04 NaN 9 NaN 20-Dec-2024 01:02:05 NaN 8 NaN 20-Dec-2024 01:02:06 4 NaN NaN 20-Dec-2024 01:02:06 NaN NaN 4 20-Dec-2024 01:02:07 6 NaN NaN 20-Dec-2024 01:02:07 NaN 7 NaN 20-Dec-2024 01:02:07 NaN 7 NaN

Star Strider
Star Strider ungefär en timme ago
Edited: Star Strider 11 minuter ago
First, convert the datenum values to datetime arrays using the ConvertFrom='datenum' name-value pair, either reading each of them in as a timetable using the readtimetable function,or reading them in as a table and then using thee table2timetable function to convert them. I’m not certain what you want to do after that, however if I understand your question correctly, the synchronize function would be my approach. This can be done in a sinngle synchronize call, or it may be necesary to do this in a loop, adding each new timetable in each iteration.
EDIT — (20 Dec 2024 at 14:37)
That approach would look something like this —
TT{1} = timetable(datetime(2024,12,18)+hours(sort(rand(10,1)*10)), randn(10,1), VariableNames={'Voltage 1'});
TT{2} = timetable(datetime(2024,12,18)+hours(sort(rand(12,1))*10+4), randn(12,1), VariableNames={'Voltage 2'});
TT{3} = timetable(datetime(2024,12,19)+hours(sort(rand(8,1))), randn(8,1), VariableNames={'Voltage 3'});
for k = 1:numel(TT)
disp("height(TT{"+k+"} = "+height(TT{k}))
disp(TT{k})
end
height(TT{1} = 10
Time Voltage 1 ____________________ _________ 18-Dec-2024 00:30:09 0.0025436 18-Dec-2024 01:12:16 1.5866 18-Dec-2024 01:46:06 0.55173 18-Dec-2024 03:27:36 0.030876 18-Dec-2024 03:57:06 -1.7818 18-Dec-2024 04:51:49 -0.23971 18-Dec-2024 06:52:40 -0.13275 18-Dec-2024 07:26:40 -0.49796 18-Dec-2024 08:09:45 -1.1161 18-Dec-2024 09:00:27 1.0313
height(TT{2} = 12
Time Voltage 2 ____________________ _________ 18-Dec-2024 04:02:10 -1.2783 18-Dec-2024 05:12:40 -0.65534 18-Dec-2024 07:29:17 1.3311 18-Dec-2024 08:36:45 -0.86763 18-Dec-2024 08:44:01 -1.1064 18-Dec-2024 09:17:04 0.44575 18-Dec-2024 10:06:29 2.3994 18-Dec-2024 11:48:09 -0.61237 18-Dec-2024 12:01:18 -1.0362 18-Dec-2024 12:08:49 -0.12372 18-Dec-2024 12:25:15 0.38295 18-Dec-2024 13:16:56 -0.17534
height(TT{3} = 8
Time Voltage 3 ____________________ _________ 19-Dec-2024 00:04:16 2.1787 19-Dec-2024 00:08:52 1.1046 19-Dec-2024 00:17:02 -0.4657 19-Dec-2024 00:24:29 -1.1462 19-Dec-2024 00:29:36 -0.52991 19-Dec-2024 00:34:15 0.035328 19-Dec-2024 00:42:22 1.114 19-Dec-2024 00:45:42 -0.75405
TT_Combined = TT{1};
for k = 2:numel(TT)
TT_Combined = synchronize(TT_Combined, TT{k});
end
disp("size(TT_Combined = ["+size(TT_Combined,1)+" "+size(TT_Combined,2)+"]")
size(TT_Combined = [30 3]
disp(TT_Combined)
Time Voltage 1 Voltage 2 Voltage 3 ____________________ _________ _________ _________ 18-Dec-2024 00:30:09 0.0025436 NaN NaN 18-Dec-2024 01:12:16 1.5866 NaN NaN 18-Dec-2024 01:46:06 0.55173 NaN NaN 18-Dec-2024 03:27:36 0.030876 NaN NaN 18-Dec-2024 03:57:06 -1.7818 NaN NaN 18-Dec-2024 04:02:10 NaN -1.2783 NaN 18-Dec-2024 04:51:49 -0.23971 NaN NaN 18-Dec-2024 05:12:40 NaN -0.65534 NaN 18-Dec-2024 06:52:40 -0.13275 NaN NaN 18-Dec-2024 07:26:40 -0.49796 NaN NaN 18-Dec-2024 07:29:17 NaN 1.3311 NaN 18-Dec-2024 08:09:45 -1.1161 NaN NaN 18-Dec-2024 08:36:45 NaN -0.86763 NaN 18-Dec-2024 08:44:01 NaN -1.1064 NaN 18-Dec-2024 09:00:27 1.0313 NaN NaN 18-Dec-2024 09:17:04 NaN 0.44575 NaN 18-Dec-2024 10:06:29 NaN 2.3994 NaN 18-Dec-2024 11:48:09 NaN -0.61237 NaN 18-Dec-2024 12:01:18 NaN -1.0362 NaN 18-Dec-2024 12:08:49 NaN -0.12372 NaN 18-Dec-2024 12:25:15 NaN 0.38295 NaN 18-Dec-2024 13:16:56 NaN -0.17534 NaN 19-Dec-2024 00:04:16 NaN NaN 2.1787 19-Dec-2024 00:08:52 NaN NaN 1.1046 19-Dec-2024 00:17:02 NaN NaN -0.4657 19-Dec-2024 00:24:29 NaN NaN -1.1462 19-Dec-2024 00:29:36 NaN NaN -0.52991 19-Dec-2024 00:34:15 NaN NaN 0.035328 19-Dec-2024 00:42:22 NaN NaN 1.114 19-Dec-2024 00:45:42 NaN NaN -0.75405
Another approach could be:
TT_Combined = synchronize(cat(1,TT{:}))
However this synchronizes them all in to one timetable with one variable, not separate variables, so all the variable names would have to be the same.
.

Products


Release

R2023b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!