How do I merge/average several datasets (Time, "Y," "Z")?

5 views (last 30 days)
Kevin Chung
Kevin Chung on 9 Jun 2021
Answered: Jaynik on 18 Apr 2024 at 5:45
Dear community,
I have a set of measured data (Time, Current, Voltage) and want to merge several measurments in one curve. As the data have different time lengths but specific segments, how can I merge the data since the specific segments have different lengths and time?
I want to merge the [TotTime, TotCurrent, TotVoltage] OR merge [TotTime, TotCurrent] and [TotTime, TotVoltage] separately but with same Time variable.
Please refer to the attached datasets.
Thanks in advance,
Kevin

Answers (1)

Jaynik
Jaynik on 18 Apr 2024 at 5:45
Hi Kevin,
One method to merge the data is take a full outer join based on the 'time' column and take a mean for the same values of time.
The "outerjoin" function can be used to merge the tables as it combines tables based on a key variable ('Time'), and it includes all rows from each table, even if there are no matching rows in the other tables. The 'MergeKeys' option is set to true to indicate that the key column is not repeated in the merged table.
For each time point, the mean has been calculated for the current and voltage across the three datasets ignoring any 'NaN' values.
Following is the code for the same:
load NewCellTestData.mat
load NewCellTestData2.mat
load NewCellTestData3.mat
% Convert the arrays to tables
T1 = array2table([TestData.TotTime, TestData.TotCurrent, TestData.TotVoltage], 'VariableNames', {'Time', 'Current', 'Voltage'});
T2 = array2table([TestData2.TotTime, TestData2.TotCurrent, TestData2.TotVoltage], 'VariableNames', {'Time', 'Current', 'Voltage'});
T3 = array2table([TestData3.TotTime, TestData3.TotCurrent, TestData3.TotVoltage], 'VariableNames', {'Time', 'Current', 'Voltage'});
% Merge the tables using full outer join
mergedTable = outerjoin(T1, T2, 'Keys', 'Time', 'MergeKeys', true);
mergedTable = outerjoin(mergedTable, T3, 'Keys', 'Time', 'MergeKeys', true);
% Compute the mean current and voltage ignoring NaNs
mergedTable.MeanCurrent = mean([mergedTable.Current_T1, mergedTable.Current_T2, mergedTable.Current], 2, "omitnan");
mergedTable.MeanVoltage = mean([mergedTable.Voltage_T1, mergedTable.Voltage_T2, mergedTable.Voltage], 2, "omitnan");
% If MeanCurrent is NaN, choose Current_T1, if that is NaN, choose Current_T2, else choose Current_T3
mergedTable.MeanCurrent(isnan(mergedTable.MeanCurrent)) = mergedTable.Current_T1(isnan(mergedTable.MeanCurrent));
mergedTable.MeanCurrent(isnan(mergedTable.MeanCurrent)) = mergedTable.Current_T2(isnan(mergedTable.MeanCurrent));
mergedTable.MeanCurrent(isnan(mergedTable.MeanCurrent)) = mergedTable.Current(isnan(mergedTable.MeanCurrent));
% Same operation as above for MeanVoltage
mergedTable.MeanVoltage(isnan(mergedTable.MeanVoltage)) = mergedTable.Voltage_T1(isnan(mergedTable.MeanVoltage));
mergedTable.MeanVoltage(isnan(mergedTable.MeanVoltage)) = mergedTable.Voltage_T2(isnan(mergedTable.MeanVoltage));
mergedTable.MeanVoltage(isnan(mergedTable.MeanVoltage)) = mergedTable.Voltage(isnan(mergedTable.MeanVoltage));
% Remove the original current and voltage columns to obtain final 3 columns
mergedTable(:, {'Current_T1', 'Voltage_T1', 'Current_T2', 'Voltage_T2', 'Current', 'Voltage'}) = [];
Since data is broken and not consistent across the three dataset, this approach gives a lot of 'NaN' which are removed. You will need to clean the dataset for better results and use the appropriate statistics measure instead of mean based on your requirements.
I hope this helps!

Products


Release

R2021a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!