Merge two columns in timetable?

12 views (last 30 days)
I'm merging a number of .csv files into one big timetable. Each of the .csv files represents one day's data from a radio station - one column for frequency, one column for amplitude. I'd like for each station to have its own frequency column and its own amplitude column in the final timetable, but since I am adding the columns to the timetable one file at a time using the synchronize() function I wind up with multiple columns for each.
Here's the for loop I'm generating to create the timetable. The original files are named analysis_XXXXXX - Callsign.csv, where XXXXX is the date and Callsign is the station identifier.
for j = 1:length(Files)
% Extract date and call from filename.
meta = extractBefore(FileNames(j), '.csv')
meta = split(meta, ' - ')
date_recorded = char(extractAfter(meta(1), 'analysis_'))
day = str2num([date_recorded(1:2); date_recorded(3:4); date_recorded(5:6)])
start = datetime([2000+day(1) day(2) day(3) 0 0 0]);
Call = meta(2);
foo = readtable(FileNames(j), 'Range','A:E');
foo.Properties.VariableNames = [{'UTC'},{'Freq'},{'FreqErr'},{'Vpk'},{'dBV_Vpk_'}];
foo=table2timetable(foo);
foo.UTC = foo.UTC + start;
foo.Properties.VariableNames = strcat(foo.Properties.VariableNames, '-', Call);
if exist('baz')
baz = synchronize(foo, baz);
else
baz = foo;
end
end
This results in columns named, say, Freq-Station1_foo which has values for Day 1 of data collection and NaNs for Day 2, next to Freq-Station1_baz which has values for Day 2 and NaNs for Day 1. Is there a way I can merge them?
  2 Comments
Cris LaPierre
Cris LaPierre on 2 Oct 2020
It would be easier to offer suggestions if you could share a few of your files. You can use the paperclip icon to attach them to your post.
Kristina Collins
Kristina Collins on 6 Dec 2020
Here's a minimal working example of the problem I'm trying to figure out.
load outdoors
TT1 = head(outdoors); TT1.PressureHg = [];
TT2 = TT1; TT2.TemperatureF = [];
TT2.Time = TT2.Time + hours(4);
outerjoin(TT1, TT2, "MergeKeys", 1)
If you execute that, you'll see that you end up with two Humidity columns that have different names. I'm looking for a command that will combine both into a single column. Is there a way to do this with the join() or outerjoin() commands?

Sign in to comment.

Accepted Answer

Seth Furman
Seth Furman on 21 Oct 2020
One simple way to combine these variables is to add them together in a new timetable variable and delete the old variables.
For example, assuming the table looks something like this,
>> baz = synchronize(t1,t2)
baz =
9×8 timetable
UTC Freq_t1 FreqErr_t1 Vpk_t1 dBV_Vpk__t1 Freq_t2 FreqErr_t2 Vpk_t2 dBV_Vpk__t2
____________________ _______ __________ ______ ___________ _______ __________ ______ ___________
01-Jan-2020 00:00:00 9 10 8 4 4 NaN 3 2
01-Jan-2020 06:00:00 NaN 2 10 7 8 7 8 3
01-Jan-2020 12:00:00 2 NaN 7 2 8 7 3 NaN
01-Jan-2020 18:00:00 10 10 1 8 2 NaN 6 3
02-Jan-2020 00:00:00 7 5 9 1 5 2 NaN 9
02-Jan-2020 06:00:00 1 9 10 3 5 5 9 3
02-Jan-2020 12:00:00 3 NaN 7 1 7 10 10 10
02-Jan-2020 18:00:00 6 5 NaN NaN 8 4 6 4
03-Jan-2020 00:00:00 10 10 8 9 8 NaN 2 2
we can do
>> baz.FreqErr = sum([baz.FreqErr_t1 baz.FreqErr_t2],2,"omitnan");
>> baz = removevars(baz,["FreqErr_t1","FreqErr_t2"])
baz =
9×7 timetable
UTC Freq_t1 Vpk_t1 dBV_Vpk__t1 Freq_t2 Vpk_t2 dBV_Vpk__t2 FreqErr
____________________ _______ ______ ___________ _______ ______ ___________ _______
01-Jan-2020 00:00:00 9 8 4 4 3 2 10
01-Jan-2020 06:00:00 NaN 10 7 8 8 3 9
01-Jan-2020 12:00:00 2 7 2 8 3 NaN 7
01-Jan-2020 18:00:00 10 1 8 2 6 3 10
02-Jan-2020 00:00:00 7 9 1 5 NaN 9 7
02-Jan-2020 06:00:00 1 10 3 5 9 3 14
02-Jan-2020 12:00:00 3 7 1 7 10 10 10
02-Jan-2020 18:00:00 6 NaN NaN 8 6 4 9
03-Jan-2020 00:00:00 10 8 9 8 2 2 10
  7 Comments
Kristina Collins
Kristina Collins on 10 Dec 2020
Thanks, Seth! I figured out a solution, but yours is much better articulated.
This code exists inside a for loop where each file is assigned to the timetable "audio," and all the variables are unified inside a timetable under the name "data." (Terrible variable names, yes.) The label for each station is a string with the variable name "Call", which changes on each iteration of the for loop.
% Create a big ol' timetable with all our audio data:
audio.Properties.VariableNames=Call;
if exist('data') ==1 % If we're not on our first round
if any(strcmp(Call, data.Properties.VariableNames))
fprintf('We already have some data from this station...')
audio = vertcat(audio, timetable(data.Time, data{:, Call}, 'VariableNames', Call));
data=removevars(data, Call); %delete the duplicate
end
data = synchronize(data, audio); %add data for the station (back?) into the big table
else % if we are on our first round:
data = audio;
end
Seth Furman
Seth Furman on 11 Dec 2020
Awesome. Glad you found a workable way to build up the intended timetable. Thank you for sharing your solution with us.
One quick thing about the code in your solution:
  • timetable(data.Time, data{:, Call}, 'VariableNames', Call) can be replaced with data(:,Call). This expression is creating a copy of data, but with only the variables in Call, which can be done more succinctly with table subscripting.

Sign in to comment.

More Answers (0)

Categories

Find more on Timetables in Help Center and File Exchange

Products


Release

R2020a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!