Finding out the missing dates and times from the time series data

65 views (last 30 days)
Hi to All
Please how to find the missing data (v) linke it with a time column or Julian time
Julian time time v
2455932.229 05/01/2012 17:30 2.095
2455932.271 05/01/2012 18:30 2.096
2455932.313 05/01/2012 19:30 2.098
The example is in the data file.
my appreciation to all.
  1 Comment
Walter Roberson
Walter Roberson on 26 Dec 2024 at 20:35
05/01/2012 17:30:00.000001 is missing
05/01/2012 17:30:00.000002 is missing
05/01/2012 17:30:00.00000201 is missing...
Your question is not well-defined.

Sign in to comment.

Accepted Answer

Star Strider
Star Strider on 26 Dec 2024 at 17:06
Edited: Star Strider on 26 Dec 2024 at 17:15
I am not certain what result you want.
One option is to read it in as (or convert it to) a timetable and then use the retime function —
T1 = readtable('data.txt', VariableNamingRule='preserve')
T1 = 150x3 table
Jday Datetime WL (m) __________ ________________ ______ 2.4559e+06 05/01/2012 17:30 2.095 2.4559e+06 05/01/2012 18:30 2.096 2.4559e+06 05/01/2012 19:30 2.098 2.4559e+06 05/01/2012 20:30 2.099 2.4559e+06 05/01/2012 21:30 2.1 2.4559e+06 05/01/2012 22:30 2.101 2.4559e+06 05/01/2012 23:30 2.102 2.4559e+06 06/01/2012 00:30 2.103 2.4559e+06 06/01/2012 01:30 2.103 2.4559e+06 06/01/2012 02:30 2.104 2.4559e+06 06/01/2012 03:30 2.105 2.4559e+06 06/01/2012 04:30 2.105 2.4559e+06 06/01/2012 05:30 2.105 2.4559e+06 06/01/2012 06:30 2.117 2.4559e+06 06/01/2012 07:30 2.116 2.4559e+06 06/01/2012 08:30 2.104
% DTJ = datetime(T1.Jday,ConvertFrom='juliandate') % Information
TT1 = table2timetable(T1);
newt = T1.Datetime(1) : hours(1) : T1.Datetime(end);
TT1r = retime(TT1, newt,'linear')
TT1r = 187x2 timetable
Datetime Jday WL (m) ________________ __________ ______ 05/01/2012 17:30 2.4559e+06 2.095 05/01/2012 18:30 2.4559e+06 2.096 05/01/2012 19:30 2.4559e+06 2.098 05/01/2012 20:30 2.4559e+06 2.099 05/01/2012 21:30 2.4559e+06 2.1 05/01/2012 22:30 2.4559e+06 2.101 05/01/2012 23:30 2.4559e+06 2.102 06/01/2012 00:30 2.4559e+06 2.103 06/01/2012 01:30 2.4559e+06 2.103 06/01/2012 02:30 2.4559e+06 2.104 06/01/2012 03:30 2.4559e+06 2.105 06/01/2012 04:30 2.4559e+06 2.105 06/01/2012 05:30 2.4559e+06 2.105 06/01/2012 06:30 2.4559e+06 2.117 06/01/2012 07:30 2.4559e+06 2.116 06/01/2012 08:30 2.4559e+06 2.104
MissingDatesTimes = setdiff(newt(:), T1.Datetime); % Miissing Dates & Times
disp(MissingDatesTimes)
10/01/2012 19:30 10/01/2012 20:30 10/01/2012 21:30 10/01/2012 22:30 10/01/2012 23:30 11/01/2012 00:30 11/01/2012 01:30 11/01/2012 02:30 11/01/2012 03:30 11/01/2012 04:30 11/01/2012 05:30 11/01/2012 06:30 11/01/2012 07:30 11/01/2012 08:30 11/01/2012 09:30 11/01/2012 10:30 11/01/2012 11:30 11/01/2012 12:30 11/01/2012 13:30 11/01/2012 14:30 11/01/2012 15:30 11/01/2012 16:30 11/01/2012 17:30 11/01/2012 18:30 12/01/2012 08:30 12/01/2012 09:30 12/01/2012 10:30 12/01/2012 11:30 12/01/2012 12:30 12/01/2012 13:30 12/01/2012 14:30 12/01/2012 15:30 12/01/2012 16:30 12/01/2012 17:30 13/01/2012 01:30 13/01/2012 02:30 13/01/2012 03:30
figure
tiledlayout(2,1)
nexttile
plot(T1.Datetime, T1.('WL (m)'), '.-b')
title('Original')
grid
nexttile
plot(TT1r.Datetime, TT1r.('WL (m)'), '.-r')
title('Interpolated')
grid
The ‘Jday’ times do not precisely match the ‘Datetime’ times, however they are close. I went with the ‘Datetime’ time here.
EDIT — (26 Dec 2024 at 17:15)
Added ‘MissingDatesTimes’ and corrected typographical errors.
.
  2 Comments
Nada
Nada on 27 Dec 2024 at 5:30
Big thanks Star Strider
How can I create a new table containing the missing data (time) and prediction value WL?
Star Strider
Star Strider on 27 Dec 2024 at 12:06
As always, my pleasure!
The setdiff function has a second output that returns the indices of the missing rows (in this instance). Use it to return the rows of the ‘TT1r’ table that corresponds to the missing (interpolated) data—
T1 = readtable('data.txt', VariableNamingRule='preserve')
T1 = 150x3 table
Jday Datetime WL (m) __________ ________________ ______ 2.4559e+06 05/01/2012 17:30 2.095 2.4559e+06 05/01/2012 18:30 2.096 2.4559e+06 05/01/2012 19:30 2.098 2.4559e+06 05/01/2012 20:30 2.099 2.4559e+06 05/01/2012 21:30 2.1 2.4559e+06 05/01/2012 22:30 2.101 2.4559e+06 05/01/2012 23:30 2.102 2.4559e+06 06/01/2012 00:30 2.103 2.4559e+06 06/01/2012 01:30 2.103 2.4559e+06 06/01/2012 02:30 2.104 2.4559e+06 06/01/2012 03:30 2.105 2.4559e+06 06/01/2012 04:30 2.105 2.4559e+06 06/01/2012 05:30 2.105 2.4559e+06 06/01/2012 06:30 2.117 2.4559e+06 06/01/2012 07:30 2.116 2.4559e+06 06/01/2012 08:30 2.104
% DTJ = datetime(T1.Jday,ConvertFrom='juliandate') % Information
TT1 = table2timetable(T1);
newt = T1.Datetime(1) : hours(1) : T1.Datetime(end);
TT1r = retime(TT1, newt,'linear')
TT1r = 187x2 timetable
Datetime Jday WL (m) ________________ __________ ______ 05/01/2012 17:30 2.4559e+06 2.095 05/01/2012 18:30 2.4559e+06 2.096 05/01/2012 19:30 2.4559e+06 2.098 05/01/2012 20:30 2.4559e+06 2.099 05/01/2012 21:30 2.4559e+06 2.1 05/01/2012 22:30 2.4559e+06 2.101 05/01/2012 23:30 2.4559e+06 2.102 06/01/2012 00:30 2.4559e+06 2.103 06/01/2012 01:30 2.4559e+06 2.103 06/01/2012 02:30 2.4559e+06 2.104 06/01/2012 03:30 2.4559e+06 2.105 06/01/2012 04:30 2.4559e+06 2.105 06/01/2012 05:30 2.4559e+06 2.105 06/01/2012 06:30 2.4559e+06 2.117 06/01/2012 07:30 2.4559e+06 2.116 06/01/2012 08:30 2.4559e+06 2.104
[MissingDatesTimes,ia] = setdiff(newt(:), T1.Datetime); % Miissing Dates & Times With Index
MissingDatesTimesWL = TT1r(ia,:);
disp(MissingDatesTimesWL)
Datetime Jday WL (m) ________________ __________ ______ 10/01/2012 19:30 2.4559e+06 2.0971 10/01/2012 20:30 2.4559e+06 2.0972 10/01/2012 21:30 2.4559e+06 2.0974 10/01/2012 22:30 2.4559e+06 2.0975 10/01/2012 23:30 2.4559e+06 2.0976 11/01/2012 00:30 2.4559e+06 2.0977 11/01/2012 01:30 2.4559e+06 2.0978 11/01/2012 02:30 2.4559e+06 2.098 11/01/2012 03:30 2.4559e+06 2.0981 11/01/2012 04:30 2.4559e+06 2.0982 11/01/2012 05:30 2.4559e+06 2.0983 11/01/2012 06:30 2.4559e+06 2.0984 11/01/2012 07:30 2.4559e+06 2.0986 11/01/2012 08:30 2.4559e+06 2.0987 11/01/2012 09:30 2.4559e+06 2.0988 11/01/2012 10:30 2.4559e+06 2.0989 11/01/2012 11:30 2.4559e+06 2.099 11/01/2012 12:30 2.4559e+06 2.0992 11/01/2012 13:30 2.4559e+06 2.0993 11/01/2012 14:30 2.4559e+06 2.0994 11/01/2012 15:30 2.4559e+06 2.0995 11/01/2012 16:30 2.4559e+06 2.0996 11/01/2012 17:30 2.4559e+06 2.0998 11/01/2012 18:30 2.4559e+06 2.0999 12/01/2012 08:30 2.4559e+06 2.1308 12/01/2012 09:30 2.4559e+06 2.1276 12/01/2012 10:30 2.4559e+06 2.1245 12/01/2012 11:30 2.4559e+06 2.1213 12/01/2012 12:30 2.4559e+06 2.1181 12/01/2012 13:30 2.4559e+06 2.1149 12/01/2012 14:30 2.4559e+06 2.1117 12/01/2012 15:30 2.4559e+06 2.1085 12/01/2012 16:30 2.4559e+06 2.1054 12/01/2012 17:30 2.4559e+06 2.1022 13/01/2012 01:30 2.4559e+06 2.106 13/01/2012 02:30 2.4559e+06 2.106 13/01/2012 03:30 2.4559e+06 2.106
figure
tiledlayout(3,1)
nexttile
plot(T1.Datetime, T1.('WL (m)'), '.-b')
title('Original')
grid
ylim('padded')
nexttile
plot(TT1r.Datetime, TT1r.('WL (m)'), '.-r')
title('Original With Interpolated')
grid
ylim('padded')
Ax2 = gca;
nexttile
plot(MissingDatesTimesWL.Datetime, MissingDatesTimesWL.('WL (m)'), '.-g')
title('Only Interpolated')
grid
Ax3 = gca;
Ax3.XLim = Ax2.XLim;
Ax3.YLim = Ax2.YLim;
The ‘missing’ (interpolated) dates, timies, and ‘Water Level’ (?) are now in the ‘MissingDatesTimesWL’ table. Note that I used the ‘linear’ interpolation method. Others are available, see the retime documentatiion I linked to above for details.
.

Sign in to comment.

More Answers (2)

dpb
dpb on 26 Dec 2024 at 16:44
Moved: dpb on 26 Dec 2024 at 16:44
tD=readtable('data.txt'); % read the file
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
head(tD,15) % see what it contains
Jday Datetime WL_m_ __________ ________________ _____ 2.4559e+06 05/01/2012 17:30 2.095 2.4559e+06 05/01/2012 18:30 2.096 2.4559e+06 05/01/2012 19:30 2.098 2.4559e+06 05/01/2012 20:30 2.099 2.4559e+06 05/01/2012 21:30 2.1 2.4559e+06 05/01/2012 22:30 2.101 2.4559e+06 05/01/2012 23:30 2.102 2.4559e+06 06/01/2012 00:30 2.103 2.4559e+06 06/01/2012 01:30 2.103 2.4559e+06 06/01/2012 02:30 2.104 2.4559e+06 06/01/2012 03:30 2.105 2.4559e+06 06/01/2012 04:30 2.105 2.4559e+06 06/01/2012 05:30 2.105 2.4559e+06 06/01/2012 06:30 2.117 2.4559e+06 06/01/2012 07:30 2.116
tD.Properties.VariableNames(2:end)={'Date','WL'}; % make convenient variable name
nnz(~isfinite(tD.WL)) % how many are missing?
ans = 0
None by that definition of missing...so what is the Q? about? Maybe the times aren't uniform???
histogram(diff(tD.Date))
Ah, there are some big gaps it appears...
ix=find(diff(tD.Date)>hours(1))
ix = 3×1
122 135 142
<mw-icon class=""></mw-icon>
<mw-icon class=""></mw-icon>
for i=1:numel(ix)
ix1=ix(i)-1; ix2=ix(i)+1; % the indice before/after the breaks
tD(ix1:ix2,:) % show the range
diff(tD.Date(ix(i):ix2)) % and the difference found
end
ans = 3x3 table
Jday Date WL __________ ________________ _____ 2.4559e+06 10/01/2012 17:30 2.095 2.4559e+06 10/01/2012 18:30 2.097 2.4559e+06 11/01/2012 19:30 2.1
ans = duration
25:00:00
ans = 3x3 table
Jday Date WL __________ ________________ _____ 2.4559e+06 12/01/2012 06:30 2.133 2.4559e+06 12/01/2012 07:30 2.134 2.4559e+06 12/01/2012 18:30 2.099
ans = duration
11:00:00
ans = 3x3 table
Jday Date WL __________ ________________ _____ 2.4559e+06 12/01/2012 23:30 2.105 2.4559e+06 13/01/2012 00:30 2.106 2.4559e+06 13/01/2012 04:30 2.106
ans = duration
04:00:00
So, the Q? is, now what do you want to do with these? (But the above answers the Q? asked, it appears)

Shantanu Dixit
Shantanu Dixit on 26 Dec 2024 at 16:57
Edited: Shantanu Dixit on 26 Dec 2024 at 17:11
Hi Nada,
To identify and interpolate missing data in your time series (the current shared data seem to have no missing values), you can use the 'isnan' function to find the missing values and then use 'interp1' for interpolation. Since your data is periodic, you can use the 'spline' interpolation method.
You can refer to the following approach in MATLAB:
% Indices of missing values in v
missingIdx = find(isnan(v));
% Interpolate the missing values in v based on time
% time to be in datetime format
vInterp = interp1(time(~isnan(v)), v(~isnan(v)), time, 'spline');
For more information you can refer to the following MathWorks documentation for interpolation:
Hope this helps!

Categories

Find more on Dates and Time in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!