Closed loop with LSTM for time series

42 views (last 30 days)
massimo giannini
massimo giannini on 15 Aug 2024
Edited: Umar on 17 Aug 2024
Dear All,
I am having trouble performing a multi-step-ahead (closed-loop) forecast for a time series. I use MATLAB R2024a, and the "old" predictAndUpdateState command does not work with dlnetwork objects. I have gone through all the documentation I could find; in theory I can handle the problem, but in practice I cannot. To sum up the exercise: a single univariate time series (a stock price) in an LSTM net. There is no problem with trainnet and predict using the historical training and test data. Now I would like to forecast beyond the test data. This is the code:
X = (XTest);
T = YTest;
%net2 is a dlnetwork object
offset = length(X);
[Z,state] = predict(net2,X(1:offset)); %predict and update state using test data
net2.State = state;
% five-steps ahead
numPredictionTimeSteps = 5;
Y = (zeros(numPredictionTimeSteps));
Y(1,:) = Z(end); %use the last forecast as starting point for the loop
for t = 2:numPredictionTimeSteps
[Y(:,t),state] = predict(net2,Y(:,t-1));
net2.State = state;
end
%I got:
Error using extractState (line 41)
If the hidden state argument is a matrix, then its number of observations must match the number of observations of the
input data.
I found a similar question on the web for a GRU net. If I reset the network before the loop (as suggested there) the code runs, but the forecast is very poor, so I wonder whether the reset itself causes that result. Is there an alternative to resetting? I attach the data and the net.
Thanks in advance

Answers (1)

Umar
Umar on 15 Aug 2024

Hi @massimo giannini,

In LSTM networks, keeping the state dimensions consistent is crucial when making successive predictions. predict expects the input shape to match what the network was trained on. Instead of resetting the network at each prediction step, feed it inputs with consistent dimensions and carry the state forward: initialize from your last prediction and update the state iteratively. Here is an adjusted version of your loop that maintains state consistency without resetting:

   X = (XTest);
   T = YTest;
   offset = length(X);
   [Z,state] = predict(net2,X(1:offset)); % Initial prediction
   net2.State = state; 
   % Prepare for multi-step ahead forecasting
   numPredictionTimeSteps = 5;
   % Size Y from Z: one row per output channel
   Y = zeros(size(Z, 1), numPredictionTimeSteps);
   Y(:, 1) = Z(:, end); % Use the last forecast as the starting point
   for t = 2:numPredictionTimeSteps
       % Predict using previous output
       [Y(:, t), state] = predict(net2, Y(:, t-1)); 
       net2.State = state; % Update the network state
   end

Also, if your LSTM expects a specific input size or format (e.g., a column vector vs. a matrix), ensure that `Y(:, t-1)` matches those expectations. Based on your code, you may also want to explore other forecasting strategies, such as ensemble methods or combining the LSTM output with other models, to make the predictions more robust. Hope this helps. Please let me know if you have any further questions.
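As a sketch of what "matching the expected format" can look like: if net2 was trained on dlarray sequences with a "CT" (channel-by-time) layout, then the prediction fed back into the loop must carry the same format labels, or the state and input dimensions can disagree. The names below (numChannels, the "CT" format) are assumptions about how the network was trained, not taken from the attached net:

```matlab
% Sketch, assuming net2 was trained on "CT"-formatted dlarray sequences.
numChannels = 1;                          % univariate series (assumption)
numPredictionTimeSteps = 5;
Y = zeros(numChannels, numPredictionTimeSteps);
Y(:, 1) = extractdata(Z(:, end));         % seed with the last in-sample forecast
for t = 2:numPredictionTimeSteps
    % Wrap the previous prediction as a 1-channel, 1-time-step dlarray
    x = dlarray(Y(:, t-1), "CT");
    [yNext, state] = predict(net2, x);
    net2.State = state;                   % carry the recurrent state forward
    Y(:, t) = extractdata(yNext);
end
```

This way each closed-loop step presents the network with exactly one observation and one time step, consistent with the single-observation state being carried forward.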

  5 Comments
massimo giannini
massimo giannini on 16 Aug 2024
Hi Umar, many thanks! I am an "old" econometrician now studying deep learning. My dissertation was on ARIMA and GARCH (35 years ago), but now I would like to investigate "new" methods. As I said, I am moving from R to MATLAB, but I found the MathWorks explanations not very useful; there are plenty of examples, but the technical details are missing. For example: with OutputMode="sequence" in the LSTM layer I was able to obtain a single prediction at each out-of-sample step, but with "last" (as in the code I sent you) I obtain a vector, following your help. This is not clear to me. Since I want a sequence-to-one net, "last" should be the right choice.
Do you have a good technical textbook to suggest?
Many thanks
Umar
Umar on 17 Aug 2024
Edited: Umar on 17 Aug 2024

Hi @massimo giannini,

Regarding your question about the OutputMode option in LSTM configurations, let me clarify. When you set OutputMode="sequence", the LSTM layer processes the entire input sequence and returns a prediction for each time step. This is useful for tasks where you need a prediction at every step, such as time-series forecasting where you want to track how the predictions evolve over time. Conversely, with OutputMode="last" the layer returns only the output corresponding to the last time step of the input sequence. This is what you want for sequence-to-one tasks, where a single output is produced for a whole input sequence.
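To make the distinction concrete, here is a minimal sketch of the two layer configurations (the hidden-unit count and single input/output channel are illustrative choices, not taken from your network):

```matlab
% Minimal sketch of the two OutputMode settings (sizes are illustrative).
numHiddenUnits = 128;

% Sequence-to-sequence: the LSTM emits one output per input time step
layersSeq = [sequenceInputLayer(1)
             lstmLayer(numHiddenUnits, OutputMode="sequence")
             fullyConnectedLayer(1)];

% Sequence-to-one: the LSTM emits a single output for the whole sequence
layersLast = [sequenceInputLayer(1)
              lstmLayer(numHiddenUnits, OutputMode="last")
              fullyConnectedLayer(1)];
```

With "sequence" the prediction has as many time steps as the input; with "last" it collapses to a single time step, which is why the shapes you observed differ between the two modes.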

Now, let’s focus on your comment: "As I want a sequence-to-one net, 'last' should be the right choice."

Let me break down the code snippet I provided to clarify how to implement a sequence-to-one network using the "last" mode.

X = (XTest); % Input test data
T = YTest;   % Target test data
offset = length(X); % Determine the length of the input data
% Initial prediction using the entire input sequence
[Z,state] = predict(net2,X(1:offset)); % Predict and update state
net2.State = state; % Update the network state for future predictions
% Prepare for multi-step ahead forecasting
numPredictionTimeSteps = 5; % Number of future time steps to predict
Y = zeros(size(Z, 1), numPredictionTimeSteps); % Initialize output matrix
Y(:, 1) = Z(:, end); % Use the last forecast as the starting point
for t = 2:numPredictionTimeSteps
    % Predict the next time step using the last output
    [Y(:, t), state] = predict(net2, Y(:, t-1)); 
    net2.State = state; % Update the network state
end

So, in the code, the input data X and target data T are defined first. The offset variable captures the length of the input data, which determines how much data to feed into the network for the initial prediction. The first prediction is then made on the entire input sequence; the output Z contains the predictions, and the network state is updated accordingly. The output matrix Y is initialized to store the predictions for the specified number of future time steps, and its first column is set to the last prediction from Z, which serves as the starting point. The loop then predicts the future time steps: each iteration uses the previous predicted value as the input for the next prediction, chaining the predictions together, and the network state is updated after each step to maintain continuity.

I can well understand that your transition from traditional econometric models such as ARIMA and GARCH to deep-learning techniques such as LSTM is challenging, especially when also moving between programming environments like R and MATLAB.

Finally, addressing your question, "Do you have a good technical textbook to suggest?":

For a deeper understanding of LSTM and other deep learning techniques in MATLAB, I recommend the following textbooks:

* Introduction to Machine Learning with Python: A Guide for Data Scientists, by Andreas C. Müller and Sarah Guido

* LSTM Networks: Exploring the Evolution and Impact of Long Short-Term Memory Networks in Machine Learning (Kindle Edition), by Henri van Maarseveen

* Deep Learning: Recurrent Neural Networks in Python: LSTM, GRU, and more RNN machine learning architectures in Python and Theano (Machine Learning in Python) (Kindle Edition), by LazyProgrammer

I hope I have answered all your questions.


Release

R2024a
