Deep Learning Toolbox - LSTM Training

Question

PB75 on 2 Sep 2022

https://se.mathworks.com/matlabcentral/answers/1793525-deep-learning-toolbox-lstm-training

Edited: PB75 on 7 Sep 2022

Hi All,

I am building an LSTM ROM to integrate into my Simscape model, which is using training and test data captured in ANSYS Chemkin software.

I have had multiple issues with pre-processing the data, which I am working through. The main reason is that ANSYS only uses a variable-step solver.

My related posts: Data Structure for LSTM, Resampling Arrays

I am basing my code on the LSTM ROM example. To validate the LSTM model before integrating it into the full Simscape model, I plan to use a simple model built around the "Stateful Predict" block, and to compare its output against the test and training data captured in the workspace, using "From Workspace" blocks to bring the signals into Simulink.

I have been able to run the code, and the training progress plot suggests that training has minimised the RMSE and loss. However, when I run my simple validation model, the results do not match the test data; see below.

I am unsure how to proceed with debugging: the main cause could be the pre-processed data (which may need resampling), the LSTM network and parameter selection, or the validation model.

My question is: can I now use the Deep Network Designer app to help with debugging? With the code as it stands, the concatenated data is in a 10x1 cell. Can I save this as a datastore and use the app to train the network?

I can show the training progress, data structure and validation output below.

Training Progress:

Validation Model:

Validation Results: Test data top row, LSTM data bottom row.

Any suggestions on how to proceed would be great, as I am stuck.

Thanks in advance,

Patrick

Answer 1

Arkadiy Turevskiy on 2 Sep 2022

https://se.mathworks.com/matlabcentral/answers/1793525-deep-learning-toolbox-lstm-training#answer_1040785

Hi Patrick,

Two suggestions:

1. Try to test your LSTM network in MATLAB first. Does it match the validation data? If it does, then the issue is with the Simulink model.
2. If your validation data in Simulink does not start at time 0, you need to reset the state of the LSTM in the Stateful Predict block by putting this block into a resettable subsystem and triggering the reset before your data starts. I.e., if your data starts at t = 0.2, then apply a step from 0 to 1 at 0.1 sec and feed that into the reset port of the resettable subsystem to reset the state.
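
The two suggestions above can be sketched in plain MATLAB. This is only an illustration under stated assumptions: the matrices `T` and `Y` are placeholder data standing in for one target sequence and one predicted sequence (for example `TTest{i}` and `YPred{i}`), the sample time `Ts` is made up, and the timing (reset at 0.1 s) is taken from the example in the text.

```matlab
% --- Suggestion 1: per-signal RMSE between prediction and target ---
% Placeholder data: 4 signals x 100 time steps (replace with TTest{i} and YPred{i}).
T = [linspace(300,900,100); linspace(1,50,100); ...
     linspace(0,0.1,100);   linspace(0,1e-3,100)];
Y = T + [5; 0.5; 0.01; 1e-4];          % hypothetical prediction: constant offset per signal

rmsePerSignal = sqrt(mean((Y - T).^2, 2));                 % 4x1, one RMSE per row (signal)
relRmse = rmsePerSignal ./ (max(T,[],2) - min(T,[],2));    % RMSE relative to each signal's range

% --- Suggestion 2: reset step for the resettable subsystem ---
Ts = 0.01;                             % assumed sample time [s]
t = (0:100)' * Ts;                     % time column, 0 to 1 s
resetValue = double(t >= 0.1 - Ts/2);  % step from 0 to 1 at 0.1 s (half-step tolerance)
resetSignal = [t resetValue];          % [time, value] layout for a From Workspace block
```

Comparing `relRmse` rather than the raw RMSE makes signals with very different units easier to judge side by side.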

Arkadiy

3 Comments

PB75 on 5 Sep 2022

Edited: PB75 on 6 Sep 2022

Hi Arkadiy,

Thanks for taking the time to respond, and for your suggestions. I have been working through them: I have made some progress on the first point (testing the model in MATLAB), but none yet on the second point (testing the model in Simulink).

1). I have been able to use the predict function. The first output state (signal 1) is predicted well, but the accuracy drops on the remaining 3 signals. This may be because the 4 signals have different units (temperature [K], pressure [bar], mole fraction [-], mole fraction [-]), so the signal ranges vary widely. I have attached a screen grab of the results below, together with the predict function code and plotting. My question is: do I now have to normalise the signals in XTrain and TTrain (or dataTrain) to improve the network results?

%Prepare the test data for prediction using the same steps as for the training data.
numObservationsTest = numel(dataTest);
for i = 1:numObservationsTest
    XTest{i} = dataTest{i}(inputStatesTrain,1:end-1);
    TTest{i} = dataTest{i}(outputStatesTrain,2:end);
end
%Make predictions on the test data using predict. To prevent the function from adding padding to the data, specify the mini-batch size 1.
YPred = predict(net,XTest,'MiniBatchSize',1);
%Extract Signals from XTest for plotting
for i = 1:1
    volume_test = XTest{i}(1,:);
    temperature_test = XTest{i}(2,:);
    pressure_test = XTest{i}(3,:);
    CO2_test = XTest{i}(4,:);
    NO_test = XTest{i}(5,:);
end
%Extract Signals from YPred for plotting
for i = 1:1
    temperature_pred = YPred{i}(1,:);
    pressure_pred = YPred{i}(2,:);
    CO2_pred = YPred{i}(3,:);
    NO_pred = YPred{i}(4,:);
end
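
One common way to handle signals with very different ranges is z-score normalisation using statistics computed from the training data only. The sketch below is an assumption-laden illustration, not the method from the LSTM ROM example: `XTrainDemo` is random placeholder data standing in for a cell array such as XTrain, and `toNorm` / `fromNorm` are hypothetical helper names. Inputs and targets with different rows would each need their own `mu` and `sigma`.

```matlab
% Placeholder training data: 3 sequences, each 4 signals x N time steps.
rng(0);
scale = [500; 20; 0.05; 1e-3];                 % made-up, widely different signal ranges
XTrainDemo = {scale .* rand(4,50), scale .* rand(4,80), scale .* rand(4,65)};

% Statistics per signal (row), computed over all training time steps.
allSteps = [XTrainDemo{:}];                    % 4 x (50+80+65)
mu = mean(allSteps, 2);
sigma = std(allSteps, 0, 2);

% Apply the same training statistics to every sequence.
toNorm = @(x) (x - mu) ./ sigma;
XTrainNorm = cellfun(toNorm, XTrainDemo, 'UniformOutput', false);

% Map normalised values (e.g. predictions) back to physical units.
fromNorm = @(y) y .* sigma + mu;
XBack = cellfun(fromNorm, XTrainNorm, 'UniformOutput', false);
```

The test data would be normalised with the same `mu` and `sigma`, and the same scaling would have to be applied around the network wherever it is used for prediction.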

2). I have tried to post-process the XTest data to bring the individual signals into Simulink. I am unsure how to trigger the Stateful Predict block as suggested, as all the XTest signals start at t = 0. I am also unsure what the appropriate method is for pulling the XTest signals into the model, other than using a From Workspace block with a two-column [time, value] array for each signal (which also means creating a time vector for the arrays); this seems a bit rough. I have included a modified image of the model showing how I am pulling the XTest signals in, which is basically one two-column array per signal.

%Simulation from a XTest with Stateful Predict Block
%Create a time vector for the signal arrays for From Workspace blocks
numXTestSteps = length(temperature_test);         % Number of steps in the XTest data
stopTime = times(end);                            % ANSYS sim time in [ms] ('times' is assumed to be the ANSYS time vector in the workspace)
stopTime = stopTime/1000;                         % Convert ANSYS sim time to [s]
times_XTest = linspace(0,stopTime,numXTestSteps); % Time vector for the signal arrays in Simulink
%Create 2 x arrays for From Workspace Blocks
volume_test_Array       = [times_XTest' volume_test'];
temperature_test_Array  = [times_XTest' temperature_test'];
pressure_test_Array     = [times_XTest' pressure_test'];
CO2_test_Array          = [times_XTest' CO2_test'];
NO_test_Array           = [times_XTest' NO_test'];
%Find Initial Conditions for unit delay block in Model from XTest cell
temperature_init = temperature_test(1,1);
pressure_init = pressure_test(1,1);
CO2_init = CO2_test(1,1);
NO_init = NO_test(1,1);
XTest_init = [temperature_init pressure_init CO2_init NO_init];
XTest_init = XTest_init';
%Run Validation Model
mdl = 'Validation_Model';
open_system(mdl)
simout = sim(mdl);
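
As a possible alternative to one array per signal, a From Workspace block also accepts a single matrix whose first column is time and whose remaining columns are the signal values, which produces one vector signal. The sketch below assumes placeholder data (`XTestDemo` stands in for one 5 x N sequence such as `XTest{1}`, and the sample time is made up). Keeping the columns in the same order as the rows used in training avoids having to order the inputs by hand in a Mux.

```matlab
% Placeholder test sequence: 5 input signals x 101 time steps.
Ts = 1e-4;                                 % assumed sample time [s]
numSteps = 101;
XTestDemo = [linspace(1,2,numSteps);       % volume (made-up values)
             linspace(300,900,numSteps);   % temperature
             linspace(1,50,numSteps);      % pressure
             linspace(0,0.1,numSteps);     % CO2
             linspace(0,1e-3,numSteps)];   % NO

t = (0:numSteps-1)' * Ts;                  % time column [s]

% One matrix for a single From Workspace block:
% column 1 is time, columns 2..6 are the inputs in training row order.
inputMatrix = [t XTestDemo'];

% Initial conditions taken from the first time step (4 states, skipping volume).
XTest_init = XTestDemo(2:5, 1);
```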

Ben on 6 Sep 2022

I'll add a few notes to Arkadiy's answer here to help debug further. A variation on 1. would be to try running the Simulink model with your training data, since you have an expectation of how that should perform from the training plot.

It may be useful to log additional signals to the workspace immediately before the input and immediately after the output of the Stateful Predict block. This will help check exactly what data is passed to the LSTM and what the LSTM outputs, and you can compare this to how the network was trained in MATLAB.

A few possible issues come to mind:

Arkadiy's point 2. on the LSTM state: it will only be reset at time 0, or when the Stateful Predict block is inside a resettable subsystem and a reset signal is received.
Simulink is taking different time steps from your data. The Rate Transition block has a NoOp label, but if you are using input data with uniform timesteps and a variable-step solver in Simulink, then I would expect this to be something else, such as ZOH. You should be able to get this by setting the output port sample time of the Rate Transition block to match the uniform timesteps of your input data. Essentially, the issue is that Simulink's variable-step ODE solvers may take different timesteps from your input data, which will corrupt the LSTM state. A potential alternative would be to swap Simulink's ODE solver for a fixed-step ODE solver with a step size matching your input data, but this may not be realistic when the ROM is part of a larger system.
The validation data is simply too different from the training data and the model can't perform well on it. This will depend on the strategy used to split the validation data off from the initial dataset. You could check this by including the validation data during training: the plot will then contain the current loss on the validation data, which should roughly follow the training loss if the model is generalising well to the validation data.
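
The time-step issue in the second point can be checked numerically once the time vectors are in the workspace. The following is a minimal sketch with placeholder data: `tData` stands in for the time vector of the input data, `tSim` for the times logged just before the Stateful Predict block, and `isUniform` is a hypothetical helper, not a toolbox function.

```matlab
% Placeholder time vectors [s]; replace with the data time and the logged Simulink time.
Ts = 1e-4;                                        % assumed training sample time
tData = (0:999)' * Ts;                            % uniformly sampled data
tSim  = cumsum([0; Ts*(1 + 0.5*sin((1:999)'))]);  % made-up variable-step solver times

% True when every step matches Ts to within a small relative tolerance.
isUniform = @(t, Ts) all(abs(diff(t) - Ts) < 1e-6*Ts);

dataOk = isUniform(tData, Ts);
simOk  = isUniform(tSim, Ts);

fprintf('Data steps match Ts: %d, Simulink steps match Ts: %d\n', dataOk, simOk);
fprintf('Largest step deviation in Simulink times: %.3g s\n', max(abs(diff(tSim) - Ts)));
```

If the logged Simulink times fail this check, the LSTM is being stepped at a different rate from the one it was trained at.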

Hope that helps,

Ben

PB75 on 7 Sep 2022

Edited: PB75 on 7 Sep 2022

Hi Ben,

Thanks for the extra guidance. I have made a little progress with testing the network in a Simulink model; I show the Stateful Predict block in my main model and its output below. My error seems to have been the order of the inputs into the mux and the sequence input. I now get an output from the Stateful Predict block, but the results are poorer than those from the predict function.

Additionally, I was able to use the interp1 command to resample the raw ANSYS data onto a uniform time grid (see my post Resampling). So now all the data is at a fixed time step of 2.0E-05 sec prior to post-processing, and with a downsample factor of 5 this gives a step size of 1.0E-04 sec for the training and test data, which I think is reasonable.
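
The resampling described above can be sketched as follows. The raw data here is a made-up placeholder (`tRaw`, `yRaw` stand in for one variable-step ANSYS signal); only the fixed step of 2.0E-05 sec and the downsample factor of 5 come from the text.

```matlab
% Placeholder variable-step data: time in [s] and one signal.
rng(1);
tRaw = cumsum([0; 1e-5 + 4e-5*rand(499,1)]);   % strictly increasing, non-uniform steps
yRaw = sin(2*pi*50*tRaw);

dtFine = 2.0e-5;                               % fixed step from the text [s]
factor = 5;                                    % downsample factor from the text

% Uniform grid that stays inside the raw time range, so interp1 returns no NaN.
numFine = floor((tRaw(end) - tRaw(1)) / dtFine);
tFine = tRaw(1) + (0:numFine)' * dtFine;
yFine = interp1(tRaw, yRaw, tFine, 'linear');

% Keep every 5th sample, giving a 1.0e-4 s step.
tTrain = tFine(1:factor:end);
yTrain = yFine(1:factor:end);
```

Note that keeping every 5th sample applies no low-pass filtering, so any content above the new Nyquist frequency would alias; whether that matters depends on the signals.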

All the sample times in the triggered Stateful Predict subsystem mirror the sample time of 1.0E-04 sec used for training and testing. I am also using a fixed-step solver.

I can test the network with the predict function, as shown below. The output for the first output state is good, but the remaining 3 states are poor, which may mean I need to normalise all the data before preparing it. I have made a separate post for this (Normalising a cell); I hope someone can help with the errors in my code for finding the maximum value and normalising the cell containing the data. The fidelity of the predict function output and the Stateful Predict output seems vastly different, even with the same step size of 1.0E-04 sec.

Predict Function Ouput:

Simulink Model Output:

The top signal is the trigger signal, the second plot shows the model total volume and the triggered LSTM volume, and signals 3, 4, 5 and 6 are the network outputs.

Simulink Model - sample rates set to 1.0E-04 sec as per LSTM training, and a fixed-step solver (ode4)

Any suggestions on how to match the predict function output and state predict output would be great, and how to improve the network results in general.

Thanks again,

Patrick
