Run Sequence Forecasting on FPGA by Using Deep Learning HDL Toolbox

This example shows how to create, compile, and deploy a long short-term memory (LSTM) network trained on waveform data by using the Deep Learning HDL Toolbox™ Support Package for Xilinx FPGA and SoC. Use the deployed network to predict future values by using open-loop and closed-loop forecasting. Use MATLAB® to retrieve the prediction results from the target device.

Waveform Data Network

The network attached to this example was trained by using the Time Series Forecasting Using Deep Learning example. This example uses the WaveformData.mat data set, which contains 2000 synthetically generated waveforms of varying lengths with three channels. This example uses a trained LSTM network to forecast future values of the waveforms, given the values from the previous time steps, by using both closed-loop and open-loop forecasting.

Prerequisites

  • Xilinx® Zynq® Ultrascale+™ ZCU102 SoC development kit

  • Deep Learning HDL Toolbox™ Support Package for Xilinx FPGA and SoC

  • Deep Learning Toolbox™

  • Deep Learning HDL Toolbox™

Load the Pretrained Network

To load the LSTM network, enter:

load WaveformForcastingNet

Use the analyzeNetwork function to obtain information about the network layers. The function returns a graphical representation of the network that contains detailed parameter information for every layer in the network.

analyzeNetwork(net)

Define FPGA Board Interface

Define the target FPGA board programming interface by using the dlhdl.Target object. Specify that the interface is for a Xilinx board with an Ethernet interface.

To create the target object, enter:

hTarget = dlhdl.Target('Xilinx','Interface','Ethernet');

To use the JTAG interface, install Xilinx™ Vivado™ Design Suite 2022.1. To set the Xilinx Vivado toolpath, enter:

hdlsetuptoolpath('ToolName', 'Xilinx Vivado', 'ToolPath', 'C:\Xilinx\Vivado\2022.1\bin\vivado.bat');
hTarget = dlhdl.Target('Xilinx','Interface','JTAG');

Prepare Network for Deployment

Prepare the network for deployment by creating a dlhdl.Workflow object. Specify the network and the bitstream name. Ensure that the bitstream name matches the data type and the FPGA board. In this example, the target FPGA board is the Xilinx ZCU102 SoC board. The bitstream uses the single data type.

hW = dlhdl.Workflow('Network', net, 'Bitstream', 'zcu102_lstm_single','Target',hTarget);

To run the example on the Xilinx ZC706 board, enter:

hW = dlhdl.Workflow('Network', net, 'Bitstream', 'zc706_lstm_single','Target',hTarget);

Compile the LSTM Network

Run the compile method of the dlhdl.Workflow object to compile the network and generate the instructions, weights, and biases for deployment. The total number of frames exceeds the default value of 30. Set the InputFrameNumberLimit name-value argument to 1000 to run predictions in chunks of 1000 frames to prevent timeouts.

dn = compile(hW,'InputFrameNumberLimit',1000)
### Compiling network for Deep Learning FPGA prototyping ...
### Targeting FPGA bitstream zcu102_lstm_single.
### The network includes the following layers:
     1   'sequenceinput'      Sequence Input      Sequence input with 3 dimensions             (SW Layer)
     2   'lstm'               LSTM                LSTM with 128 hidden units                   (HW Layer)
     3   'fc'                 Fully Connected     3 fully connected layer                      (HW Layer)
     4   'regressionoutput'   Regression Output   mean-squared-error with response 'Response'  (SW Layer)
                                                                                             
### Notice: The layer 'sequenceinput' with type 'nnet.cnn.layer.ImageInputLayer' is implemented in software.
### Notice: The layer 'regressionoutput' with type 'nnet.cnn.layer.RegressionOutputLayer' is implemented in software.
### Compiling layer group: lstm.wi ...
### Compiling layer group: lstm.wi ... complete.
### Compiling layer group: lstm.wo ...
### Compiling layer group: lstm.wo ... complete.
### Compiling layer group: lstm.wg ...
### Compiling layer group: lstm.wg ... complete.
### Compiling layer group: lstm.wf ...
### Compiling layer group: lstm.wf ... complete.
### Compiling layer group: fc ...
### Compiling layer group: fc ... complete.

### Allocating external memory buffers:

          offset_name          offset_address    allocated_space 
    _______________________    ______________    ________________

    "InputDataOffset"           "0x00000000"     "4.0 MB"        
    "OutputResultOffset"        "0x00400000"     "4.0 MB"        
    "SchedulerDataOffset"       "0x00800000"     "4.0 MB"        
    "SystemBufferOffset"        "0x00c00000"     "20.0 MB"       
    "InstructionDataOffset"     "0x02000000"     "4.0 MB"        
    "FCWeightDataOffset"        "0x02400000"     "4.0 MB"        
    "EndOffset"                 "0x02800000"     "Total: 40.0 MB"

### Network compilation complete.
dn = struct with fields:
             weights: [1×1 struct]
        instructions: [1×1 struct]
           registers: [1×1 struct]
    syncInstructions: [1×1 struct]
        constantData: {}
             ddrInfo: [1×1 struct]
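The external memory table in the compile output can be cross-checked with a little arithmetic: each buffer ends where the next one begins, so the differences between consecutive offset addresses give the allocated sizes. The following is a minimal sketch that only reuses the addresses printed above; the variable names (offsetNames, offsetHex, addresses, sizesMB, totalMB) are chosen for this illustration and are not part of the toolbox.

```matlab
% Offset addresses copied from the compile output (hexadecimal, 0x prefix removed).
offsetNames = {'InputDataOffset','OutputResultOffset','SchedulerDataOffset', ...
               'SystemBufferOffset','InstructionDataOffset','FCWeightDataOffset'};
offsetHex = {'00000000','00400000','00800000','00c00000', ...
             '02000000','02400000','02800000'};   % last entry is EndOffset

addresses = hex2dec(offsetHex);      % byte addresses as a column vector
sizesMB = diff(addresses) / 2^20;    % each buffer ends where the next one starts
totalMB = addresses(end) / 2^20;     % EndOffset marks the end of the whole region

% Print the reconstructed sizes next to the buffer names.
for k = 1:numel(offsetNames)
    fprintf('%-22s %5.1f MB\n', offsetNames{k}, sizesMB(k));
end
fprintf('%-22s %5.1f MB\n', 'Total', totalMB);
```

The computed sizes should agree with the allocated_space column, including the 20.0 MB system buffer and the 40.0 MB total.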

Program Bitstream onto FPGA and Download Network Weights

To deploy the network on the Xilinx ZCU102 SoC hardware, run the deploy function of the dlhdl.Workflow object. This function uses the output of the compile function to program the FPGA board by using the programming file. It also downloads the network weights and biases. The deploy function starts programming the FPGA device and displays progress messages and the time required to deploy the network.

deploy(hW)
### FPGA bitstream programming has been skipped as the same bitstream is already loaded on the target FPGA.
### Deep learning network programming has been skipped as the same network is already loaded on the target FPGA.

Test Network

Prepare the test data for prediction. Normalize the test data by using the statistics calculated from the training data. To forecast the values of future time steps of a sequence, specify the targets as the test sequences with values shifted by one time step. In other words, at each time step of the input sequence, the LSTM network learns to predict the value of the next time step. The predictors are the test sequences without the final time step.

load WaveformData
numChannels = size(data{1},1);
numObservations = numel(data);

idxTrain = 1:floor(0.9*numObservations);
idxTest = floor(0.9*numObservations)+1:numObservations;
dataTrain = data(idxTrain);
dataTest = data(idxTest);

for n = 1:numel(dataTrain)
    X = dataTrain{n};
    XTrain{n} = X(:,1:end-1);
    TTrain{n} = X(:,2:end);
end

muX = mean(cat(2,XTrain{:}),2);
sigmaX = std(cat(2,XTrain{:}),0,2);
muT = mean(cat(2,TTrain{:}),2);
sigmaT = std(cat(2,TTrain{:}),0,2);

for n = 1:numel(dataTest)
    X = dataTest{n};
    XTest{n} = (X(:,1:end-1) - muX) ./ sigmaX;
    TTest{n} = (X(:,2:end) - muT) ./ sigmaT;
end
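The shift-by-one construction and the per-channel normalization can be seen on a tiny synthetic sequence. This is a minimal sketch under stated assumptions: the values in X are made up, and the statistics are taken from the sequence itself for brevity, whereas the example above applies statistics computed on the training set to the test data.

```matlab
% Synthetic 2-channel sequence with 5 time steps (illustrative values only).
X = [1 2 3 4 5; 10 20 30 40 50];

predictors = X(:,1:end-1);   % time steps 1 to 4
targets    = X(:,2:end);     % time steps 2 to 5, the next value for each predictor

% Per-channel statistics, computed along the time dimension.
mu    = mean(predictors,2);
sigma = std(predictors,0,2);

% Implicit expansion applies the 2-by-1 statistics to every time step.
predictorsNorm = (predictors - mu) ./ sigma;

% After normalization, each channel has approximately zero mean and unit
% standard deviation.
disp(mean(predictorsNorm,2))
disp(std(predictorsNorm,0,2))
```

Column t of targets holds the value that follows column t of predictors, which is the pairing the network is trained to learn.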

Make predictions using the test data.

YTest = hW.predict(XTest{1},Profile='on');
### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 115.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33244                  0.00015                     115            3839434           6589.5
    memSeparator_0              88                  0.00000 
    lstm.wi                   7628                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7509                  0.00003 
    lstm.wf                   7599                  0.00003 
    lstm.sigmoid_1             241                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       314                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         476                  0.00000 
 * The clock frequency of the DL processor is: 220MHz
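The profiler columns are related by the clock frequency, which makes the table easy to sanity-check. The sketch below reuses only the numbers printed in the Network row above and assumes that Total Latency is reported in clock cycles; the variable names are chosen for this illustration.

```matlab
% Values copied from the Network row of the profiler output.
clockHz         = 220e6;     % DL processor clock frequency
lastFrameCycles = 33244;     % LastFrameLatency(cycles)
totalCycles     = 3839434;   % Total Latency (assumed to be in cycles)
numFrames       = 115;       % FramesNum

% Convert cycle counts to seconds by dividing by the clock frequency.
lastFrameSeconds = lastFrameCycles / clockHz;
totalSeconds     = totalCycles / clockHz;

% Throughput is the number of frames divided by the total run time.
framesPerSecond = numFrames / totalSeconds;

fprintf('Last frame latency: %.5f s\n', lastFrameSeconds);
fprintf('Throughput:         %.1f frames/s\n', framesPerSecond);
```

Both results should match the LastFrameLatency(seconds) and Frames/s columns to the precision shown in the table.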

To evaluate the accuracy, calculate the root mean squared error (RMSE) between the predictions and the targets. Because the predictions were made only for the first test sequence, compare YTest with TTest{1}.

for i = 1:size(YTest,1)
    rmse(i) = sqrt(mean((YTest(i) - TTest{1}(i)).^2,"all"));
end
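As a point of comparison, an RMSE that covers every time step of a channel can be computed by averaging the squared errors along the time dimension. The following is a minimal sketch on made-up 2-channel data; Y, T, rmsePerChannel, and rmseOverall are hypothetical names for this illustration and the values are not results from the FPGA.

```matlab
% Hypothetical predictions and targets: 2 channels by 3 time steps.
Y = [1 2 3; 4 5 6];
T = [1 2 5; 4 7 6];

squaredError = (Y - T).^2;

% One RMSE per channel: average over the time dimension (dimension 2).
rmsePerChannel = sqrt(mean(squaredError,2));

% A single RMSE over all channels and time steps.
rmseOverall = sqrt(mean(squaredError,"all"));

disp(rmsePerChannel)
disp(rmseOverall)
```

Indexing with (i,:) rather than (i) inside a loop gives the same per-channel values, because (i,:) selects every time step of channel i.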

Visualize the errors in a histogram. Lower values indicate greater accuracy.

figure
histogram(rmse)
xlabel("RMSE")
ylabel("Frequency")

Calculate the mean of the RMSE values.

mean(rmse)
ans = single
    0.8385

Forecast Future Time Steps

To forecast the values of multiple future time steps from an input time series or sequence, use the predictAndUpdateState function. This function predicts time steps one at a time and updates the network state at each prediction. For each prediction, use the previous prediction as the input to the function.

Visualize one of the test sequences in a plot.

idx = 2;
X = XTest{idx};
T = TTest{idx};

figure
stackedplot(X',DisplayLabels="Channel " + (1:numChannels))
xlabel("Time Step")
title("Test Observation " + idx)

Open-Loop Forecasting

Open-loop forecasting predicts the next time step in a sequence by using only the input data. When making predictions for subsequent time steps, you collect the true values from your data source and use those as input. For example, suppose that you want to predict the value for time step t of a sequence by using data collected in time steps 1 through t-1. To make predictions for time step t+1, wait until you record the true value for time step t and use that value as input to make the next prediction. Use open-loop forecasting when you have true values to provide to the network before making the next prediction.
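The difference between feeding true values and feeding previous predictions can be illustrated without any hardware. The sketch below is a toy under loud assumptions: a one-state exponential smoother stands in for the deployed LSTM and for predictAndUpdateState, and the names alpha, h, hWarm, yOpen, and yClosed are invented for this illustration.

```matlab
% Synthetic single-channel signal (illustrative only).
x = sin(0.2*(1:100));
offset = 75;
alpha = 0.5;                 % smoothing factor of the toy stand-in model

% Warm up the state on the first 75 time steps, as in the reset-and-update step.
h = 0;
for t = 1:offset
    h = alpha*h + (1-alpha)*x(t);
end
hWarm = h;                   % state, and toy prediction, after the warm-up

numSteps = numel(x) - offset;

% Open loop: every step receives the TRUE value from the data source.
yOpen = zeros(1,numSteps);
h = hWarm;
for t = 1:numSteps
    h = alpha*h + (1-alpha)*x(offset+t);
    yOpen(t) = h;
end

% Closed loop: every step receives the PREVIOUS PREDICTION instead.
yClosed = zeros(1,numSteps);
h = hWarm;
yPrev = hWarm;
for t = 1:numSteps
    h = alpha*h + (1-alpha)*yPrev;
    yClosed(t) = h;
    yPrev = h;
end
```

In this toy, the closed-loop output stays at its warm-up value because the smoother has no dynamics of its own, while the open-loop output keeps tracking the signal. A trained LSTM behaves differently in detail, but the data flow (true inputs versus fed-back predictions) is the same.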

Initialize the network state by resetting the state using the resetState function, then make an initial prediction using the first few time steps of the input data. Update the network state by using the first 75 time steps of the input data.

resetState(hW)
offset = 75;
[~,~] = hW.predictAndUpdateState(X(:,1:offset),Profile='on'); 
### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 75.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33075                  0.00015                      75            2502871           6592.4
    memSeparator_0              88                  0.00000 
    lstm.wi                   7578                  0.00003 
    lstm.wo                   7459                  0.00003 
    lstm.wg                   7539                  0.00003 
    lstm.wf                   7649                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       334                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz

To forecast further predictions, loop over time steps and update the network state by using the predictAndUpdateState function. Forecast values for the remaining time steps of the test observation by looping over the time steps of the input data and using them as input to the network. The first prediction is the value that corresponds to the time step offset + 1.

numTimeSteps = size(X,2);
numPredictionTimeSteps = numTimeSteps - offset;
Y = zeros(numChannels,numPredictionTimeSteps);

for t = 1:numPredictionTimeSteps
    Xt = X(:,offset+t);
    Y(:,t) = predictAndUpdateState(hW,Xt,Profile='on');
end
### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33259                  0.00015                       1              34747           6331.5
    memSeparator_0              91                  0.00000 
    lstm.wi                   7580                  0.00003 
    lstm.wo                   7619                  0.00003 
    lstm.wg                   7509                  0.00003 
    lstm.wf                   7639                  0.00003 
    lstm.sigmoid_1             221                  0.00000 
    lstm.sigmoid_3             214                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       384                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33076                  0.00015                       1              34559           6365.9
    memSeparator_0              88                  0.00000 
    lstm.wi                   7530                  0.00003 
    lstm.wo                   7589                  0.00003 
    lstm.wg                   7548                  0.00003 
    lstm.wf                   7559                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 328                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33223                  0.00015                       1              34661           6347.2
    memSeparator_0              95                  0.00000 
    lstm.wi                   7590                  0.00003 
    lstm.wo                   7499                  0.00003 
    lstm.wg                   7539                  0.00003 
    lstm.wf                   7648                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                244                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       384                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33051                  0.00015                       1              34480           6380.5
    memSeparator_0              84                  0.00000 
    lstm.wi                   7460                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7589                  0.00003 
    lstm.wf                   7558                  0.00003 
    lstm.sigmoid_1             231                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             304                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33049                  0.00015                       1              34529           6371.5
    memSeparator_0              90                  0.00000 
    lstm.wi                   7510                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7439                  0.00003 
    lstm.wf                   7649                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                244                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       344                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33175                  0.00015                       1              34652           6348.8
    memSeparator_0              88                  0.00000 
    lstm.wi                   7637                  0.00003 
    lstm.wo                   7551                  0.00003 
    lstm.wg                   7459                  0.00003 
    lstm.wf                   7649                  0.00003 
    lstm.sigmoid_1             231                  0.00000 
    lstm.sigmoid_3             214                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       364                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33104                  0.00015                       1              34532           6370.9
    memSeparator_0              95                  0.00000 
    lstm.wi                   7640                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7469                  0.00003 
    lstm.wf                   7609                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                244                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33271                  0.00015                       1              34702           6339.7
    memSeparator_0              92                  0.00000 
    lstm.wi                   7640                  0.00003 
    lstm.wo                   7509                  0.00003 
    lstm.wg                   7589                  0.00003 
    lstm.wf                   7609                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       324                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       368                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33072                  0.00015                       1              34506           6375.7
    memSeparator_0             114                  0.00000 
    lstm.wi                   7590                  0.00003 
    lstm.wo                   7538                  0.00003 
    lstm.wg                   7459                  0.00003 
    lstm.wf                   7609                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                264                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       298                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33197                  0.00015                       1              34662           6347.0
    memSeparator_0              90                  0.00000 
    lstm.wi                   7638                  0.00003 
    lstm.wo                   7550                  0.00003 
    lstm.wg                   7459                  0.00003 
    lstm.wf                   7649                  0.00003 
    lstm.sigmoid_1             221                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                244                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       344                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33136                  0.00015                       1              34650           6349.2
    memSeparator_0              88                  0.00000 
    lstm.wi                   7580                  0.00003 
    lstm.wo                   7519                  0.00003 
    lstm.wg                   7548                  0.00003 
    lstm.wf                   7599                  0.00003 
    lstm.sigmoid_1             268                  0.00000 
    lstm.sigmoid_3             218                  0.00000 
    lstm.tanh_1                244                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                258                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33150                  0.00015                       1              34575           6363.0
    memSeparator_0              91                  0.00000 
    lstm.wi                   7580                  0.00003 
    lstm.wo                   7449                  0.00003 
    lstm.wg                   7589                  0.00003 
    lstm.wf                   7609                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       324                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       368                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33154                  0.00015                       1              34587           6360.8
    memSeparator_0              96                  0.00000 
    lstm.wi                   7570                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7589                  0.00003 
    lstm.wf                   7568                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             214                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             274                  0.00000 
    lstm.multiplication_2       314                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33255                  0.00015                       1              34729           6334.8
    memSeparator_0              97                  0.00000 
    lstm.wi                   7620                  0.00003 
    lstm.wo                   7478                  0.00003 
    lstm.wg                   7549                  0.00003 
    lstm.wf                   7679                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       344                  0.00000 
    lstm.multiplication_1       324                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         436                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33265                  0.00015                       1              34688           6342.3
    memSeparator_0              97                  0.00000 
    lstm.wi                   7650                  0.00003 
    lstm.wo                   7548                  0.00003 
    lstm.wg                   7599                  0.00003 
    lstm.wf                   7559                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             214                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             304                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       314                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33090                  0.00015                       1              34525           6372.2
    memSeparator_0              81                  0.00000 
    lstm.wi                   7590                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7619                  0.00003 
    lstm.wf                   7519                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             214                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       314                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33177                  0.00015                       1              34624           6354.0
    memSeparator_0              91                  0.00000 
    lstm.wi                   7580                  0.00003 
    lstm.wo                   7508                  0.00003 
    lstm.wg                   7539                  0.00003 
    lstm.wf                   7638                  0.00003 
    lstm.sigmoid_1             221                  0.00000 
    lstm.sigmoid_3             214                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       374                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33297                  0.00015                       1              34715           6337.3
    memSeparator_0              90                  0.00000 
    lstm.wi                   7620                  0.00003 
    lstm.wo                   7548                  0.00003 
    lstm.wg                   7568                  0.00003 
    lstm.wf                   7609                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             246                  0.00000 
    lstm.multiplication_2       322                  0.00000 
    lstm.multiplication_1       324                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         476                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33201                  0.00015                       1              34628           6353.2
    memSeparator_0              93                  0.00000 
    lstm.wi                   7650                  0.00003 
    lstm.wo                   7489                  0.00003 
    lstm.wg                   7598                  0.00003 
    lstm.wf                   7599                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             214                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       314                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       328                  0.00000 
    fc                         416                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33384                  0.00015                       1              34809           6320.2
    memSeparator_0              96                  0.00000 
    lstm.wi                   7660                  0.00003 
    lstm.wo                   7539                  0.00003 
    lstm.wg                   7589                  0.00003 
    lstm.wf                   7568                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             214                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             314                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       324                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                258                  0.00000 
    lstm.multiplication_3       378                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33187                  0.00015                       1              34614           6355.8
    memSeparator_0              88                  0.00000 
    lstm.wi                   7580                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7589                  0.00003 
    lstm.wf                   7619                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             214                  0.00000 
    lstm.tanh_1                264                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       328                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33230                  0.00015                       1              34714           6337.5
    memSeparator_0              91                  0.00000 
    lstm.wi                   7580                  0.00003 
    lstm.wo                   7589                  0.00003 
    lstm.wg                   7509                  0.00003 
    lstm.wf                   7599                  0.00003 
    lstm.sigmoid_1             292                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 358                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33219                  0.00015                       1              34647           6349.8
    memSeparator_0              90                  0.00000 
    lstm.wi                   7510                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7599                  0.00003 
    lstm.wf                   7549                  0.00003 
    lstm.sigmoid_1             232                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             314                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       324                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                258                  0.00000 
    lstm.multiplication_3       348                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33262                  0.00015                       1              34719           6336.6
    memSeparator_0              94                  0.00000 
    lstm.wi                   7580                  0.00003 
    lstm.wo                   7598                  0.00003 
    lstm.wg                   7499                  0.00003 
    lstm.wf                   7599                  0.00003 
    lstm.sigmoid_1             302                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                244                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 368                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33158                  0.00015                       1              34653           6348.7
    memSeparator_0              89                  0.00000 
    lstm.wi                   7639                  0.00003 
    lstm.wo                   7550                  0.00003 
    lstm.wg                   7459                  0.00003 
    lstm.wf                   7639                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       354                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33105                  0.00015                       1              34531           6371.1
    memSeparator_0              97                  0.00000 
    lstm.wi                   7580                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7519                  0.00003 
    lstm.wf                   7608                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                264                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33107                  0.00015                       1              34537           6370.0
    memSeparator_0              91                  0.00000 
    lstm.wi                   7520                  0.00003 
    lstm.wo                   7578                  0.00003 
    lstm.wg                   7548                  0.00003 
    lstm.wf                   7568                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                244                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 348                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         416                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33145                  0.00015                       1              34574           6363.2
    memSeparator_0              87                  0.00000 
    lstm.wi                   7580                  0.00003 
    lstm.wo                   7469                  0.00003 
    lstm.wg                   7599                  0.00003 
    lstm.wf                   7599                  0.00003 
    lstm.sigmoid_1             221                  0.00000 
    lstm.sigmoid_3             214                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       324                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       358                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33165                  0.00015                       1              34672           6345.2
    memSeparator_0              86                  0.00000 
    lstm.wi                   7530                  0.00003 
    lstm.wo                   7599                  0.00003 
    lstm.wg                   7549                  0.00003 
    lstm.wf                   7549                  0.00003 
    lstm.sigmoid_1             242                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       324                  0.00000 
    lstm.c_add                 358                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33144                  0.00015                       1              34588           6360.6
    memSeparator_0              95                  0.00000 
    lstm.wi                   7580                  0.00003 
    lstm.wo                   7509                  0.00003 
    lstm.wg                   7549                  0.00003 
    lstm.wf                   7649                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       334                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33266                  0.00015                       1              34741           6332.6
    memSeparator_0              97                  0.00000 
    lstm.wi                   7580                  0.00003 
    lstm.wo                   7579                  0.00003 
    lstm.wg                   7549                  0.00003 
    lstm.wf                   7649                  0.00003 
    lstm.sigmoid_1             232                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                244                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       354                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33249                  0.00015                       1              34748           6331.3
    memSeparator_0              90                  0.00000 
    lstm.wi                   7600                  0.00003 
    lstm.wo                   7589                  0.00003 
    lstm.wg                   7549                  0.00003 
    lstm.wf                   7559                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             244                  0.00000 
    lstm.multiplication_2       324                  0.00000 
    lstm.multiplication_1       324                  0.00000 
    lstm.c_add                 348                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33219                  0.00015                       1              34643           6350.5
    memSeparator_0              90                  0.00000 
    lstm.wi                   7630                  0.00003 
    lstm.wo                   7439                  0.00003 
    lstm.wg                   7549                  0.00003 
    lstm.wf                   7669                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       344                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         456                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33164                  0.00015                       1              34592           6359.9
    memSeparator_0              95                  0.00000 
    lstm.wi                   7650                  0.00003 
    lstm.wo                   7539                  0.00003 
    lstm.wg                   7459                  0.00003 
    lstm.wf                   7609                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
<st...

Compare the predictions with the target values.

figure
t = tiledlayout(numChannels,1);
title(t,"Open Loop Forecasting with LSTM layer")

for i = 1:numChannels
    nexttile
    plot(T(i,:))
    hold on
    plot(offset:numTimeSteps,[T(i,offset) Y(i,:)],'--')
    ylabel("Channel " + i)
end

xlabel("Time Step")
nexttile(1)
legend(["Input" "Forecasted"])
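To complement the visual comparison, the forecast error can also be summarized as a number. The following is a minimal sketch that computes the root-mean-square error (RMSE) per channel. It uses small synthetic stand-ins for T, Y, offset, and numTimeSteps rather than the example's data, and it assumes the alignment implied by the plotting code above, where Y(:,k) is drawn at time step offset+k.

```matlab
% Synthetic stand-ins for the example's variables (assumed shapes):
% T is numChannels-by-numTimeSteps, Y is numChannels-by-(numTimeSteps-offset).
numChannels = 3;
numTimeSteps = 10;
offset = 4;
T = repmat(1:numTimeSteps,numChannels,1);
Y = T(:,offset+1:end) + 0.5;   % forecasts that are off by a constant 0.5

% Same alignment as the plot: Y(:,k) is compared with T(:,offset+k).
err = Y - T(:,offset+1:end);
rmsePerChannel = sqrt(mean(err.^2,2));   % one RMSE value per channel
disp(rmsePerChannel')
```

With the real variables from this example, only the last three statements would be needed, provided that the shapes match the assumption stated in the comments.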

Closed-Loop Forecasting

Closed-loop forecasting predicts subsequent time steps in a sequence by using the previous predictions as input. In this case, the model does not require the true values to make the prediction. For example, suppose that you want to predict the value for time steps t through t+k of the sequence by using data collected in time steps 1 through t-1. To make predictions for time step i, use the predicted value for time step i-1 as input. Use closed-loop forecasting to forecast multiple subsequent time steps or when you do not have true values to provide to the network before making the next prediction.
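The difference between the two forecasting modes can be sketched with a toy model. The code below is an illustration only: stepFcn is a hypothetical, stateless one-step predictor that stands in for the deployed LSTM network, and trueSeq is a made-up sequence. The predictor is deliberately imperfect so that the two modes diverge.

```matlab
% Hypothetical one-step predictor standing in for the deployed network.
stepFcn = @(x) 0.8*x;

trueSeq = 0.9.^(0:5);   % made-up "measured" sequence, time steps 1 to 6

% Open loop: every prediction uses the true previous value as input.
openLoop = stepFcn(trueSeq(1:end-1));

% Closed loop: only the first input is a true value. After that,
% each prediction is fed back as the next input.
closedLoop = zeros(1,numel(trueSeq)-1);
xt = trueSeq(1);
for k = 1:numel(closedLoop)
    closedLoop(k) = stepFcn(xt);
    xt = closedLoop(k);
end

% Absolute error of each mode against the true next values.
openErr = abs(openLoop - trueSeq(2:end));
closedErr = abs(closedLoop - trueSeq(2:end));
disp([openErr; closedErr])
```

In this toy setup both modes make the same first prediction, and the closed-loop error is larger at every later step because each prediction error is fed back into the next input. A real LSTM network also carries internal state between steps, which this stateless sketch does not model.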

Reset the network state by using the resetState function. Then make an initial prediction, Z, and update the network state by passing the input data X to the predictAndUpdateState function. The code passes every time step of X, which is 191 time steps in the run logged here.

resetState(hW)
offset = size(X,2);
[Z, ~] = predictAndUpdateState(hW,X,Profile='on');
### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 191.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33244                  0.00015                     191            6372755           6593.7
    memSeparator_0              88                  0.00000 
    lstm.wi                   7648                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7589                  0.00003 
    lstm.wf                   7568                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             314                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz
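
The columns of the profiler table are related through the clock frequency. The following sketch recomputes the seconds and Frames/s columns from the cycle counts printed above. The relationship used here is inferred from the printed numbers, not taken from the toolbox documentation, and the variable names are hypothetical.

```matlab
% Values copied from the Network row of the profiler table above.
clockHz = 220e6;            % DL processor clock frequency
lastFrameCycles = 33244;    % LastFrameLatency(cycles)
totalCycles = 6372755;      % Total Latency (cycles)
numFrames = 191;            % FramesNum

% Convert cycles to seconds, and frames per total time to throughput.
lastFrameSeconds = lastFrameCycles/clockHz;
framesPerSecond = numFrames*clockHz/totalCycles;

fprintf("Last frame latency: %.5f s\n",lastFrameSeconds)
fprintf("Throughput: %.1f frames/s\n",framesPerSecond)
```

To the printed precision, the results agree with the 0.00015 seconds and 6593.7 frames/s reported in the table.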

To forecast further time steps, loop over the time steps and update the network state by using the predictAndUpdateState function. Forecast the next 200 time steps by iteratively passing the previous prediction back to the network as input. Because the network does not require the input data to make any further predictions, you can specify any number of time steps to forecast. Because the loop sets Profile='on', each iteration prints its own profiler table.

numPredictionTimeSteps = 200;
Xt = Z(:,end);
Y = zeros(numChannels,numPredictionTimeSteps);

for t = 1:numPredictionTimeSteps    
    [Y(:,t),~] =  predictAndUpdateState(hW,Xt,Profile='on');
    Xt = Y(:,t);   
end
### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33103                  0.00015                       1              34533           6370.7
    memSeparator_0              86                  0.00000 
    lstm.wi                   7590                  0.00003 
    lstm.wo                   7548                  0.00003 
    lstm.wg                   7608                  0.00003 
    lstm.wf                   7539                  0.00003 
    lstm.sigmoid_1             232                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                244                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33304                  0.00015                       1              34799           6322.0
    memSeparator_0              96                  0.00000 
    lstm.wi                   7580                  0.00003 
    lstm.wo                   7589                  0.00003 
    lstm.wg                   7569                  0.00003 
    lstm.wf                   7608                  0.00003 
    lstm.sigmoid_1             282                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                244                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       314                  0.00000 
    lstm.c_add                 328                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33236                  0.00015                       1              34663           6346.8
    memSeparator_0              89                  0.00000 
    lstm.wi                   7650                  0.00003 
    lstm.wo                   7459                  0.00003 
    lstm.wg                   7588                  0.00003 
    lstm.wf                   7608                  0.00003 
    lstm.sigmoid_1             232                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       314                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       378                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33145                  0.00015                       1              34577           6362.6
    memSeparator_0              87                  0.00000 
    lstm.wi                   7590                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7618                  0.00003 
    lstm.wf                   7529                  0.00003 
    lstm.sigmoid_1             242                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       324                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33053                  0.00015                       1              34476           6381.3
    memSeparator_0              84                  0.00000 
    lstm.wi                   7560                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7609                  0.00003 
    lstm.wf                   7539                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33190                  0.00015                       1              34673           6345.0
    memSeparator_0              91                  0.00000 
    lstm.wi                   7636                  0.00003 
    lstm.wo                   7553                  0.00003 
    lstm.wg                   7459                  0.00003 
    lstm.wf                   7649                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                244                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       344                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33040                  0.00015                       1              34541           6369.2
    memSeparator_0              92                  0.00000 
    lstm.wi                   7570                  0.00003 
    lstm.wo                   7448                  0.00003 
    lstm.wg                   7549                  0.00003 
    lstm.wf                   7599                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             244                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       314                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33228                  0.00015                       1              34723           6335.9
    memSeparator_0              89                  0.00000 
    lstm.wi                   7580                  0.00003 
    lstm.wo                   7519                  0.00003 
    lstm.wg                   7549                  0.00003 
    lstm.wf                   7649                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                244                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       354                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         416                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33193                  0.00015                       1              34672           6345.2
    memSeparator_0              95                  0.00000 
    lstm.wi                   7600                  0.00003 
    lstm.wo                   7589                  0.00003 
    lstm.wg                   7539                  0.00003 
    lstm.wf                   7558                  0.00003 
    lstm.sigmoid_1             242                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 358                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33261                  0.00015                       1              34738           6333.1
    memSeparator_0              93                  0.00000 
    lstm.wi                   7630                  0.00003 
    lstm.wo                   7479                  0.00003 
    lstm.wg                   7539                  0.00003 
    lstm.wf                   7658                  0.00003 
    lstm.sigmoid_1             232                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             244                  0.00000 
    lstm.multiplication_2       344                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         436                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33279                  0.00015                       1              34767           6327.8
    memSeparator_0              91                  0.00000 
    lstm.wi                   7580                  0.00003 
    lstm.wo                   7588                  0.00003 
    lstm.wg                   7549                  0.00003 
    lstm.wf                   7599                  0.00003 
    lstm.sigmoid_1             292                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                244                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 378                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33186                  0.00015                       1              34609           6356.7
    memSeparator_0              88                  0.00000 
    lstm.wi                   7650                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7449                  0.00003 
    lstm.wf                   7599                  0.00003 
    lstm.sigmoid_1             241                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             237                  0.00000 
    lstm.multiplication_2       321                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         476                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33170                  0.00015                       1              34605           6357.5
    memSeparator_0              92                  0.00000 
    lstm.wi                   7640                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7468                  0.00003 
    lstm.wf                   7609                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             240                  0.00000 
    lstm.multiplication_2       318                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         466                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33200                  0.00015                       1              34629           6353.1
    memSeparator_0              91                  0.00000 
    lstm.wi                   7630                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7579                  0.00003 
    lstm.wf                   7599                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             214                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             260                  0.00000 
    lstm.multiplication_2       298                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         416                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33029                  0.00015                       1              34514           6374.2
    memSeparator_0              91                  0.00000 
    lstm.wi                   7500                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7448                  0.00003 
    lstm.wf                   7649                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       314                  0.00000 
    lstm.multiplication_1       334                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33258                  0.00015                       1              34749           6331.1
    memSeparator_0              89                  0.00000 
    lstm.wi                   7620                  0.00003 
    lstm.wo                   7489                  0.00003 
    lstm.wg                   7539                  0.00003 
    lstm.wf                   7659                  0.00003 
    lstm.sigmoid_1             232                  0.00000 
    lstm.sigmoid_3             214                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       364                  0.00000 
    lstm.multiplication_1       314                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         446                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33281                  0.00015                       1              34708           6338.6
    memSeparator_0              94                  0.00000 
    lstm.wi                   7630                  0.00003 
    lstm.wo                   7539                  0.00003 
    lstm.wg                   7568                  0.00003 
    lstm.wf                   7599                  0.00003 
    lstm.sigmoid_1             241                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             238                  0.00000 
    lstm.multiplication_2       320                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         476                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33034                  0.00015                       1              34465           6383.3
    memSeparator_0              85                  0.00000 
    lstm.wi                   7460                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7589                  0.00003 
    lstm.wf                   7559                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             294                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33181                  0.00015                       1              34660           6347.4
    memSeparator_0              93                  0.00000 
    lstm.wi                   7622                  0.00003 
    lstm.wo                   7547                  0.00003 
    lstm.wg                   7468                  0.00003 
    lstm.wf                   7649                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       374                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33128                  0.00015                       1              34615           6355.6
    memSeparator_0              90                  0.00000 
    lstm.wi                   7580                  0.00003 
    lstm.wo                   7538                  0.00003 
    lstm.wg                   7459                  0.00003 
    lstm.wf                   7639                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             214                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       314                  0.00000 
    lstm.multiplication_1       374                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33149                  0.00015                       1              34580           6362.1
    memSeparator_0             121                  0.00000 
    lstm.wi                   7590                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7619                  0.00003 
    lstm.wf                   7519                  0.00003 
    lstm.sigmoid_1             221                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       324                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33231                  0.00015                       1              34656           6348.1
    memSeparator_0              92                  0.00000 
    lstm.wi                   7580                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7619                  0.00003 
    lstm.wf                   7529                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             214                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             244                  0.00000 
    lstm.multiplication_2       324                  0.00000 
    lstm.multiplication_1       324                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                298                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33181                  0.00015                       1              34611           6356.4
    memSeparator_0              93                  0.00000 
    lstm.wi                   7590                  0.00003 
    lstm.wo                   7499                  0.00003 
    lstm.wg                   7549                  0.00003 
    lstm.wf                   7649                  0.00003 
    lstm.sigmoid_1             241                  0.00000 
    lstm.sigmoid_3             214                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       364                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33110                  0.00015                       1              34539           6369.6
    memSeparator_0              92                  0.00000 
    lstm.wi                   7640                  0.00003 
    lstm.wo                   7548                  0.00003 
    lstm.wg                   7469                  0.00003 
    lstm.wf                   7609                  0.00003 
    lstm.sigmoid_1             232                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         426                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33217                  0.00015                       1              34641           6350.9
    memSeparator_0              88                  0.00000 
    lstm.wi                   7630                  0.00003 
    lstm.wo                   7439                  0.00003 
    lstm.wg                   7549                  0.00003 
    lstm.wf                   7669                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       344                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         456                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33068                  0.00015                       1              34545           6368.5
    memSeparator_0              90                  0.00000 
    lstm.wi                   7580                  0.00003 
    lstm.wo                   7448                  0.00003 
    lstm.wg                   7549                  0.00003 
    lstm.wf                   7599                  0.00003 
    lstm.sigmoid_1             259                  0.00000 
    lstm.sigmoid_3             217                  0.00000 
    lstm.tanh_1                244                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         416                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33136                  0.00015                       1              34565           6364.8
    memSeparator_0             107                  0.00000 
    lstm.wi                   7590                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7619                  0.00003 
    lstm.wf                   7519                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       314                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33074                  0.00015                       1              34580           6362.1
    memSeparator_0              87                  0.00000 
    lstm.wi                   7510                  0.00003 
    lstm.wo                   7539                  0.00003 
    lstm.wg                   7448                  0.00003 
    lstm.wf                   7639                  0.00003 
    lstm.sigmoid_1             241                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       374                  0.00000 
    lstm.c_add                 308                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33226                  0.00015                       1              34649           6349.4
    memSeparator_0              89                  0.00000 
    lstm.wi                   7650                  0.00003 
    lstm.wo                   7539                  0.00003 
    lstm.wg                   7588                  0.00003 
    lstm.wf                   7569                  0.00003 
    lstm.sigmoid_1             221                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             274                  0.00000 
    lstm.multiplication_2       314                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33086                  0.00015                       1              34528           6371.6
    memSeparator_0              88                  0.00000 
    lstm.wi                   7570                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7439                  0.00003 
    lstm.wf                   7649                  0.00003 
    lstm.sigmoid_1             231                  0.00000 
    lstm.sigmoid_3             214                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       364                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33080                  0.00015                       1              34502           6376.4
    memSeparator_0              92                  0.00000 
    lstm.wi                   7590                  0.00003 
    lstm.wo                   7439                  0.00003 
    lstm.wg                   7598                  0.00003 
    lstm.wf                   7599                  0.00003 
    lstm.sigmoid_1             222                  0.00000 
    lstm.sigmoid_3             214                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             234                  0.00000 
    lstm.multiplication_2       314                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                228                  0.00000 
    lstm.multiplication_3       328                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33135                  0.00015                       1              34590           6360.2
    memSeparator_0              86                  0.00000 
    lstm.wi                   7580                  0.00003 
    lstm.wo                   7529                  0.00003 
    lstm.wg                   7549                  0.00003 
    lstm.wf                   7609                  0.00003 
    lstm.sigmoid_1             248                  0.00000 
    lstm.sigmoid_3             218                  0.00000 
    lstm.tanh_1                244                  0.00000 
    lstm.sigmoid_2             224                  0.00000 
    lstm.multiplication_2       294                  0.00000 
    lstm.multiplication_1       294                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                258                  0.00000 
    lstm.multiplication_3       308                  0.00000 
    fc                         406                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33261                  0.00015                       1              34683           6343.2
    memSeparator_0              95                  0.00000 
    lstm.wi                   7620                  0.00003 
    lstm.wo                   7548                  0.00003 
    lstm.wg                   7558                  0.00003 
    lstm.wf                   7599                  0.00003 
    lstm.sigmoid_1             241                  0.00000 
    lstm.sigmoid_3             224                  0.00000 
    lstm.tanh_1                234                  0.00000 
    lstm.sigmoid_2             235                  0.00000 
    lstm.multiplication_2       293                  0.00000 
    lstm.multiplication_1       324                  0.00000 
    lstm.c_add                 288                  0.00000 
    lstm.tanh_2                238                  0.00000 
    lstm.multiplication_3       288                  0.00000 
    fc                         476                  0.00000 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running a sequence of length 1.


              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33310                  0.00015                       1              34727           6335.1
    memSeparator_0              93                  0.00000 
    lstm.wi                   7630                  0.00003 
    lstm.wo                   7549                  0.00003 
    lstm.wg                   7568                  0.00003 
    lstm.wf                   7599                  0.00003 
    lstm.sigmoid_1             221                  0.00000 
<st...
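
The seconds and Frames/s columns of the profiler tables appear to follow directly from the cycle counts and the reported 220 MHz clock. As a rough check, take a Network row that reports 33128 last-frame cycles and 34615 total cycles. The symbols N_last, N_total, and f_clk are introduced here for illustration only, and the plain-ratio relationship is an assumption that happens to reproduce the printed values:

```latex
\begin{align*}
t_{\text{last}} &= \frac{N_{\text{last}}}{f_{\text{clk}}}
                 = \frac{33128}{220 \times 10^{6}\ \text{Hz}}
                 \approx 1.5 \times 10^{-4}\ \text{s} \\[4pt]
\text{Frames/s} &= \frac{f_{\text{clk}}}{N_{\text{total}}}
                 = \frac{220 \times 10^{6}}{34615}
                 \approx 6355.6
\end{align*}
```

In the same row, the four lstm.w* entries sum to 7580 + 7538 + 7459 + 7639 = 30216 cycles, roughly 90% of the 33128-cycle frame, so these operations dominate the per-step latency.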

Visualize the forecasted values in a plot.

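% Total number of time steps shown: the input portion plus the forecast.
numTimeSteps = offset + numPredictionTimeSteps;

figure
t = tiledlayout(numChannels,1);
title(t,"Closed Loop Forecasting with LSTM layer")

% Draw one tile per channel.
for i = 1:numChannels
    nexttile
    % Input portion of the sequence.
    plot(T(i,1:offset))
    hold on
    % Forecast, prefixed with the last input value so the two curves join.
    plot(offset:numTimeSteps,[T(i,offset) Y(i,:)],'--')
    ylabel("Channel " + i)
end

xlabel("Time Step")
nexttile(1)
legend(["Input" "Forecasted"])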

Closed-loop forecasting lets you forecast an arbitrary number of time steps, but it can be less accurate than open-loop forecasting. The network has no access to the true values during forecasting, so each prediction is fed back as the next input and errors can accumulate over time.

Related Topics