Run Sequence Forecasting on FPGA by Using Deep Learning HDL Toolbox
This example shows how to create, compile, and deploy a long short-term memory (LSTM) network trained on waveform data by using the Deep Learning HDL Toolbox™ Support Package for Xilinx FPGA and SoC. Use the deployed network to predict future values by using open-loop and closed-loop forecasting. Use MATLAB® to retrieve the prediction results from the target device.
Waveform Data Network
The network attached to this example was trained using the Time Series Forecasting Using Deep Learning example. This example uses the WaveformData.mat data set, which contains 2000 synthetically generated waveforms of varying lengths with three channels. This example uses a trained LSTM network to forecast future values of the waveforms given the values from the previous time steps using both closed-loop and open-loop forecasting.
Prerequisites
Xilinx® Zynq® UltraScale+™ ZCU102 SoC development kit
Deep Learning HDL Toolbox™ Support Package for Xilinx FPGA and SoC
Deep Learning Toolbox™
Deep Learning HDL Toolbox™
Load the Pretrained Network
To load the LSTM network, enter:
load WaveformForcastingNet
Use the analyzeNetwork function to obtain information about the network layers. The function returns a graphical representation of the network that contains detailed parameter information for every layer in the network.
analyzeNetwork(net)
Define FPGA Board Interface
Define the target FPGA board programming interface by using the dlhdl.Target object. Specify that the interface is for a Xilinx board with an Ethernet interface.
To create the target object, enter:
hTarget = dlhdl.Target('Xilinx','Interface','Ethernet');
To use the JTAG interface, install Xilinx™ Vivado™ Design Suite 2022.1. To set the Xilinx Vivado toolpath, enter:
hdlsetuptoolpath('ToolName', 'Xilinx Vivado', 'ToolPath', 'C:\Xilinx\Vivado\2022.1\bin\vivado.bat');
hTarget = dlhdl.Target('Xilinx','Interface','JTAG');
Prepare Network for Deployment
Prepare the network for deployment by creating a dlhdl.Workflow object. Specify the network and the bitstream name. Ensure that the bitstream name matches the data type and the FPGA board. In this example, the target FPGA board is the Xilinx ZCU102 SoC board. The bitstream uses the single data type.
hW = dlhdl.Workflow('Network', net, 'Bitstream', 'zcu102_lstm_single','Target',hTarget);
To run the example on the Xilinx ZC706 board, enter:
hW = dlhdl.Workflow('Network', net, 'Bitstream', 'zc706_lstm_single','Target',hTarget);
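Because the bitstream name has to match the board, one way to keep the two in step is a small lookup table. The following is a minimal sketch in base MATLAB, not part of the toolbox workflow: the two bitstream names are the ones used in this example, while the board keys and the variable names `bitstreamByBoard`, `board`, and `bitstreamName` are hypothetical and chosen only for illustration.

```matlab
% Bitstream names are taken from this example; the board keys are
% illustrative labels, not identifiers defined by the toolbox.
bitstreamByBoard = containers.Map( ...
    {'ZCU102', 'ZC706'}, ...
    {'zcu102_lstm_single', 'zc706_lstm_single'});

board = 'ZCU102';   % hypothetical selector variable
if ~isKey(bitstreamByBoard, board)
    error('No bitstream listed for board %s.', board);
end

% Look up the bitstream name to pass as the 'Bitstream' argument.
bitstreamName = bitstreamByBoard(board);
fprintf('Board %s uses bitstream %s\n', board, bitstreamName);
```

The resulting `bitstreamName` could then be supplied as the value of the 'Bitstream' name-value argument when creating the workflow object.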
Compile the LSTM Network
Run the compile method of the dlhdl.Workflow object to compile the network and generate the instructions, weights, and biases for deployment. The total number of frames exceeds the default value of 30. Set the InputFrameNumberLimit name-value argument to 1000 to run predictions in chunks of 1000 frames to prevent timeouts.
dn = compile(hW,'InputFrameNumberLimit',1000)
### Compiling network for Deep Learning FPGA prototyping ...
### Targeting FPGA bitstream zcu102_lstm_single.
### The network includes the following layers:
     1   'sequenceinput'      Sequence Input      Sequence input with 3 dimensions              (SW Layer)
     2   'lstm'               LSTM                LSTM with 128 hidden units                    (HW Layer)
     3   'fc'                 Fully Connected     3 fully connected layer                       (HW Layer)
     4   'regressionoutput'   Regression Output   mean-squared-error with response 'Response'   (SW Layer)
### Notice: The layer 'sequenceinput' with type 'nnet.cnn.layer.ImageInputLayer' is implemented in software.
### Notice: The layer 'regressionoutput' with type 'nnet.cnn.layer.RegressionOutputLayer' is implemented in software.
### Compiling layer group: lstm.wi ...
### Compiling layer group: lstm.wi ... complete.
### Compiling layer group: lstm.wo ...
### Compiling layer group: lstm.wo ... complete.
### Compiling layer group: lstm.wg ...
### Compiling layer group: lstm.wg ... complete.
### Compiling layer group: lstm.wf ...
### Compiling layer group: lstm.wf ... complete.
### Compiling layer group: fc ...
### Compiling layer group: fc ... complete.
### Allocating external memory buffers:

          offset_name          offset_address     allocated_space
    _______________________    ______________    ________________
    "InputDataOffset"           "0x00000000"     "4.0 MB"
    "OutputResultOffset"        "0x00400000"     "4.0 MB"
    "SchedulerDataOffset"       "0x00800000"     "4.0 MB"
    "SystemBufferOffset"        "0x00c00000"     "20.0 MB"
    "InstructionDataOffset"     "0x02000000"     "4.0 MB"
    "FCWeightDataOffset"        "0x02400000"     "4.0 MB"
    "EndOffset"                 "0x02800000"     "Total: 40.0 MB"

### Network compilation complete.
dn = struct with fields:
weights: [1×1 struct]
instructions: [1×1 struct]
registers: [1×1 struct]
syncInstructions: [1×1 struct]
constantData: {}
ddrInfo: [1×1 struct]
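The external memory table printed by the compiler can be cross-checked with simple arithmetic: each buffer starts where the previous one ends. The following sketch, in base MATLAB, rebuilds the offsets from the allocated sizes shown in the compile log. The variable names are illustrative, and the assumption that 1 MB equals 2^20 bytes is inferred from the printed addresses rather than stated by the toolbox.

```matlab
% Buffer names and sizes copied from the compile log above.
names = ["InputDataOffset" "OutputResultOffset" "SchedulerDataOffset" ...
         "SystemBufferOffset" "InstructionDataOffset" "FCWeightDataOffset"];
sizesMB = [4 4 4 20 4 4];

% Assumption: 1 MB = 2^20 bytes, which matches the printed addresses.
bytesPerMB = 2^20;

% Each buffer begins at the cumulative size of the buffers before it.
offsets = [0 cumsum(sizesMB(1:end-1))] * bytesPerMB;
endOffset = sum(sizesMB) * bytesPerMB;

for k = 1:numel(names)
    fprintf('%-24s 0x%s  %4.1f MB\n', names(k), ...
        lower(dec2hex(offsets(k), 8)), sizesMB(k));
end
fprintf('%-24s 0x%s  Total: %.1f MB\n', "EndOffset", ...
    lower(dec2hex(endOffset, 8)), sum(sizesMB));
```

The computed addresses agree with the log: the 20 MB system buffer is what moves the instruction data from 0x00c00000 to 0x02000000, and the end offset corresponds to the 40 MB total.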
Program Bitstream onto FPGA and Download Network Weights
To deploy the network on the Xilinx ZCU102 SoC hardware, run the deploy function of the dlhdl.Workflow object. This function uses the output of the compile function to program the FPGA board by using the programming file. It also downloads the network weights and biases. The deploy function starts programming the FPGA device and displays progress messages and the time required to deploy the network.
deploy(hW)
### FPGA bitstream programming has been skipped as the same bitstream is already loaded on the target FPGA.
### Deep learning network programming has been skipped as the same network is already loaded on the target FPGA.
Test Network
Prepare the test data for prediction. Normalize the test data using the statistics calculated from the training data. To forecast the values of future time steps of a sequence, specify the targets as the test sequences with values shifted by one time step. In other words, at each time step of the input sequence, the LSTM network learns to predict the value of the next time step. The predictors are the test sequences without the final time step.
load WaveformData

numChannels = size(data{1},1);
numObservations = numel(data);

% Hold out the last 10% of the observations for testing.
idxTrain = 1:floor(0.9*numObservations);
idxTest = floor(0.9*numObservations)+1:numObservations;
dataTrain = data(idxTrain);
dataTest = data(idxTest);

% Predictors drop the final time step; targets are shifted by one step.
for n = 1:numel(dataTrain)
    X = dataTrain{n};
    XTrain{n} = X(:,1:end-1);
    TTrain{n} = X(:,2:end);
end

% Per-channel statistics computed from the training data only.
muX = mean(cat(2,XTrain{:}),2);
sigmaX = std(cat(2,XTrain{:}),0,2);
muT = mean(cat(2,TTrain{:}),2);
sigmaT = std(cat(2,TTrain{:}),0,2);

% Normalize the test predictors and targets with the training statistics.
for n = 1:numel(dataTest)
    X = dataTest{n};
    XTest{n} = (X(:,1:end-1) - muX) ./ sigmaX;
    TTest{n} = (X(:,2:end) - muT) ./ sigmaT;
end
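The one-step shift and the per-channel normalization are easier to see on a tiny sequence. The following is a minimal sketch in base MATLAB using a made-up two-channel, four-step sequence; the values and the names `seq`, `predictors`, `targets`, `mu`, `sigma`, and `predictorsNorm` are illustrative and are not part of the waveform data set.

```matlab
% Made-up sequence: 2 channels (rows) by 4 time steps (columns).
seq = [ 1  2  3  4;
       10 20 30 40];

% Predictors are every step except the last; targets are the same
% sequence shifted left by one step.
predictors = seq(:, 1:end-1);   % steps 1..3
targets    = seq(:, 2:end);     % steps 2..4

% Normalize each channel with its own mean and standard deviation,
% computed along the time dimension (dimension 2).
mu = mean(predictors, 2);
sigma = std(predictors, 0, 2);
predictorsNorm = (predictors - mu) ./ sigma;

disp(predictorsNorm)
```

At every column, `targets` holds the value that follows the corresponding column of `predictors`, which is the pairing the network is trained to reproduce.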
Make predictions using the test data.
YTest = hW.predict(XTest{1},Profile='on');
### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 115.

              Deep Learning Processor Profiler Performance Results

                           LastFrameLatency(cycles)   LastFrameLatency(seconds)   FramesNum   Total Latency   Frames/s
                                -------------               -------------         ---------     ---------     ---------
Network                             33244                     0.00015                115         3839434        6589.5
    memSeparator_0                     88                     0.00000
    lstm.wi                          7628                     0.00003
    lstm.wo                          7549                     0.00003
    lstm.wg                          7509                     0.00003
    lstm.wf                          7599                     0.00003
    lstm.sigmoid_1                    241                     0.00000
    lstm.sigmoid_3                    224                     0.00000
    lstm.tanh_1                       234                     0.00000
    lstm.sigmoid_2                    234                     0.00000
    lstm.multiplication_2             294                     0.00000
    lstm.multiplication_1             314                     0.00000
    lstm.c_add                        308                     0.00000
    lstm.tanh_2                       238                     0.00000
    lstm.multiplication_3             308                     0.00000
    fc                                476                     0.00000
 * The clock frequency of the DL processor is: 220MHz
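The profiler columns are related by the clock frequency, which makes the table easy to sanity-check. The following sketch, in base MATLAB, recomputes the seconds and frames-per-second figures from the cycle counts in the profiler output above. The formulas are an inference from the printed numbers, not a documented definition, and the variable names are illustrative.

```matlab
clockHz = 220e6;            % DL processor clock from the profiler footer
lastFrameCycles = 33244;    % Network row, LastFrameLatency(cycles)
totalCycles = 3839434;      % Network row, Total Latency
numFrames = 115;            % Network row, FramesNum

% Cycles divided by clock frequency gives seconds.
lastFrameSeconds = lastFrameCycles / clockHz;

% Assumed relation: frames processed per second of total latency.
framesPerSecond = numFrames * clockHz / totalCycles;

% Share of the last-frame latency spent in the four LSTM gate groups
% (lstm.wi, lstm.wo, lstm.wg, lstm.wf).
gateCycles = [7628 7549 7509 7599];
gateShare = sum(gateCycles) / lastFrameCycles;

fprintf('Last frame latency: %.5f s\n', lastFrameSeconds);
fprintf('Throughput: %.1f frames/s\n', framesPerSecond);
fprintf('Gate matrix share of latency: %.1f%%\n', 100 * gateShare);
```

Under these assumptions the computed values round to the 0.00015 s and 6589.5 frames/s shown in the table, and the four gate groups account for roughly 91% of the per-frame latency.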
To evaluate the accuracy, calculate the root mean squared error (RMSE) between the predictions and the target for each test sequence.
for i = 1:size(YTest,1)
    rmse(i) = sqrt(mean((YTest(i) - TTest{1}(i)).^2,"all"));
end
Visualize the errors in a histogram. Lower values indicate greater accuracy.
figure
histogram(rmse)
xlabel("RMSE")
ylabel("Frequency")
Calculate the mean RMSE over all test observations.
mean(rmse)
ans = single
0.8385
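For reference, an RMSE can also be summarized per channel across all time steps, or as a single value over the whole sequence. The following is a minimal sketch in base MATLAB on synthetic arrays. The values and the names `Ysyn`, `Tsyn`, `rmsePerChannel`, and `rmseOverall` are made up for illustration and are not results from the deployed network.

```matlab
% Synthetic stand-ins: 3 channels by 4 time steps (not data from this example).
Tsyn = zeros(3, 4);
Ysyn = [1 1 1 1;
        2 2 2 2;
        3 3 3 3];

% One RMSE per channel: average the squared error over time (dimension 2).
rmsePerChannel = sqrt(mean((Ysyn - Tsyn).^2, 2));

% A single RMSE over every channel and time step.
rmseOverall = sqrt(mean((Ysyn - Tsyn).^2, "all"));

disp(rmsePerChannel')
disp(rmseOverall)
```

The per-channel form shows which channel contributes most to the error, while the overall form gives one number per sequence that is convenient for a histogram across observations.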
Forecast Future Time Steps
To forecast the values of multiple future time steps when given an input time series or sequence, use the predictAndUpdateState function. This function predicts time steps one at a time and updates the network state at each prediction. For each prediction, use the previous prediction as the input to the function.
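The feedback pattern described above, where each prediction becomes the next input, can be sketched without hardware by using a toy stateful model in place of the deployed network. In the following base MATLAB sketch, the anonymous functions `updateState` and `readout` are hypothetical stand-ins for one predict-and-update step, and the ramp input is made up. Nothing here calls the toolbox.

```matlab
% Toy stateful model (a stand-in, not the LSTM): the state is a running
% average of the inputs, and the readout adds a fixed offset.
updateState = @(s, x) 0.5*s + 0.5*x;
readout = @(s) s + 2;

% Warm up the state on the observed part of a ramp signal.
xObserved = 1:3;
s = 0;
for t = 1:numel(xObserved)
    s = updateState(s, xObserved(t));
end
Xt = readout(s);   % last prediction from the warm-up

% Closed loop: feed each prediction back in as the next input.
numPredictionTimeSteps = 3;
Y = zeros(1, numPredictionTimeSteps);
for t = 1:numPredictionTimeSteps
    s = updateState(s, Xt);
    Y(t) = readout(s);
    Xt = Y(t);     % previous prediction becomes the next input
end

disp(Y)
```

Because no true values enter the loop after the warm-up, any error in an early prediction is carried into every later one, which is the main trade-off of forecasting this way.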
Visualize one of the test sequences in a plot.
idx = 2;
X = XTest{idx};
T = TTest{idx};

figure
stackedplot(X',DisplayLabels="Channel " + (1:numChannels))
xlabel("Time Step")
title("Test Observation " + idx)
Open-Loop Forecasting
Open-loop forecasting predicts the next time step in a sequence using only the input data. When making predictions for subsequent time steps, you collect the true values from your data source and use those as input. For example, suppose that you want to predict the value for time step t of a sequence by using data collected in time steps 1 through t-1. To make predictions for time step t+1, wait until you record the true value for time step t and use that value as input to make the next prediction. Use open-loop forecasting when you have true values to provide to the network before making the next prediction.
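The open-loop pattern can be illustrated without hardware by replacing the deployed network with a toy stateful model. In the following base MATLAB sketch, the anonymous functions `updateState` and `readout` are hypothetical stand-ins for one predict-and-update step, and the ramp signal `x` is made up. The loop structure mirrors the description above: every step consumes a recorded true value.

```matlab
% Toy stateful model (a stand-in, not the LSTM): the state is a running
% average of the inputs, and the readout adds a fixed offset.
updateState = @(s, x) 0.5*s + 0.5*x;
readout = @(s) s + 2;

x = 1:6;        % recorded true values of a ramp signal
offset = 3;     % number of steps used to initialize the state

% Initialize the state from the first few true values.
s = 0;
for t = 1:offset
    s = updateState(s, x(t));
end

% Open loop: each new prediction is driven by the next true value.
numPredictionTimeSteps = numel(x) - offset;
Y = zeros(1, numPredictionTimeSteps);
for t = 1:numPredictionTimeSteps
    Xt = x(offset + t);          % true value, not a previous prediction
    s = updateState(s, Xt);
    Y(t) = readout(s);
end

disp(Y)
```

Because the inputs are always measured values, prediction errors do not feed back into the state, which is why this mode suits settings where the true value arrives before the next forecast is needed.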
Initialize the network state by resetting the state using the resetState function, then make an initial prediction using the first few time steps of the input data. Update the network state by using the first 75 time steps of the input data.
resetState(hW)
offset = 75;
[~,~] = hW.predictAndUpdateState(X(:,1:offset),Profile='on');
### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 75.

              Deep Learning Processor Profiler Performance Results

                           LastFrameLatency(cycles)   LastFrameLatency(seconds)   FramesNum   Total Latency   Frames/s
                                -------------               -------------         ---------     ---------     ---------
Network                             33075                     0.00015                 75         2502871        6592.4
    memSeparator_0                     88                     0.00000
    lstm.wi                          7578                     0.00003
    lstm.wo                          7459                     0.00003
    lstm.wg                          7539                     0.00003
    lstm.wf                          7649                     0.00003
    lstm.sigmoid_1                    222                     0.00000
    lstm.sigmoid_3                    224                     0.00000
    lstm.tanh_1                       234                     0.00000
    lstm.sigmoid_2                    224                     0.00000
    lstm.multiplication_2             294                     0.00000
    lstm.multiplication_1             334                     0.00000
    lstm.c_add                        288                     0.00000
    lstm.tanh_2                       228                     0.00000
    lstm.multiplication_3             308                     0.00000
    fc                                406                     0.00000
 * The clock frequency of the DL processor is: 220MHz
To forecast further predictions, loop over time steps and update the network state by using the predictAndUpdateState function. Forecast values for the remaining time steps of the test observation by looping over the time steps of the input data and using them as input to the network. The first prediction is the value that corresponds to the time step offset + 1.
numTimeSteps = size(X,2);
numPredictionTimeSteps = numTimeSteps - offset;
Y = zeros(numChannels,numPredictionTimeSteps);

for t = 1:numPredictionTimeSteps
    Xt = X(:,offset+t);
    Y(:,t) = predictAndUpdateState(hW,Xt,Profile='on');
end
### Finished writing input activations.
### Running a sequence of length 1.

              Deep Learning Processor Profiler Performance Results

                           LastFrameLatency(cycles)   LastFrameLatency(seconds)   FramesNum   Total Latency   Frames/s
                                -------------               -------------         ---------     ---------     ---------
Network                             33259                     0.00015                  1           34747        6331.5
    memSeparator_0                     91                     0.00000
    lstm.wi                          7580                     0.00003
    lstm.wo                          7619                     0.00003
    lstm.wg                          7509                     0.00003
    lstm.wf                          7639                     0.00003
    lstm.sigmoid_1                    221                     0.00000
    lstm.sigmoid_3                    214                     0.00000
    lstm.tanh_1                       234                     0.00000
    lstm.sigmoid_2                    234                     0.00000
    lstm.multiplication_2             294                     0.00000
    lstm.multiplication_1             384                     0.00000
    lstm.c_add                        308                     0.00000
    lstm.tanh_2                       238                     0.00000
    lstm.multiplication_3             288                     0.00000
    fc                                406                     0.00000
 * The clock frequency of the DL processor is: 220MHz

### Finished writing input activations.
### Running a sequence of length 1.

              Deep Learning Processor Profiler Performance Results

                           LastFrameLatency(cycles)   LastFrameLatency(seconds)   FramesNum   Total Latency   Frames/s
                                -------------               -------------         ---------     ---------     ---------
Network                             33076                     0.00015                  1           34559        6365.9
    memSeparator_0                     88                     0.00000
    lstm.wi                          7530                     0.00003
    lstm.wo                          7589                     0.00003
    lstm.wg                          7548                     0.00003
    lstm.wf                          7559                     0.00003
    lstm.sigmoid_1                    222                     0.00000
    lstm.sigmoid_3                    224                     0.00000
    lstm.tanh_1                       234                     0.00000
    lstm.sigmoid_2                    224                     0.00000
    lstm.multiplication_2             294                     0.00000
    lstm.multiplication_1             294                     0.00000
    lstm.c_add                        328                     0.00000
    lstm.tanh_2                       228                     0.00000
    lstm.multiplication_3             308                     0.00000
    fc                                406                     0.00000
 * The clock frequency of the DL processor is: 220MHz

### Finished writing input activations.
### Running a sequence of length 1.
Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33223 0.00015 1 34661 6347.2 memSeparator_0 95 0.00000 lstm.wi 7590 0.00003 lstm.wo 7499 0.00003 lstm.wg 7539 0.00003 lstm.wf 7648 0.00003 lstm.sigmoid_1 222 0.00000 lstm.sigmoid_3 224 0.00000 lstm.tanh_1 244 0.00000 lstm.sigmoid_2 234 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 384 0.00000 lstm.c_add 308 0.00000 lstm.tanh_2 228 0.00000 lstm.multiplication_3 308 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33051 0.00015 1 34480 6380.5 memSeparator_0 84 0.00000 lstm.wi 7460 0.00003 lstm.wo 7549 0.00003 lstm.wg 7589 0.00003 lstm.wf 7558 0.00003 lstm.sigmoid_1 231 0.00000 lstm.sigmoid_3 224 0.00000 lstm.tanh_1 234 0.00000 lstm.sigmoid_2 304 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 294 0.00000 lstm.c_add 288 0.00000 lstm.tanh_2 228 0.00000 lstm.multiplication_3 308 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. 
Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33049 0.00015 1 34529 6371.5 memSeparator_0 90 0.00000 lstm.wi 7510 0.00003 lstm.wo 7549 0.00003 lstm.wg 7439 0.00003 lstm.wf 7649 0.00003 lstm.sigmoid_1 222 0.00000 lstm.sigmoid_3 224 0.00000 lstm.tanh_1 244 0.00000 lstm.sigmoid_2 234 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 344 0.00000 lstm.c_add 308 0.00000 lstm.tanh_2 228 0.00000 lstm.multiplication_3 308 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33175 0.00015 1 34652 6348.8 memSeparator_0 88 0.00000 lstm.wi 7637 0.00003 lstm.wo 7551 0.00003 lstm.wg 7459 0.00003 lstm.wf 7649 0.00003 lstm.sigmoid_1 231 0.00000 lstm.sigmoid_3 214 0.00000 lstm.tanh_1 234 0.00000 lstm.sigmoid_2 234 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 364 0.00000 lstm.c_add 288 0.00000 lstm.tanh_2 238 0.00000 lstm.multiplication_3 288 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. 
Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33104 0.00015 1 34532 6370.9 memSeparator_0 95 0.00000 lstm.wi 7640 0.00003 lstm.wo 7549 0.00003 lstm.wg 7469 0.00003 lstm.wf 7609 0.00003 lstm.sigmoid_1 222 0.00000 lstm.sigmoid_3 224 0.00000 lstm.tanh_1 244 0.00000 lstm.sigmoid_2 234 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 294 0.00000 lstm.c_add 288 0.00000 lstm.tanh_2 228 0.00000 lstm.multiplication_3 308 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33271 0.00015 1 34702 6339.7 memSeparator_0 92 0.00000 lstm.wi 7640 0.00003 lstm.wo 7509 0.00003 lstm.wg 7589 0.00003 lstm.wf 7609 0.00003 lstm.sigmoid_1 222 0.00000 lstm.sigmoid_3 224 0.00000 lstm.tanh_1 234 0.00000 lstm.sigmoid_2 234 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 324 0.00000 lstm.c_add 288 0.00000 lstm.tanh_2 238 0.00000 lstm.multiplication_3 368 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. 
Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33072 0.00015 1 34506 6375.7 memSeparator_0 114 0.00000 lstm.wi 7590 0.00003 lstm.wo 7538 0.00003 lstm.wg 7459 0.00003 lstm.wf 7609 0.00003 lstm.sigmoid_1 222 0.00000 lstm.sigmoid_3 224 0.00000 lstm.tanh_1 264 0.00000 lstm.sigmoid_2 224 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 294 0.00000 lstm.c_add 308 0.00000 lstm.tanh_2 228 0.00000 lstm.multiplication_3 298 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33197 0.00015 1 34662 6347.0 memSeparator_0 90 0.00000 lstm.wi 7638 0.00003 lstm.wo 7550 0.00003 lstm.wg 7459 0.00003 lstm.wf 7649 0.00003 lstm.sigmoid_1 221 0.00000 lstm.sigmoid_3 224 0.00000 lstm.tanh_1 244 0.00000 lstm.sigmoid_2 234 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 344 0.00000 lstm.c_add 308 0.00000 lstm.tanh_2 228 0.00000 lstm.multiplication_3 308 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. 
Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33136 0.00015 1 34650 6349.2 memSeparator_0 88 0.00000 lstm.wi 7580 0.00003 lstm.wo 7519 0.00003 lstm.wg 7548 0.00003 lstm.wf 7599 0.00003 lstm.sigmoid_1 268 0.00000 lstm.sigmoid_3 218 0.00000 lstm.tanh_1 244 0.00000 lstm.sigmoid_2 224 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 294 0.00000 lstm.c_add 288 0.00000 lstm.tanh_2 258 0.00000 lstm.multiplication_3 308 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33150 0.00015 1 34575 6363.0 memSeparator_0 91 0.00000 lstm.wi 7580 0.00003 lstm.wo 7449 0.00003 lstm.wg 7589 0.00003 lstm.wf 7609 0.00003 lstm.sigmoid_1 222 0.00000 lstm.sigmoid_3 224 0.00000 lstm.tanh_1 234 0.00000 lstm.sigmoid_2 234 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 324 0.00000 lstm.c_add 288 0.00000 lstm.tanh_2 238 0.00000 lstm.multiplication_3 368 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. 
Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33154 0.00015 1 34587 6360.8 memSeparator_0 96 0.00000 lstm.wi 7570 0.00003 lstm.wo 7549 0.00003 lstm.wg 7589 0.00003 lstm.wf 7568 0.00003 lstm.sigmoid_1 222 0.00000 lstm.sigmoid_3 214 0.00000 lstm.tanh_1 234 0.00000 lstm.sigmoid_2 274 0.00000 lstm.multiplication_2 314 0.00000 lstm.multiplication_1 294 0.00000 lstm.c_add 288 0.00000 lstm.tanh_2 228 0.00000 lstm.multiplication_3 308 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33255 0.00015 1 34729 6334.8 memSeparator_0 97 0.00000 lstm.wi 7620 0.00003 lstm.wo 7478 0.00003 lstm.wg 7549 0.00003 lstm.wf 7679 0.00003 lstm.sigmoid_1 222 0.00000 lstm.sigmoid_3 224 0.00000 lstm.tanh_1 234 0.00000 lstm.sigmoid_2 234 0.00000 lstm.multiplication_2 344 0.00000 lstm.multiplication_1 324 0.00000 lstm.c_add 288 0.00000 lstm.tanh_2 238 0.00000 lstm.multiplication_3 288 0.00000 fc 436 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. 
Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33265 0.00015 1 34688 6342.3 memSeparator_0 97 0.00000 lstm.wi 7650 0.00003 lstm.wo 7548 0.00003 lstm.wg 7599 0.00003 lstm.wf 7559 0.00003 lstm.sigmoid_1 222 0.00000 lstm.sigmoid_3 214 0.00000 lstm.tanh_1 234 0.00000 lstm.sigmoid_2 304 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 314 0.00000 lstm.c_add 308 0.00000 lstm.tanh_2 228 0.00000 lstm.multiplication_3 288 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33090 0.00015 1 34525 6372.2 memSeparator_0 81 0.00000 lstm.wi 7590 0.00003 lstm.wo 7549 0.00003 lstm.wg 7619 0.00003 lstm.wf 7519 0.00003 lstm.sigmoid_1 222 0.00000 lstm.sigmoid_3 214 0.00000 lstm.tanh_1 234 0.00000 lstm.sigmoid_2 224 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 314 0.00000 lstm.c_add 308 0.00000 lstm.tanh_2 228 0.00000 lstm.multiplication_3 288 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. 
Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33177 0.00015 1 34624 6354.0 memSeparator_0 91 0.00000 lstm.wi 7580 0.00003 lstm.wo 7508 0.00003 lstm.wg 7539 0.00003 lstm.wf 7638 0.00003 lstm.sigmoid_1 221 0.00000 lstm.sigmoid_3 214 0.00000 lstm.tanh_1 234 0.00000 lstm.sigmoid_2 234 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 374 0.00000 lstm.c_add 308 0.00000 lstm.tanh_2 228 0.00000 lstm.multiplication_3 308 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33297 0.00015 1 34715 6337.3 memSeparator_0 90 0.00000 lstm.wi 7620 0.00003 lstm.wo 7548 0.00003 lstm.wg 7568 0.00003 lstm.wf 7609 0.00003 lstm.sigmoid_1 222 0.00000 lstm.sigmoid_3 224 0.00000 lstm.tanh_1 234 0.00000 lstm.sigmoid_2 246 0.00000 lstm.multiplication_2 322 0.00000 lstm.multiplication_1 324 0.00000 lstm.c_add 288 0.00000 lstm.tanh_2 238 0.00000 lstm.multiplication_3 288 0.00000 fc 476 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. 
Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33201 0.00015 1 34628 6353.2 memSeparator_0 93 0.00000 lstm.wi 7650 0.00003 lstm.wo 7489 0.00003 lstm.wg 7598 0.00003 lstm.wf 7599 0.00003 lstm.sigmoid_1 222 0.00000 lstm.sigmoid_3 214 0.00000 lstm.tanh_1 234 0.00000 lstm.sigmoid_2 234 0.00000 lstm.multiplication_2 314 0.00000 lstm.multiplication_1 294 0.00000 lstm.c_add 288 0.00000 lstm.tanh_2 228 0.00000 lstm.multiplication_3 328 0.00000 fc 416 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33384 0.00015 1 34809 6320.2 memSeparator_0 96 0.00000 lstm.wi 7660 0.00003 lstm.wo 7539 0.00003 lstm.wg 7589 0.00003 lstm.wf 7568 0.00003 lstm.sigmoid_1 222 0.00000 lstm.sigmoid_3 214 0.00000 lstm.tanh_1 234 0.00000 lstm.sigmoid_2 314 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 324 0.00000 lstm.c_add 288 0.00000 lstm.tanh_2 258 0.00000 lstm.multiplication_3 378 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. 
Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33187 0.00015 1 34614 6355.8 memSeparator_0 88 0.00000 lstm.wi 7580 0.00003 lstm.wo 7549 0.00003 lstm.wg 7589 0.00003 lstm.wf 7619 0.00003 lstm.sigmoid_1 222 0.00000 lstm.sigmoid_3 214 0.00000 lstm.tanh_1 264 0.00000 lstm.sigmoid_2 224 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 294 0.00000 lstm.c_add 288 0.00000 lstm.tanh_2 228 0.00000 lstm.multiplication_3 328 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33230 0.00015 1 34714 6337.5 memSeparator_0 91 0.00000 lstm.wi 7580 0.00003 lstm.wo 7589 0.00003 lstm.wg 7509 0.00003 lstm.wf 7599 0.00003 lstm.sigmoid_1 292 0.00000 lstm.sigmoid_3 224 0.00000 lstm.tanh_1 234 0.00000 lstm.sigmoid_2 224 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 294 0.00000 lstm.c_add 358 0.00000 lstm.tanh_2 228 0.00000 lstm.multiplication_3 308 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. 
Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33219 0.00015 1 34647 6349.8 memSeparator_0 90 0.00000 lstm.wi 7510 0.00003 lstm.wo 7549 0.00003 lstm.wg 7599 0.00003 lstm.wf 7549 0.00003 lstm.sigmoid_1 232 0.00000 lstm.sigmoid_3 224 0.00000 lstm.tanh_1 234 0.00000 lstm.sigmoid_2 314 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 324 0.00000 lstm.c_add 288 0.00000 lstm.tanh_2 258 0.00000 lstm.multiplication_3 348 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33262 0.00015 1 34719 6336.6 memSeparator_0 94 0.00000 lstm.wi 7580 0.00003 lstm.wo 7598 0.00003 lstm.wg 7499 0.00003 lstm.wf 7599 0.00003 lstm.sigmoid_1 302 0.00000 lstm.sigmoid_3 224 0.00000 lstm.tanh_1 244 0.00000 lstm.sigmoid_2 224 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 294 0.00000 lstm.c_add 368 0.00000 lstm.tanh_2 228 0.00000 lstm.multiplication_3 308 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. 
Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33158 0.00015 1 34653 6348.7 memSeparator_0 89 0.00000 lstm.wi 7639 0.00003 lstm.wo 7550 0.00003 lstm.wg 7459 0.00003 lstm.wf 7639 0.00003 lstm.sigmoid_1 222 0.00000 lstm.sigmoid_3 224 0.00000 lstm.tanh_1 234 0.00000 lstm.sigmoid_2 224 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 354 0.00000 lstm.c_add 308 0.00000 lstm.tanh_2 228 0.00000 lstm.multiplication_3 288 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. Deep Learning Processor Profiler Performance Results LastFrameLatency(cycles) LastFrameLatency(seconds) FramesNum Total Latency Frames/s ------------- ------------- --------- --------- --------- Network 33105 0.00015 1 34531 6371.1 memSeparator_0 97 0.00000 lstm.wi 7580 0.00003 lstm.wo 7549 0.00003 lstm.wg 7519 0.00003 lstm.wf 7608 0.00003 lstm.sigmoid_1 222 0.00000 lstm.sigmoid_3 224 0.00000 lstm.tanh_1 264 0.00000 lstm.sigmoid_2 224 0.00000 lstm.multiplication_2 294 0.00000 lstm.multiplication_1 294 0.00000 lstm.c_add 288 0.00000 lstm.tanh_2 228 0.00000 lstm.multiplication_3 308 0.00000 fc 406 0.00000 * The clock frequency of the DL processor is: 220MHz ### Finished writing input activations. ### Running a sequence of length 1. 
              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33107                  0.00015                       1              34537           6370.0
    memSeparator_0              91                  0.00000
    lstm.wi                   7520                  0.00003
    lstm.wo                   7578                  0.00003
    lstm.wg                   7548                  0.00003
    lstm.wf                   7568                  0.00003
    lstm.sigmoid_1             222                  0.00000
    lstm.sigmoid_3             224                  0.00000
    lstm.tanh_1                244                  0.00000
    lstm.sigmoid_2             224                  0.00000
    lstm.multiplication_2      294                  0.00000
    lstm.multiplication_1      294                  0.00000
    lstm.c_add                 348                  0.00000
    lstm.tanh_2                228                  0.00000
    lstm.multiplication_3      308                  0.00000
    fc                         416                  0.00000
 * The clock frequency of the DL processor is: 220MHz

The profiler prints a similar table for each remaining time step. The rest of the output is truncated.
Compare the predictions with the target values.
figure
t = tiledlayout(numChannels,1);
title(t,"Open Loop Forecasting with LSTM layer")

for i = 1:numChannels
    nexttile
    plot(T(i,:))
    hold on
    plot(offset:numTimeSteps,[T(i,offset) Y(i,:)],'--')
    ylabel("Channel " + i)
end

xlabel("Time Step")
nexttile(1)
legend(["Input" "Forecasted"])
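To complement the visual comparison, you can also compute a per-channel root-mean-square error (RMSE). The following is a minimal sketch, not part of the original example. It uses small synthetic stand-ins for T and Y so that it runs without hardware, and it assumes that Y(:,t) corresponds to time step offset+t, as the x-axis of the plot suggests.

```matlab
% Synthetic stand-ins so that the sketch runs on its own (assumption).
% In the example, T holds the target sequence and Y holds the predictions
% retrieved from the FPGA.
numChannels = 3;
numTimeSteps = 10;
offset = 4;
T = reshape(1:numChannels*numTimeSteps,numChannels,numTimeSteps);
Y = T(:,offset+1:end) + 0.5;   % pretend forecasts with a constant error of 0.5

% Y(:,t) is compared with the target at time step offset+t.
err = Y - T(:,offset+1:numTimeSteps);
rmsePerChannel = sqrt(mean(err.^2,2));   % one RMSE value per channel

disp(rmsePerChannel)
```

With the real T and Y from the open-loop forecast, only the last three statements are needed.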
Closed-Loop Forecasting
Closed-loop forecasting predicts subsequent time steps in a sequence by using the previous predictions as input. In this case, the model does not require the true values to make the prediction. For example, suppose that you want to predict the values for time steps t through t+k of the sequence by using data collected in time steps 1 through t-1. To make predictions for time step i, use the predicted value for time step i-1 as input. Use closed-loop forecasting to forecast multiple subsequent time steps or when you do not have true values to provide to the network before making the next prediction.
Initialize the network state by resetting it with the resetState function. Then make an initial prediction, Z, and update the network state by using all the time steps of the input data X (191 time steps in this run).
resetState(hW)
offset = size(X,2);
[Z, ~] = predictAndUpdateState(hW,X,Profile='on');
### Resetting network state.
### Finished writing input activations.
### Running a sequence of length 191.

              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33244                  0.00015                     191            6372755           6593.7
    memSeparator_0              88                  0.00000
    lstm.wi                   7648                  0.00003
    lstm.wo                   7549                  0.00003
    lstm.wg                   7589                  0.00003
    lstm.wf                   7568                  0.00003
    lstm.sigmoid_1             222                  0.00000
    lstm.sigmoid_3             224                  0.00000
    lstm.tanh_1                234                  0.00000
    lstm.sigmoid_2             314                  0.00000
    lstm.multiplication_2      294                  0.00000
    lstm.multiplication_1      294                  0.00000
    lstm.c_add                 288                  0.00000
    lstm.tanh_2                238                  0.00000
    lstm.multiplication_3      288                  0.00000
    fc                         406                  0.00000
 * The clock frequency of the DL processor is: 220MHz
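The profiler columns can be cross-checked with simple arithmetic. The sketch below copies the network-level values from the output above and recomputes the latency in seconds and the throughput. The two formulas are inferred from the reported numbers and are an assumption, not a statement of how the toolbox computes these columns.

```matlab
% Values copied from the profiler output for the 191-step sequence.
clockHz     = 220e6;     % DL processor clock frequency
frameCycles = 33244;     % LastFrameLatency(cycles) for the network
framesNum   = 191;       % FramesNum
totalCycles = 6372755;   % Total Latency, assumed to be in clock cycles

% Assumed relationships between the profiler columns.
frameSeconds = frameCycles/clockHz;            % last frame latency in seconds
framesPerSec = framesNum*clockHz/totalCycles;  % throughput over the sequence

fprintf("Last frame latency: %.5f s\n",frameSeconds)
fprintf("Throughput: %.1f frames/s\n",framesPerSec)
```

Under these assumptions, the computed values agree with the reported 0.00015 seconds and 6593.7 frames per second.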
To forecast further time steps, loop over the time steps and update the network state by using the predictAndUpdateState function. Forecast the next 200 time steps by iteratively passing the previously predicted value to the network. Because the network does not require the input data to make any further predictions, you can specify any number of time steps to forecast.
numPredictionTimeSteps = 200;
Xt = Z(:,end);
Y = zeros(numChannels,numPredictionTimeSteps);

for t = 1:numPredictionTimeSteps
    [Y(:,t),~] = predictAndUpdateState(hW,Xt,Profile='on');
    Xt = Y(:,t);
end
### Finished writing input activations.
### Running a sequence of length 1.

              Deep Learning Processor Profiler Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                      33103                  0.00015                       1              34533           6370.7
    memSeparator_0              86                  0.00000
    lstm.wi                   7590                  0.00003
    lstm.wo                   7548                  0.00003
    lstm.wg                   7608                  0.00003
    lstm.wf                   7539                  0.00003
    lstm.sigmoid_1             232                  0.00000
    lstm.sigmoid_3             224                  0.00000
    lstm.tanh_1                244                  0.00000
    lstm.sigmoid_2             224                  0.00000
    lstm.multiplication_2      294                  0.00000
    lstm.multiplication_1      294                  0.00000
    lstm.c_add                 288                  0.00000
    lstm.tanh_2                238                  0.00000
    lstm.multiplication_3      288                  0.00000
    fc                         406                  0.00000
 * The clock frequency of the DL processor is: 220MHz

The profiler prints a similar table for each remaining time step. The rest of the output is truncated.
Visualize the forecasted values in a plot.
numTimeSteps = offset + numPredictionTimeSteps;

figure
t = tiledlayout(numChannels,1);
title(t,"Closed Loop Forecasting with LSTM layer")

for i = 1:numChannels
    nexttile
    plot(T(i,1:offset))
    hold on
    plot(offset:numTimeSteps,[T(i,offset) Y(i,:)],'--')
    ylabel("Channel " + i)
end

xlabel("Time Step")
nexttile(1)
legend(["Input" "Forecasted"])
Closed-loop forecasting allows you to forecast an arbitrary number of time steps, but can be less accurate when compared to open-loop forecasting because the network does not have access to the true values during the forecasting process.
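The accuracy difference can be illustrated with a toy model that runs without any hardware. In this sketch, trueStep and modelStep are hypothetical stand-ins for a real process and an imperfect one-step predictor. They are not the deployed LSTM network, and the numbers only illustrate how closed-loop errors can accumulate.

```matlab
% Hypothetical stand-ins (assumptions for illustration only).
trueStep  = @(x) 0.9*x;    % "true" process: next value from current value
modelStep = @(x) 0.95*x;   % imperfect one-step-ahead model

numSteps = 10;
truth = zeros(1,numSteps+1);
truth(1) = 1;
for k = 1:numSteps
    truth(k+1) = trueStep(truth(k));
end

% Open loop: each prediction starts from the true previous value.
openLoop = modelStep(truth(1:end-1));

% Closed loop: each prediction starts from the previous prediction.
closedLoop = zeros(1,numSteps);
xPrev = truth(1);
for k = 1:numSteps
    closedLoop(k) = modelStep(xPrev);
    xPrev = closedLoop(k);
end

% Absolute error of each strategy at every forecasted time step.
openErr   = abs(openLoop   - truth(2:end));
closedErr = abs(closedLoop - truth(2:end));
disp([openErr; closedErr])
```

In this toy setting, both strategies make the same error on the first step. On every later step the closed-loop error is larger, because each prediction inherits the error of the one before it.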
See Also
dlhdl.Workflow | dlhdl.Target | compile | deploy | predict | predictAndUpdateState | resetState