Deep Learning Processing of Live Video
The support package includes two reference designs for deep learning (DL) applications. When you use a deep learning reference design, you must specify the name and file location of a deep learning processor core generated by using the Deep Learning HDL Toolbox™ tools.
This page describes the RGB with DL Processor reference design.
This reference design feeds live HDMI video to custom preprocessing logic, a DL processor, and custom postprocessing code, and returns the modified HDMI video output from the board. The preprocessing logic and the DL processor are on the FPGA. These two parts of the design communicate control information over an AXI manager interface, and share video data using a second AXI manager interface to DDR memory. The postprocessing logic is on the ARM® processor and reads video data from the same memory. The YOLO v2 Vehicle Detector with Live Camera Input on Zynq-Based Hardware example shows how to use this reference design, how to model the AXI interfaces and the handshaking logic between the preprocessing logic and the DL processor, and how to model the postprocessing operations.
For a reference design that connects a deep learning processor with custom preprocessing logic and can be controlled from a MATLAB host machine, see Target Deep Learning Processor and Image Preprocessing to FPGA.
This diagram shows the interfaces in the RGB with DL Processor reference design.
The FPGA user logic for this reference design must contain two simplified AXI Manager protocol interfaces. One interface interacts with the DL IP core and the other transfers data between the FPGA user logic and DDR. The AXI Manager interfaces are the same as those in the Deep Learning with Preprocessing Interface reference design.
- AXI-Lite: The ARM and FPGA parts of the design communicate with each other by using AXI-Lite registers.
- AXI4 Manager of DDR: The FPGA user logic writes output data to the PL DDR memory using this interface. The deep learning IP then reads the data for processing.
- AXI4 Manager of deep learning IP: The FPGA user logic and the deep learning IP communicate control information over this interface. The FPGA user logic must contain logic for the handshaking protocol of the deep learning IP. The YOLO v2 Vehicle Detector with Live Camera Input on Zynq-Based Hardware example includes a subsystem that shows how to model this handshake protocol.
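The division of labor between the two AXI Manager interfaces (frame data through DDR, control information through a separate path) can be illustrated with a small behavioral sketch in Python. This is not the handshaking protocol of the deep learning IP: the names `SharedState`, `input_valid`, `input_addr`, `preprocess_write`, and `dl_step` are hypothetical, and the "inference" is a placeholder. It only shows the general pattern of writing data to shared memory first and then signaling over a control channel.

```python
from dataclasses import dataclass, field


@dataclass
class SharedState:
    """Hypothetical stand-in for DDR memory plus a few control flags."""
    ddr: dict = field(default_factory=dict)  # address -> frame data (data path)
    input_valid: bool = False                # control: a new frame is waiting
    input_addr: int = 0                      # control: where that frame lives
    done: bool = False                       # control: a result is available
    result: object = None


def preprocess_write(state, frame, addr):
    """Preprocessing side: write a frame to memory, then flag it as valid."""
    if state.input_valid:
        return False  # previous frame not consumed yet, so hold off
    state.ddr[addr] = list(frame)  # data goes through the memory interface
    state.input_addr = addr        # control information goes separately
    state.input_valid = True
    return True


def dl_step(state):
    """DL side: consume a flagged frame, then acknowledge by clearing the flag."""
    if not state.input_valid:
        return False
    frame = state.ddr[state.input_addr]
    state.result = sum(frame)      # placeholder for real inference
    state.input_valid = False      # acknowledge so the producer can continue
    state.done = True
    return True
```

In this sketch the producer refuses to overwrite a pending frame until the consumer clears `input_valid`, which is the kind of back-pressure a handshake provides. For the actual signals and timing, refer to the handshake subsystem in the YOLO v2 example.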
In this reference design, the FPGA converts the HDMI input to an RGB pixelcontrol video stream, and converts the ARM output data back to HDMI format for output.
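As a rough illustration of what a pixel stream with control signals looks like, the following Python sketch serializes a small frame into one pixel per step, each paired with the five pixelcontrol flags (`hStart`, `hEnd`, `vStart`, `vEnd`, `valid`). The function name `frame_to_pixel_stream` is hypothetical, and the sketch emits only active pixels. It leaves out the blanking intervals that a real HDMI source includes.

```python
def frame_to_pixel_stream(frame):
    """Yield (pixel, ctrl) pairs for the active pixels of a frame.

    frame: list of rows, each row a list of (R, G, B) tuples.
    ctrl:  dict holding the five pixelcontrol flags for that pixel.
    """
    last_row = len(frame) - 1
    for r, row in enumerate(frame):
        last_col = len(row) - 1
        for c, pixel in enumerate(row):
            ctrl = {
                "hStart": c == 0,                          # first pixel of a line
                "hEnd": c == last_col,                     # last pixel of a line
                "vStart": r == 0 and c == 0,               # first pixel of the frame
                "vEnd": r == last_row and c == last_col,   # last pixel of the frame
                "valid": True,                             # blanking is omitted here
            }
            yield pixel, ctrl
```

For a frame with two rows of three pixels, the stream has six entries: `hStart` is set on two of them, `vStart` only on the first, and `vEnd` only on the last.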
You can use a Video Capture HDMI block to capture the output video into Simulink®. The captured video is the result of the postprocessing operation on the ARM processor.
The postprocessing operations in the YOLO v2 Vehicle Detector with Live Camera Input on Zynq-Based Hardware example use annotation blocks that are designed for deploying to the ARM processor. When deployed, these blocks read video frames from external memory, modify the pixel values, and write the modified video frames back to memory. For example, see the Draw Rectangle and Set ROI block reference pages.
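The read, modify, and write-back pattern described above can be sketched in plain Python. This is not the implementation of the Draw Rectangle block. The function `draw_rectangle`, the packed 3-bytes-per-pixel RGB layout, and the use of a `bytearray` to stand in for the frame buffer in external memory are all assumptions made for illustration.

```python
def draw_rectangle(buf, width, height, x, y, w, h, color=(255, 255, 0)):
    """Draw a 1-pixel rectangle outline in place in a packed RGB buffer.

    buf is a bytearray of width * height * 3 bytes (assumed layout: row-major,
    R, G, B per pixel). Pixels that fall outside the frame are skipped.
    """
    def set_pixel(px, py):
        if 0 <= px < width and 0 <= py < height:
            i = (py * width + px) * 3
            buf[i:i + 3] = bytes(color)  # overwrite the pixel in the buffer

    for px in range(x, x + w):           # top and bottom edges
        set_pixel(px, y)
        set_pixel(px, y + h - 1)
    for py in range(y, y + h):           # left and right edges
        set_pixel(x, py)
        set_pixel(x + w - 1, py)


# Usage: annotate an 8-by-6 frame with a red 4-by-3 box at (2, 1).
frame_buffer = bytearray(8 * 6 * 3)
draw_rectangle(frame_buffer, 8, 6, 2, 1, 4, 3, color=(255, 0, 0))
```

Because the buffer is modified in place, only the outline pixels change. The rest of the frame is left untouched.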
- Deploy and Verify YOLO v2 Vehicle Detector on FPGA
- YOLO v2 Vehicle Detector with Live Camera Input on Zynq-Based Hardware