Semantic Segmentation on NVIDIA DRIVE

This example shows how to generate and deploy a CUDA® executable for an image segmentation application that uses deep learning. It uses the MATLAB® Coder™ Support Package for NVIDIA® Jetson™ and NVIDIA DRIVE™ Platforms to deploy the executable on the NVIDIA DRIVE platform. Code generation runs on the host computer, and the generated code is built on the target platform by using the remote build capability of the support package. For more information, see Code Generation for Semantic Segmentation Network (GPU Coder).

Prerequisites

Target Board Requirements

NVIDIA DRIVE PX2 embedded platform.
Ethernet crossover cable to connect the target board and host PC (if you cannot connect the target board to a local network).
NVIDIA CUDA toolkit installed on the board.
NVIDIA cuDNN library (v5 or later) on the target.
OpenCV library on the target for reading and displaying images.
Environment variables on the target for the compilers and libraries. For more information, see Install and Setup Prerequisites for NVIDIA Boards.

Development Host Requirements

For CUDA code generation, NVIDIA CUDA toolkit on the host and environment variables for the compilers and libraries. For more information, see Third-Party Hardware (GPU Coder) and Setting Up the Prerequisite Products (GPU Coder).

Connect to NVIDIA DRIVE

The support package uses an SSH connection over TCP/IP to execute commands while building and running the generated CUDA code on the DRIVE platforms. Connect the target platform to the same network as the host computer or use an Ethernet crossover cable to connect the board directly to the host computer. For information on how to set up and configure your board, see NVIDIA documentation.

Create Drive Object

To communicate with the NVIDIA hardware, create a live hardware connection object by using the drive function:

hwobj = drive('drive-board-name','ubuntu','ubuntu');

When connecting to the target board for the first time, you must provide the host name or IP address, user name, and password of the target board. On subsequent connections, you do not need to supply the address, user name, and password. The hardware object reuses these settings from the most recent successful connection to an NVIDIA board.

While creating the live hardware object, the support package performs hardware and software checks, installs the MATLAB IO server on the target board, and gathers information on peripheral devices connected to the target. This information is displayed in the Command Window. If the connection fails, a diagnostic error message is reported at the MATLAB command line. The most likely cause of a failed connection is an incorrect IP address or host name.
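The two forms of the call described above might look as follows. This is a minimal sketch: the IP address is a placeholder, the credentials repeat the ones used earlier, and the try/catch wrapper is only an illustration of where a connection error would surface.

```matlab
% First connection: supply the address (placeholder value), user name,
% and password explicitly.
try
    hwobj = drive('192.168.1.15','ubuntu','ubuntu');
catch err
    % A failed connection is reported as a MATLAB error. An incorrect
    % address or host name is the most likely cause.
    disp(err.message);
end

% In a later session, after one successful connection, the settings can
% be reused by calling drive with no arguments:
% hwobj = drive;
```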

Verify GPU Environment on Target Board

To verify that the compilers and libraries necessary for running this example are set up correctly, use the coder.checkGpuInstall (GPU Coder) function.

envCfg = coder.gpuEnvConfig('drive');
envCfg.BasicCodegen = 1;
envCfg.Quiet = 1;
envCfg.HardwareObject = hwobj;
coder.checkGpuInstall(envCfg);
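Because this example relies on cuDNN on the target, it may be worth extending the check beyond basic code generation. The sketch below assumes that the DeepLibTarget and DeepCodegen properties of the coder.gpuEnvConfig object are available in your release, and that coder.checkGpuInstall returns a results structure you can inspect.

```matlab
envCfg = coder.gpuEnvConfig('drive');
envCfg.BasicCodegen = 1;
envCfg.DeepLibTarget = 'cudnn';   % deep learning library assumed for this example
envCfg.DeepCodegen = 1;           % also check deep learning code generation
envCfg.Quiet = 1;
envCfg.HardwareObject = hwobj;

% Capture the results instead of relying only on printed messages.
results = coder.checkGpuInstall(envCfg);
disp(results);
```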

Get Pretrained SegNet DAG Network Object

net = getSegNet();

Downloading pre-trained SegNet (107 MB)...

The DAG network contains 91 layers including convolution, batch normalization, pooling, unpooling, and the pixel classification output layers. To see all the layers of the network, use the analyzeNetwork function.
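A short sketch of inspecting the network, assuming net is the object returned by getSegNet above:

```matlab
analyzeNetwork(net);        % interactive view of the layers and connections
disp(numel(net.Layers));    % total number of layers in the network
disp(net.Layers(end));      % last layer, expected to be the pixel classification output
```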

Generate CUDA Code for the Target Board Using GPU Coder

This example uses the segnet_predict.m file as the entry-point function for code generation. To generate a CUDA executable that you can deploy to an NVIDIA target, create a GPU code configuration object for generating an executable.

cfg = coder.gpuConfig('exe');

When there are multiple live connection objects for different targets, the code generator performs a remote build on the target board for which the most recent live object was created. To choose a hardware board for performing a remote build, use the setupCodegenContext() method of the respective live hardware object. If only one live connection object was created, you do not need to call this method.

hwobj.setupCodegenContext;

To create a configuration object for the DRIVE platform and assign it to the Hardware property of the code configuration object cfg, use the coder.hardware function.

cfg.Hardware = coder.hardware('NVIDIA Drive');

To specify the folder for the remote build process on the target board, use the BuildDir property. If the specified build folder does not exist on the target board, the software creates a folder with the given name. If no value is assigned to cfg.Hardware.BuildDir, the remote build process occurs in the last specified build folder. If there is no stored build folder value, the build process takes place in the home folder.

cfg.Hardware.BuildDir = '~/remoteBuildDir';

On NVIDIA platforms such as DRIVE PX2 that contain multiple GPUs, use the SelectCudaDevice property in the GPU configuration object to select a specific GPU.

cfg.GpuConfig.SelectCudaDevice = 0;

The custom main.cu file is a wrapper that calls the predict function in the generated code. The main file also adds postprocessing steps by using OpenCV interfaces. The output of the SegNet prediction is an 11-channel image in which each channel holds the prediction scores for one of 11 classes. In postprocessing, each pixel is assigned the class label that has the maximum score across the 11 channels. Each class is associated with a unique color for visualization. The final output is shown by using the OpenCV imshow function.

cfg.CustomSource  = fullfile('main.cu');
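The per-pixel labeling described above can be prototyped in MATLAB before it is written against OpenCV in the main file. The following is a minimal sketch, not the code in main.cu: the random scores array stands in for the network output, and the hsv color palette is an arbitrary choice made for illustration.

```matlab
% Hypothetical stand-in for the SegNet output: H-by-W-by-11 class scores.
numClasses = 11;
scores = rand(360, 480, numClasses, 'single');

% For each pixel, pick the channel (class) with the highest score.
[~, labels] = max(scores, [], 3);      % 360-by-480, values in 1..11

% One distinct color per class (assumed palette, 11-by-3 RGB).
cmap = uint8(255 * hsv(numClasses));

% Look up the color of every pixel and reshape back into an RGB image.
rgb = reshape(cmap(labels(:), :), [size(labels) 3]);

image(rgb); axis image off;            % display the color-coded label map
```

The third argument of max selects the dimension to reduce, so the argmax runs across channels rather than across rows or columns.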

In this example, code generation uses an image as the input to the network. However, the custom main file is coded to take video as input and perform a SegNet prediction for each frame in the video sequence. The compiler and linker flags required to build the executable with the OpenCV library are set in the buildinfo section of the segnet_predict.m file.
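For orientation, an entry-point function of this kind might look roughly like the sketch below. This is an assumption about its general shape, not a copy of the shipped segnet_predict.m: the MAT-file name SegNet.mat and the pkg-config flag string are hypothetical and may differ on your system.

```matlab
function out = segnet_predict(in) %#codegen
% Sketch of an entry-point function with a build-information section.

% Load the network once and reuse it across calls.
persistent mynet;
if isempty(mynet)
    % 'SegNet.mat' is an assumed file name for the saved network.
    mynet = coder.loadDeepLearningNetwork('SegNet.mat');
end

% Build-information section: pass OpenCV flags to the compiler and linker
% on the target (assumed flag string, resolved by pkg-config at build time).
opencvFlags = '`pkg-config --cflags --libs opencv`';
coder.updateBuildInfo('addCompileFlags', opencvFlags);
coder.updateBuildInfo('addLinkFlags', opencvFlags);

% Run the prediction on the input image.
out = predict(mynet, in);
end
```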

Generate sample image input for code generation.

img = imread('peppers.png');
img = imresize(img,[360 480]);

To generate CUDA code, use the codegen function and pass the GPU code configuration and the size of the inputs for the segnet_predict.m entry-point function. After code generation takes place on the host, the generated files are copied over and built on the target board.

codegen('-config', cfg, 'segnet_predict', '-args', {img}, '-report');

Run Executable on Target Board

Copy the input test video to the target workspace folder, using the workspaceDir property of the hardware object. This property contains the path to the codegen folder on the target board.

hwobj.putFile('CamVid.avi', hwobj.workspaceDir);
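To confirm that the copy succeeded before launching the executable, you can list the workspace folder on the target. The sketch below assumes that the hardware object exposes a system method that runs a shell command on the board and returns its output as text.

```matlab
% List the target workspace folder and check that CamVid.avi is present.
listing = system(hwobj, ['ls -l ' hwobj.workspaceDir]);
disp(listing);
```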

To launch the executable on the target hardware, use the runApplication() method of the hardware object.

hwobj.runApplication('segnet_predict','CamVid.avi');

The segmented image output is displayed in a window on the monitor that is connected to the target board.

You can stop the running executable on the target board from the MATLAB environment on the host by using the killApplication() method of the hardware object. This method uses the name of the application and not the name of the executable.

hwobj.killApplication('segnet_predict');