# Lane Detection Optimized with GPU Coder

This example shows how to develop a deep learning lane detection application that runs on NVIDIA® GPUs.

The pretrained lane detection network can detect and output lane marker boundaries from an image and is based on the `AlexNet` network. The last few layers of the `AlexNet` network are replaced by a smaller fully connected layer and regression output layer. The example generates a CUDA executable that runs on a CUDA-enabled GPU on the host machine.

### Prerequisites

• CUDA-enabled NVIDIA GPU.

• NVIDIA CUDA toolkit and driver.

• NVIDIA cuDNN library.

• Environment variables for the compilers and libraries. For information on the supported versions of the compilers and libraries, see Third-Party Hardware. For setting up the environment variables, see Setting Up the Prerequisite Products.

### Verify GPU Environment

Use the `coder.checkGpuInstall` function to verify that the compilers and libraries necessary for running this example are set up correctly.

```
envCfg = coder.gpuEnvConfig('host');
envCfg.DeepLibTarget = 'cudnn';
envCfg.DeepCodegen = 1;
envCfg.Quiet = 1;
coder.checkGpuInstall(envCfg);
```

### Get Pretrained Lane Detection Network

This example uses the `trainedLaneNet` MAT-file containing the pretrained lane detection network. This file is approximately 143 MB in size. Download the file from the MathWorks website.

```
laneNetFile = matlab.internal.examples.downloadSupportFile('gpucoder/cnn_models/lane_detection', ...
    'trainedLaneNet.mat');
```

This network takes an image as an input and outputs two lane boundaries that correspond to the left and right lanes of the ego vehicle. Each lane boundary is represented by the parabolic equation: $y=a{x}^{2}+bx+c$, where y is the lateral offset and x is the longitudinal distance from the vehicle. The network outputs the three parameters a, b, and c per lane. The network architecture is similar to `AlexNet` except that the last few layers are replaced by a smaller fully connected layer and regression output layer.
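As an illustration of the parabolic lane model described above, the following minimal sketch evaluates $y=a{x}^{2}+bx+c$ for one lane boundary. The coefficient values are hypothetical and chosen only for readability; they are not produced by the network. The range `3:30` mirrors the longitudinal distances that the `laneNetPredict` function in this example uses.

```matlab
% Hypothetical coefficients [a b c] for one lane boundary (illustrative
% values only, not network output).
coeffs = [-0.001 0.02 1.8];

% Longitudinal distances ahead of the vehicle, as in laneNetPredict.
x = 3:30;

% Evaluate y = a*x.^2 + b*x + c at every distance.
y = polyval(coeffs, x);

% The same values written out term by term, for comparison.
yExplicit = coeffs(1)*x.^2 + coeffs(2)*x + coeffs(3);
```

Here `y` holds one lateral offset per longitudinal distance, so each `(x(k), y(k))` pair is a point on the lane boundary in vehicle coordinates.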

```
load(laneNetFile);
disp(laneNet)
```
```
  SeriesNetwork with properties:

         Layers: [23×1 nnet.cnn.layer.Layer]
     InputNames: {'data'}
    OutputNames: {'output'}
```

To view the network architecture, use the `analyzeNetwork` function.

```analyzeNetwork(laneNet) ```

To test the model, the example uses a video file from the Caltech lanes dataset. The file is approximately 8 MB in size. Download the file from the MathWorks website.

`videoFile = matlab.internal.examples.downloadSupportFile('gpucoder/media','caltech_cordova1.avi');`

### Main Entry-Point Function

The `detectLanesInVideo.m` file is the main entry-point function for code generation. The `detectLanesInVideo` function uses the `vision.VideoFileReader` (Computer Vision Toolbox) System object to read frames from the input video, calls the `predict` method of the LaneNet network object, and draws the detected lanes on the input video. A `vision.DeployableVideoPlayer` (Computer Vision Toolbox) System object displays the video output with the detected lanes overlaid.

`type detectLanesInVideo.m`
```
function detectLanesInVideo(videoFile,net,laneCoeffMeans,laneCoeffsStds)
% detectLanesInVideo Entry-point function for the Lane Detection Optimized
% with GPU Coder example
%
% detectLanesInVideo(videoFile,net,laneCoeffMeans,laneCoeffsStds) uses the
% VideoFileReader system object to read frames from the input video, calls
% the predict method of the LaneNet network object, and draws the detected
% lanes on the input video. A DeployableVideoPlayer system object is used
% to display the lane detected video output.

% Copyright 2022 The MathWorks, Inc.

%#codegen

%% Create Video Reader and Video Player Object
videoFReader   = vision.VideoFileReader(videoFile);
depVideoPlayer = vision.DeployableVideoPlayer(Name='Lane Detection on GPU');

%% Video Frame Processing Loop
while ~isDone(videoFReader)
    videoFrame = videoFReader();
    scaledFrame = 255.*(imresize(videoFrame,[227 227]));

    [laneFound,ltPts,rtPts] = laneNetPredict(net,scaledFrame, ...
        laneCoeffMeans,laneCoeffsStds);
    if(laneFound)
        pts = [reshape(ltPts',1,[]);reshape(rtPts',1,[])];
        videoFrame = insertShape(videoFrame, 'Line', pts, 'LineWidth', 4);
    end
    depVideoPlayer(videoFrame);
end
end
```
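The `pts` expression in the processing loop packs the two N-by-2 point lists into the polyline layout that `insertShape` accepts for `'Line'` shapes: one row per polyline, with coordinates interleaved as `[x1 y1 x2 y2 ...]`. The following small sketch shows the reshaping on made-up points; the numeric values are hypothetical and serve only to make the ordering visible.

```matlab
% Hypothetical image points, one [x y] pair per row (illustrative only).
ltPts = [1 10; 2 20; 3 30];
rtPts = [4 40; 5 50; 6 60];

% Transposing first makes the column-major reshape interleave x and y,
% so each lane becomes one row: [x1 y1 x2 y2 x3 y3].
pts = [reshape(ltPts',1,[]); reshape(rtPts',1,[])];

disp(pts)
```

Without the transpose, `reshape` would list all x values followed by all y values, which `insertShape` would interpret as a different set of vertices.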

### LaneNet Predict Function

The `laneNetPredict` function computes the right and left lane positions in a single video frame. The `laneNet` network computes parameters a, b, and c that describe the parabolic equation for the left and right lane boundaries. From these parameters, compute the x and y coordinates corresponding to the lane positions. The coordinates must be mapped to image coordinates.

`type laneNetPredict.m`
```
function [laneFound,ltPts,rtPts] = laneNetPredict(net,frame,means,stds)
% laneNetPredict Predict lane markers on the input image frame using the
% lane detection network
%
% Copyright 2017-2022 The MathWorks, Inc.

%#codegen

% A persistent object lanenet is used to load the network object. At the
% first call to this function, the persistent object is constructed and
% setup. When the function is called subsequent times, the same object is
% reused to call predict on inputs, thus avoiding reconstructing and
% reloading the network object.
persistent lanenet;

if isempty(lanenet)
    lanenet = coder.loadDeepLearningNetwork(net, 'lanenet');
end

lanecoeffsNetworkOutput = predict(lanenet,frame);

% Recover original coeffs by reversing the normalization steps.
params = lanecoeffsNetworkOutput .* stds + means;

% 'c' should be more than 0.5 for it to be a lane.
isRightLaneFound = abs(params(6)) > 0.5;
isLeftLaneFound = abs(params(3)) > 0.5;

% From the networks output, compute left and right lane points in the image
% coordinates.
vehicleXPoints = 3:30;

ltPts = coder.nullcopy(zeros(28,2,'single'));
rtPts = coder.nullcopy(zeros(28,2,'single'));

if isRightLaneFound && isLeftLaneFound
    rtBoundary = params(4:6);
    rt_y = computeBoundaryModel(rtBoundary, vehicleXPoints);
    ltBoundary = params(1:3);
    lt_y = computeBoundaryModel(ltBoundary, vehicleXPoints);

    % Visualize lane boundaries of the ego vehicle.
    tform = get_tformToImage;
    % Map vehicle to image coordinates.
    ltPts = tform.transformPointsInverse([vehicleXPoints', lt_y']);
    rtPts = tform.transformPointsInverse([vehicleXPoints', rt_y']);
    laneFound = true;
else
    laneFound = false;
end

end

%% Helper Functions

% Compute boundary model.
function yWorld = computeBoundaryModel(model, xWorld)
yWorld = polyval(model, xWorld);
end

% Compute extrinsics.
function tform = get_tformToImage
% The camera coordinates are described by the caltech mono
% camera model.
yaw = 0;
pitch = 14; % Pitch of the camera in degrees
roll = 0;

translation = translationVector(yaw, pitch, roll);
rotation = rotationMatrix(yaw, pitch, roll);

% Construct a camera matrix.
focalLength = [309.4362, 344.2161];
principalPoint = [318.9034, 257.5352];
Skew = 0;

camMatrix = [rotation; translation] * intrinsicMatrix(focalLength, ...
    Skew, principalPoint);

% Turn camMatrix into 2-D homography.
tform2D = [camMatrix(1,:); camMatrix(2,:); camMatrix(4,:)]; % drop Z

tform = projective2d(tform2D);
tform = tform.invert();
end

% Translate to image co-ordinates.
function translation = translationVector(yaw, pitch, roll)
SensorLocation = [0 0];
Height = 2.1798; % mounting height in meters from the ground
rotationMatrix = (...
    rotZ(yaw)*... % last rotation
    rotX(90-pitch)*...
    rotZ(roll)... % first rotation
    );

% Adjust for the SensorLocation by adding a translation.
sl = SensorLocation;

translationInWorldUnits = [sl(2), sl(1), Height];
translation = translationInWorldUnits*rotationMatrix;
end

% Rotation around X-axis.
function R = rotX(a)
a = deg2rad(a);
R = [...
    1   0        0;
    0   cos(a)  -sin(a);
    0   sin(a)   cos(a)];
end

% Rotation around Y-axis.
function R = rotY(a)
a = deg2rad(a);
R = [...
    cos(a)  0 sin(a);
    0       1 0;
    -sin(a) 0 cos(a)];
end

% Rotation around Z-axis.
function R = rotZ(a)
a = deg2rad(a);
R = [...
    cos(a) -sin(a) 0;
    sin(a)  cos(a) 0;
    0       0      1];
end

% Given the Yaw, Pitch, and Roll, determine the appropriate Euler angles
% and the sequence in which they are applied to align the camera's
% coordinate system with the vehicle coordinate system. The resulting
% matrix is a Rotation matrix that together with the Translation vector
% defines the extrinsic parameters of the camera.
function rotation = rotationMatrix(yaw, pitch, roll)
rotation = (...
    rotY(180)*...      % last rotation: point Z up
    rotZ(-90)*...      % X-Y swap
    rotZ(yaw)*...      % point the camera forward
    rotX(90-pitch)*... % "un-pitch"
    rotZ(roll)...      % 1st rotation: "un-roll"
    );
end

% Intrinsic matrix computation.
function intrinsicMat = intrinsicMatrix(FocalLength, Skew, PrincipalPoint)
intrinsicMat = ...
    [FocalLength(1)   , 0                , 0; ...
     Skew             , FocalLength(2)   , 0; ...
     PrincipalPoint(1), PrincipalPoint(2), 1];
end
```
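The normalization reversal and the lane-presence check in `laneNetPredict` can be tried in isolation. The sketch below uses hypothetical values for the network output, means, and standard deviations; none of them come from `trainedLaneNet.mat`. It assumes the six-element layout used in the function, with elements 1 to 3 holding `[a b c]` for the left lane and elements 4 to 6 holding `[a b c]` for the right lane.

```matlab
% Hypothetical normalized network output and normalization statistics
% (illustrative values only, not taken from the MAT-file).
netOut = single([0.5 -1.0 2.0 0.0 1.0 -2.0]);
means  = single([0   0    1.5 0   0   -1.5]);
stds   = single([0.01 0.1 0.5 0.01 0.1 0.5]);

% Reverse the normalization: scale by the standard deviation, then
% shift by the mean, element by element.
params = netOut .* stds + means;

% A lane is treated as present when the magnitude of its 'c'
% coefficient (the lateral offset at x = 0) exceeds 0.5.
isLeftLaneFound  = abs(params(3)) > 0.5;
isRightLaneFound = abs(params(6)) > 0.5;
laneFound = isLeftLaneFound && isRightLaneFound;
```

With these values, the recovered `c` coefficients are 2.5 for the left lane and -2.5 for the right lane, so both checks pass.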

### Generate CUDA Executable

To generate a standalone CUDA executable for the `detectLanesInVideo` entry-point function, create a GPU code configuration object for the `'exe'` target and set the target language to C++. Use the `coder.DeepLearningConfig` function to create a cuDNN deep learning configuration object and assign it to the `DeepLearningConfig` property of the GPU code configuration object.

```
cfg = coder.gpuConfig('exe');
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
cfg.GenerateReport = true;
cfg.GenerateExampleMain = "GenerateCodeAndCompile";
cfg.TargetLang = 'C++';
inputs = {coder.Constant(videoFile),coder.Constant(laneNetFile), ...
    coder.Constant(laneCoeffMeans),coder.Constant(laneCoeffsStds)};
```

Run the `codegen` command.

`codegen -args inputs -config cfg detectLanesInVideo`
```Code generation successful: View report ```

### Generated Code Description

The series network is generated as a C++ class containing an array of 18 layer classes (after layer fusion optimization). The `setup()` method of the class sets up handles and allocates memory for each layer object. The `predict()` method invokes prediction for each of the 18 layers in the network.

```
class lanenet0_0 {
public:
  lanenet0_0();
  void setSize();
  void resetState();
  void setup();
  void predict();
  void cleanup();
  float *getLayerOutput(int layerIndex, int portIndex);
  int getLayerOutputSize(int layerIndex, int portIndex);
  float *getInputDataPointer(int b_index);
  float *getInputDataPointer();
  float *getOutputDataPointer(int b_index);
  float *getOutputDataPointer();
  int getBatchSize();
  ~lanenet0_0();

private:
  void allocate();
  void postsetup();
  void deallocate();

public:
  boolean_T isInitialized;
  boolean_T matlabCodegenIsDeleted;

private:
  int numLayers;
  MWTensorBase *inputTensors[1];
  MWTensorBase *outputTensors[1];
  MWCNNLayer *layers[18];
  MWCudnnTarget::MWTargetNetworkImpl *targetImpl;
};
```

The `cnn_lanenet*_conv*_w` and `cnn_lanenet*_conv*_b` files are the binary weights and bias files for the convolution layers in the network. The `cnn_lanenet*_fc*_w` and `cnn_lanenet*_fc*_b` files are the binary weights and bias files for the fully connected layers in the network.

```
codegendir = fullfile('codegen', 'exe', 'detectLanesInVideo');
dir([codegendir,filesep,'*.bin'])
```
```
cnn_lanenet0_0_conv1_b.bin  cnn_lanenet0_0_conv3_b.bin  cnn_lanenet0_0_conv5_b.bin      cnn_lanenet0_0_fc6_b.bin      cnn_lanenet0_0_fcLane2_b.bin
cnn_lanenet0_0_conv1_w.bin  cnn_lanenet0_0_conv3_w.bin  cnn_lanenet0_0_conv5_w.bin      cnn_lanenet0_0_fc6_w.bin      cnn_lanenet0_0_fcLane2_w.bin
cnn_lanenet0_0_conv2_b.bin  cnn_lanenet0_0_conv4_b.bin  cnn_lanenet0_0_data_offset.bin  cnn_lanenet0_0_fcLane1_b.bin  networkParamsInfo_lanenet0_0.bin
cnn_lanenet0_0_conv2_w.bin  cnn_lanenet0_0_conv4_w.bin  cnn_lanenet0_0_data_scale.bin   cnn_lanenet0_0_fcLane1_w.bin
```

### Run the Executable

To run the executable, uncomment the following lines of code.

```
if ispc
    [status,cmdout] = system("detectLanesInVideo.exe");
else
    [status,cmdout] = system("./detectLanesInVideo");
end
```
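The `system` call returns an exit status and the captured command output, so it can be useful to check both after the run. The following sketch shows one possible check. It uses a harmless `echo` command as a stand-in for the generated executable, so that it runs on any machine; the error message text is an illustrative choice, not part of the example.

```matlab
% Stand-in command for illustration; substitute the call to the
% generated executable from the block above.
[status,cmdout] = system('echo lane-detection-demo');

% A nonzero status indicates that the command failed.
if status ~= 0
    error('Command returned exit status %d:\n%s', status, cmdout);
end

% Show whatever the command wrote to the console.
disp(strtrim(cmdout))
```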