GPU Coder

Generate CUDA code for NVIDIA GPUs

GPU Coder generates optimized CUDA® code from MATLAB code and Simulink models. The generated code includes CUDA kernels for the parallelizable parts of your deep learning, embedded vision, and radar and signal processing algorithms. You can profile the generated CUDA code to identify bottlenecks and opportunities for performance optimization. For higher performance, the generated code can call NVIDIA® TensorRT®. You can integrate the generated CUDA code into your project as source code or as static or dynamic libraries, and compile it for modern NVIDIA GPUs, including those embedded on NVIDIA Jetson™, NVIDIA DRIVE™, and NVIDIA Clara™ platforms. You can also access peripherals on the Jetson and DRIVE platforms and incorporate handwritten CUDA code into the generated code.

With Embedded Coder®, GPU Coder provides bidirectional links that let you trace between MATLAB code and the generated CUDA code. You can verify the numerical behavior of the generated code via software-in-the-loop (SIL) and processor-in-the-loop (PIL) testing.

Code generation report showing generated CUDA code.

Generate CUDA Code from MATLAB

Compile and run CUDA code generated from your MATLAB algorithms on popular NVIDIA GPUs, from desktop RTX cards to data centers to embedded Jetson and DRIVE platforms. Deploy the generated code to your customers royalty-free.

Generate CUDA Code for a Fog Rectification Algorithm (2:22)

Simulink model of a lane and vehicle detector.

Generate CUDA Code from Simulink

Use Simulink Coder with GPU Coder to generate CUDA code from your Simulink models and deploy it to NVIDIA GPUs. Accelerate compute-intensive portions of Simulink simulations on NVIDIA GPUs.

Deep Learning in Simulink for NVIDIA GPUs: Generate CUDA Code Using GPU Coder (3:29)

Deploy to NVIDIA Jetson and DRIVE

GPU Coder automates deployment of generated code onto NVIDIA Jetson and DRIVE platforms. Access peripherals, acquire sensor data, and deploy your algorithm along with peripheral interface code to the board for standalone execution.

Using GPU Coder to Prototype and Deploy on NVIDIA Drive, Jetson (2:54)

Two camera views of road traffic as part of a vehicle and lane detection application in Simulink.

Generate Code for Deep Learning

Deploy a variety of predefined or customized deep learning networks to NVIDIA GPUs. Generate code for preprocessing and postprocessing along with your trained deep learning networks to deploy complete algorithms.

Deep Learning in Simulink for NVIDIA GPUs: Classification of ECG Signals (7:35)

Bar chart titled “Inference with ResNet-50” showing images/second increasing with the use of FP32 and INT8 data types.

Optimize Generated Code

GPU Coder automatically applies optimizations including memory management, kernel fusion, and auto-tuning. Reduce memory footprint by generating INT8 or bfloat16 code. Further boost performance by integrating with TensorRT.
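
To make the memory saving from INT8 concrete, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is an illustration only, not GPU Coder's quantization workflow or its generated code: the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical, and the scheme (a single scale factor, values clamped to [-127, 127]) is an assumption chosen for simplicity.

```python
from array import array


def quantize_int8(values):
    """Map floats to 8-bit integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = array(
        "b",  # signed 8-bit storage: 1 byte per element
        (max(-127, min(127, round(v / scale))) for v in values),
    )
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the 8-bit values."""
    return [q * scale for q in quantized]


# Hypothetical FP32 weights ("f" stores 4 bytes per element).
weights = array("f", [0.5, -1.0, 0.25, 0.0, 0.75, -0.125, 1.0, -0.5])
q_weights, scale = quantize_int8(weights)

fp32_bytes = len(weights) * weights.itemsize
int8_bytes = len(q_weights) * q_weights.itemsize
max_error = max(
    abs(original - restored)
    for original, restored in zip(weights, dequantize(q_weights, scale))
)

print(fp32_bytes, int8_bytes)  # 32 8
```

In this sketch the INT8 copy needs a quarter of the storage of the FP32 original, and the rounding error per element is bounded by about half the scale factor. Real deployments typically calibrate the scale on representative data rather than on the weights alone.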

Pedestrian Detection on a NVIDIA GPU with TensorRT (1:34)

A report from the GPU Coder Performance Analyzer tool showing profiling information on the generated code.

Profile and Analyze Generated Code

Use the GPU Coder Performance Analyzer to profile generated CUDA code and identify opportunities to further improve execution speed and memory footprint.

Diagram showing how the stencil processing design pattern works at a conceptual level.

Use Design Patterns to Boost Performance

Design patterns, including stencil processing and reductions, are applied automatically when available to increase the performance of generated code. You can also manually invoke them using specific pragmas.
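
The two patterns named above can be sketched conceptually in plain Python. This is not generated CUDA code and does not use GPU Coder's pragmas; the helper names `stencil_3x3_mean` and `tree_reduce` are hypothetical, and the zero padding at the borders is an assumption. The point is only to show why both patterns parallelize well.

```python
def stencil_3x3_mean(image):
    """Stencil: each output element depends only on a 3x3 input neighborhood.

    Every (r, c) iteration is independent of the others, so each one could
    in principle be assigned to its own GPU thread.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    # Neighbors outside the image contribute zero.
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += image[rr][cc]
            out[r][c] = total / 9.0
    return out


def tree_reduce(values, op):
    """Reduction: combine neighbors pairwise, halving the list on each pass.

    The combinations within one pass do not depend on each other, which is
    what allows them to run in parallel. `op` must be associative.
    """
    level = list(values)
    if not level:
        raise ValueError("cannot reduce an empty sequence")
    while len(level) > 1:
        next_level = [
            op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2:  # carry an unpaired last element forward
            next_level.append(level[-1])
        level = next_level
    return level[0]


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
smoothed = stencil_3x3_mean(image)
total = tree_reduce([v for row in image for v in row], lambda a, b: a + b)

print(smoothed[1][1], total)  # 5.0 45.0
```

A sequential sum needs one step per element, while the pairwise scheme needs only about log2(n) passes when each pass runs in parallel. That difference is the motivation for treating reductions as a dedicated pattern.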

Code generation report showing interactive bidirectional traceability between MATLAB code and generated CUDA code.

Log Signals, Tune Parameters, and Verify Code Behavior

Use GPU Coder with Simulink Coder to log signals and tune parameters in real time. Add Embedded Coder to interactively trace between MATLAB code and generated CUDA code, and to numerically verify the behavior of the generated code via SIL testing.

Trace Between Generated CUDA Code and MATLAB Source Code

Simulink model of an ECG prediction algorithm, with GPU Coder and NVIDIA GPUs used to accelerate simulation.

Accelerate MATLAB and Simulink Simulations

Call generated CUDA code as a MEX function from your MATLAB code to speed execution. Use Simulink Coder with GPU Coder to accelerate compute-intensive portions of MATLAB Function blocks in your Simulink models on NVIDIA GPUs.

Accelerate Radar Simulations on NVIDIA GPUs Using GPU Coder (3:24)

Drass Develops Deep Learning System for Real-Time Object Detection in Maritime Environments

“From data annotation to choosing, training, testing, and fine-tuning our deep learning model, MATLAB had all the tools we needed—and GPU Coder enabled us to rapidly deploy to our NVIDIA GPUs even though we had limited GPU experience.”