Getting Started with the GPU Coder Support Package for NVIDIA GPUs

This example shows how to use the GPU Coder™ Support Package for NVIDIA GPUs to connect to NVIDIA® DRIVE™ and Jetson hardware platforms, perform basic operations, generate a CUDA® executable from a MATLAB® function, and run the executable on the hardware. A simple vector addition example demonstrates the workflow.


Target Board Requirements

  • NVIDIA DRIVE PX2 or Jetson TX1/TX2 embedded platform.

  • Ethernet crossover cable to connect the target board and host PC (if the target board cannot be connected to a local network).

  • NVIDIA CUDA toolkit installed on the board.

  • Environment variables on the target for the compilers and libraries. For information on the supported versions of the compilers and libraries and their setup, see installing and setting up prerequisites for NVIDIA boards.

Development Host Requirements

  • GPU Coder™ for code generation. For an overview and tutorials, visit the GPU Coder product page.

  • NVIDIA CUDA toolkit on the host.

  • Environment variables on the host for the compilers and libraries. For information on the supported versions of the compilers and libraries, see Third-party Products. For setting up the environment variables, see Environment Variables.

Create a Folder and Copy Relevant Files

The following line of code creates a folder in your current working directory (host) and copies all the relevant files into this folder. If you cannot generate files in this folder, change your current working directory before running this command.


Connect to the NVIDIA Hardware

The GPU Coder Support Package for NVIDIA GPUs uses an SSH connection over TCP/IP to execute commands while building and running the generated CUDA code on the DRIVE or Jetson platforms. You must therefore connect the target platform to the same network as the host computer or use an Ethernet crossover cable to connect the board directly to the host computer. Refer to the NVIDIA documentation on how to set up and configure your board.

To communicate with the NVIDIA hardware, you must create a live hardware connection object by using the drive or jetson function. To create this object, you must know the host name or IP address, user name, and password of the target board. For example, use the following command to create a live object for Jetson hardware:

hwobj = jetson('jetson-tx2-name','ubuntu','ubuntu');

While creating the live hardware object, the software checks the hardware, installs the IO server, and gathers peripheral information from the target. This information is displayed in the Command Window.

Similarly, use the following command to create a live object for DRIVE hardware:

hwobj = drive('drive-px2-name','ubuntu','ubuntu');


If the connection fails, a diagnostic error message is reported on the MATLAB command line. The most likely cause of a failed connection is an incorrect IP address or host name.
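In a script, it can help to catch the failure and report it before continuing. The following is a minimal sketch, assuming that the jetson function raises a standard MATLAB error when the connection cannot be established; the board name and credentials are the placeholders used above.

```matlab
% Sketch only: assumes jetson() throws a MATLAB error on connection failure.
try
    hwobj = jetson('jetson-tx2-name','ubuntu','ubuntu');
catch err
    % Print the diagnostic message and stop, because the later steps need hwobj.
    fprintf('Could not connect to the board: %s\n', err.message);
    return
end
```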

When a successful connection to the board is established, you can use the system method of the board object to execute various Linux shell commands on the NVIDIA hardware from MATLAB. For example, to list the contents of the home directory on the target:

system(hwobj, 'ls -al ~');

The hardware object provides basic file manipulation capabilities. To transfer files from the host to the target, use the putFile() method of the live hardware object. For example, the following command transfers the file test.txt in the current directory to the remoteBuildDir directory on the target.

hwobj.putFile('test.txt', '~/remoteBuildDir');

To copy a file from the target to the host computer, use the getFile() method of the hardware object. For example, the following command transfers the file test.txt in the remoteBuildDir directory on the target to the current directory on the host.
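A sketch of such a call, assuming that getFile() takes the remote file path followed by the destination folder on the host, mirroring the putFile() call above:

```matlab
% Sketch only: copy test.txt from the target build folder to the current host folder.
% The second argument ('.') as the host destination is an assumption.
hwobj.getFile('~/remoteBuildDir/test.txt', '.');
```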


Verify the GPU Environment

Use the coder.checkGpuInstall function and verify that the compilers and libraries needed for running this example are set up correctly.

envCfg = coder.gpuEnvConfig('jetson'); % Use 'drive' for NVIDIA DRIVE hardware
envCfg.BasicCodegen = 1;
envCfg.Quiet = 1;
envCfg.HardwareObject = hwobj;
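The configuration object above only describes the checks to perform. A minimal sketch of running them, assuming that coder.checkGpuInstall accepts the configuration object and returns a structure of results (the variable name results is illustrative):

```matlab
% Sketch only: run the environment checks described by envCfg on the connected board.
results = coder.checkGpuInstall(envCfg);
disp(results);   % Inspect which checks passed
```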

Generate CUDA Code for the Target Using GPU Coder

This example uses myAdd.m, a simple vector addition function, as the entry-point function for code generation. To generate a CUDA executable that can be deployed to an NVIDIA target, create a GPU code configuration object for generating an executable.

cfg = coder.gpuConfig('exe');
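For reference, a vector addition entry-point function of this kind might look like the following sketch. The argument names and the use of the coder.gpu.kernelfun pragma are assumptions for illustration; the myAdd.m file shipped with the example may differ.

```matlab
function out = myAdd(inp1, inp2) %#codegen
% Sketch only: element-wise addition of two equally sized vectors.
coder.gpu.kernelfun();   % Ask GPU Coder to map the computation to CUDA kernels
out = inp1 + inp2;
end
```

In MATLAB, a function of this shape can be called directly, for example myAdd(1:3,1:3), which is how the simulation result is obtained later in this example.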

When there are multiple live connection objects for different targets, the code generator performs the remote build on the target for which the most recent live object was created. To choose a different hardware board for the remote build, use the setupCodegenContext() method of the respective live hardware object. If only one live connection object was created, you do not need to call this method.
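For example, to direct the remote build to the board represented by hwobj, the call can be sketched as follows, assuming the method takes no additional arguments:

```matlab
% Sketch only: select the board behind hwobj for the remote build.
hwobj.setupCodegenContext;
```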


Use the coder.hardware function to create a configuration object for the DRIVE or Jetson platform and assign it to the Hardware property of the code configuration object cfg. Use 'NVIDIA Jetson' for the Jetson TX1 or TX2 boards and 'NVIDIA Drive' for the DRIVE board.

cfg.Hardware = coder.hardware('NVIDIA Jetson');
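For a DRIVE board, the equivalent assignment uses the 'NVIDIA Drive' name mentioned above:

```matlab
% Use this line instead of the Jetson one when targeting a DRIVE board.
cfg.Hardware = coder.hardware('NVIDIA Drive');
```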

Use the BuildDir property to specify the directory in which the remote build takes place on the target. If the specified build directory does not exist on the target, the software creates a directory with the given name. If no value is assigned to cfg.Hardware.BuildDir, the remote build takes place in the last specified build directory. If there is no stored build directory value, the build takes place in the home directory.

cfg.Hardware.BuildDir = '~/remoteBuildDir';

Certain NVIDIA platforms such as DRIVE PX2 contain multiple GPUs. On such platforms, use the SelectCudaDevice property in the GPU configuration object to select a specific GPU.

cfg.GpuConfig.SelectCudaDevice = 0;

The custom main file is a wrapper that calls the entry-point function in the generated code. The main file passes a vector containing the first 100 natural numbers to the entry-point function and writes the results to the myAdd.bin binary file.

cfg.CustomSource  = fullfile('');

To generate CUDA code, use the codegen function and pass the GPU code configuration along with the size of the inputs for the myAdd entry-point function. After code generation completes on the host, the generated files are copied to the target and built there.

codegen('-config',cfg,'myAdd','-args',{1:100,1:100});

Run the Executable on the Target

Use the runApplication() method of the hardware object to launch the executable on the target hardware.

pid = hwobj.runApplication('myAdd');

Alternatively, you can use the runExecutable() method of the hardware object to run the executable.

exe = [hwobj.workspaceDir '/myAdd.elf'];
pid = hwobj.runExecutable(exe);

Verify the Result from Target

Copy the output bin file myAdd.bin to the MATLAB environment on the host and compare the computed results with those from MATLAB. The property workspaceDir contains the path to the codegen folder on the target.

pause(0.3); % To ensure that the executable completed the execution.
hwobj.getFile([hwobj.workspaceDir '/myAdd.bin']);

Compute the simulation result in MATLAB.

simOut = myAdd(0:99,0:99);

Read the result binary file copied from the target into MATLAB.

fId  = fopen('myAdd.bin','r');
tOut = fread(fId,'double');
fclose(fId); % Close the file after reading

Find the difference between the MATLAB simulation output and the GPU Coder output from the target.

diff = simOut - tOut';

Display the maximum absolute deviation between the simulation output and the GPU Coder output from the target.

fprintf('Maximum deviation between MATLAB Simulation output and GPU Coder output on Target is: %f\n', max(abs(diff(:))));
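The transpose in the comparison is needed because fread returns a column vector, while myAdd(0:99,0:99) returns a row vector. The following self-contained sketch illustrates this without any hardware: it writes a stand-in result file locally (the file name myAdd_sketch.bin and the variable names are hypothetical), reads it back the same way, and checks the deviation.

```matlab
% Sketch only: emulate the target output locally to show the read-back and comparison.
expected = (0:99) + (0:99);                  % Row vector, 1-by-100
binFile  = fullfile(tempdir, 'myAdd_sketch.bin');

fId = fopen(binFile, 'w');
fwrite(fId, expected, 'double');             % Stand-in for the file written on the target
fclose(fId);

fId  = fopen(binFile, 'r');
tOut = fread(fId, 'double');                 % Column vector, 100-by-1
fclose(fId);
delete(binFile);

maxDev = max(abs(expected - tOut'));         % Transpose so both operands are row vectors
assert(maxDev == 0, 'Round trip changed the data');
```

With real hardware, a small nonzero tolerance may be more appropriate than an exact comparison, depending on the computation.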


Remove the files and return to the original folder.