CUDA with NITROS

Overview

CUDA is a parallel computing platform and programming model that enables robotic applications to implement functions that would otherwise be too slow on a CPU.

CUDA code implemented in a ROS 2 node can take advantage of NITROS, the Isaac ROS implementation of type adaptation and type negotiation that enables accelerated computing in ROS 2.

Figure: NITROS system diagram (https://media.githubusercontent.com/media/NVIDIA-ISAAC-ROS/.github/main/resources/isaac_ros_docs/concepts/nitros/system_diagram.png)

By using the NITROS publisher, CUDA code in a ROS node can share its output in GPU-accelerated memory with NITROS-enabled Isaac ROS nodes. This improves performance by avoiding CPU memory copies and by increasing parallelism between GPU compute and CPU processing. NITROS maintains compatibility with other ROS nodes subscribing to the topic, such as RViz.

By using the NITROS subscriber, CUDA code in a ROS node can receive its input in GPU-accelerated memory from a NITROS-enabled Isaac ROS node, or from another CUDA-enabled ROS node publishing through a NITROS publisher. This has the same benefits of reducing CPU memory copies and increasing parallel computing between the GPU and CPU.

Figure: NITROS subscriber and publisher diagram (https://media.githubusercontent.com/media/NVIDIA-ISAAC-ROS/.github/main/resources/isaac_ros_docs/concepts/nitros/subscriber_publisher_diagram.png)

The NITROS publisher and NITROS subscriber enable many design patterns that are compatible with Isaac ROS accelerated computing nodes and with traditional CPU nodes: for example, publishing from a CUDA node directly to an Isaac ROS node, or vice versa from an Isaac ROS node to a CUDA node. A CUDA node can be inserted between two Isaac ROS nodes, or multiple CUDA nodes can be chained in succession, all with the benefit of CUDA accelerated computing in each node. This enables more modular software designs, because CUDA-accelerated functions can be packaged in individual nodes instead of being packed together into a single larger node.

Core Concepts

Managed NITROS Publisher

The Managed NITROS Publisher provides a simple and familiar interface for publishing messages into NITROS-enabled graphs. The API is comparable to the standard rclcpp::Publisher API, making it easy to add a Managed NITROS Publisher to an existing ROS 2 node.

Managed NITROS Publishers are specifically designed to publish NITROS-typed messages. These NITROS types are listed under the isaac_ros_nitros_types subfolder.

Note

Currently, the Managed NITROS Publisher is only compatible with the isaac_ros_nitros_tensor_list_type and isaac_ros_nitros_image_type. These types enable you to send and receive tensors and images to and from packages in Isaac ROS DNN Inference.
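
For example, a Managed NITROS Publisher for NitrosImage messages can be created inside an otherwise standard rclcpp::Node. The following is a minimal sketch modeled on the Managed NITROS examples; the include paths, constructor arguments, and the nitros_image_rgb8_t format constant are assumptions and may need to be adjusted against the actual headers:

#include <memory>
#include <string>

#include "rclcpp/rclcpp.hpp"

// Managed NITROS publisher and the NitrosImage type (assumed include paths)
#include "isaac_ros_managed_nitros/managed_nitros_publisher.hpp"
#include "isaac_ros_nitros_image_type/nitros_image.hpp"

class MinimalNitrosPublisherNode : public rclcpp::Node
{
public:
  MinimalNitrosPublisherNode()
  : rclcpp::Node("minimal_nitros_publisher"),
    // Constructed much like an rclcpp::Publisher: owning node, topic name,
    // plus the NITROS format string used during type negotiation
    nitros_pub_{std::make_shared<
        nvidia::isaac_ros::nitros::ManagedNitrosPublisher<
          nvidia::isaac_ros::nitros::NitrosImage>>(
        this, "image",
        nvidia::isaac_ros::nitros::nitros_image_rgb8_t::supported_type_name)}
  {}

  void PublishImage(const nvidia::isaac_ros::nitros::NitrosImage & image)
  {
    // The published message references GPU memory; no host-side copy happens here
    nitros_pub_->publish(image);
  }

private:
  std::shared_ptr<nvidia::isaac_ros::nitros::ManagedNitrosPublisher<
      nvidia::isaac_ros::nitros::NitrosImage>> nitros_pub_;
};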

Managed NITROS Subscriber

The Managed NITROS Subscriber is analogous to the Managed NITROS Publisher, offering a straightforward, rclcpp::Subscriber-like interface for subscribing to messages from NITROS-enabled graphs.

Managed NITROS Subscribers are specifically designed to receive NITROS-typed messages. These NITROS types are listed under the isaac_ros_nitros_types subfolder.

Note

Currently, the Managed NITROS Subscriber is only compatible with the isaac_ros_nitros_tensor_list_type and isaac_ros_nitros_image_type.
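
Analogously, a Managed NITROS Subscriber can be created with a callback that receives a NITROS View. The following is a minimal sketch modeled on the Managed NITROS examples; the include paths, constructor signature, and view getters are assumptions:

#include <functional>
#include <memory>

#include "rclcpp/rclcpp.hpp"

// Managed NITROS subscriber and the NitrosImage view type (assumed include paths)
#include "isaac_ros_managed_nitros/managed_nitros_subscriber.hpp"
#include "isaac_ros_nitros_image_type/nitros_image_view.hpp"

class MinimalNitrosSubscriberNode : public rclcpp::Node
{
public:
  MinimalNitrosSubscriberNode()
  : rclcpp::Node("minimal_nitros_subscriber"),
    // Parameterized on the NITROS View type delivered to the callback,
    // analogous to the message type of an rclcpp::Subscription
    nitros_sub_{std::make_shared<
        nvidia::isaac_ros::nitros::ManagedNitrosSubscriber<
          nvidia::isaac_ros::nitros::NitrosImageView>>(
        this, "image",
        nvidia::isaac_ros::nitros::nitros_image_rgb8_t::supported_type_name,
        std::bind(&MinimalNitrosSubscriberNode::InputCallback, this,
          std::placeholders::_1))}
  {}

private:
  void InputCallback(const nvidia::isaac_ros::nitros::NitrosImageView & view)
  {
    // Metadata is available through getters; the pixel data itself stays in GPU memory
    RCLCPP_INFO(
      this->get_logger(), "Received %u x %u image",
      static_cast<unsigned>(view.GetWidth()), static_cast<unsigned>(view.GetHeight()));
  }

  std::shared_ptr<nvidia::isaac_ros::nitros::ManagedNitrosSubscriber<
      nvidia::isaac_ros::nitros::NitrosImageView>> nitros_sub_;
};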

NITROS Builders

NITROS Builders are a series of utility classes that streamline the process of creating a NITROS-typed message. These classes offer a builder-style interface, allowing developers to specify relevant fields for object construction one at a time:

using namespace nvidia::isaac_ros::nitros;

// Build up a NitrosTensorList field by field, then finalize it with Build()
NitrosTensorList tensor_list = NitrosTensorListBuilder()
    .WithHeader(ros_header)             // ROS header (timestamp and frame) for the message
    .AddTensor("tensor_1", foo_tensor)  // attach previously built NitrosTensors by name
    .AddTensor("tensor_2", bar_tensor)
    .Build();

The collection of NITROS Builders also includes additional builders that construct individual components of the broader NITROS types. For example, a NitrosTensorBuilder can be used to produce a NitrosTensor that is subsequently consumed by a NitrosTensorListBuilder.
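
For instance, a one-dimensional byte tensor can be built from an existing CUDA buffer and then attached to a tensor list. In this sketch, device_buffer and num_bytes are assumed to describe previously allocated CUDA memory, and the WithShape, WithDataType, and WithData method names together with the NitrosDataType::kUnsigned8 enumerator are assumptions based on the Custom NITROS string example:

using namespace nvidia::isaac_ros::nitros;

// Build a standalone NitrosTensor that wraps an existing CUDA buffer
NitrosTensor tensor = NitrosTensorBuilder()
    .WithShape({static_cast<int>(num_bytes)})   // 1-D tensor of bytes
    .WithDataType(NitrosDataType::kUnsigned8)   // element type
    .WithData(device_buffer)                    // GPU pointer; the data is not copied
    .Build();

// Consume the tensor in a NitrosTensorListBuilder
NitrosTensorList tensor_list = NitrosTensorListBuilder()
    .WithHeader(ros_header)
    .AddTensor("input_tensor", tensor)
    .Build();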

NITROS Views

NITROS Views are a series of utility classes that simplify the process of accessing fields of a NITROS-typed message. These classes offer an analogous interface to the NITROS Builder, allowing developers to retrieve specific portions of a composite structure and process them accordingly.
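
For example, a callback that receives a NitrosTensorListView can look up an individual tensor by name and query its size before touching the underlying CUDA buffer. The GetNamedTensor, GetTensorSize, and GetBuffer accessors in this sketch are assumptions based on the Custom NITROS string example:

using namespace nvidia::isaac_ros::nitros;

void InputCallback(const NitrosTensorListView & view)
{
  // Retrieve the view of a single named tensor from the tensor list
  auto tensor = view.GetNamedTensor("input_tensor");

  // Metadata and the GPU pointer are available without copying data off the device
  const size_t size_in_bytes = tensor.GetTensorSize();
  const void * gpu_data = tensor.GetBuffer();

  // ... launch CUDA kernels against gpu_data or copy size_in_bytes bytes to the host
}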

Standard Usage

The standard pattern for using Managed NITROS involves 3 key steps:

  1. Encoding arbitrary data into an appropriate NITROS-type message using a Custom NITROS encoder node

  2. Processing the NITROS-type message using NITROS-enabled ROS 2 nodes

  3. Decoding an output NITROS-type message into an arbitrary format using a Custom NITROS decoder node

For example, a graph for performing DNN-based segmentation on pointclouds might involve:

  1. A Custom NITROS encoder node that converts a sensor_msgs/PointCloud2 into a NitrosTensorList

  2. The Isaac ROS TensorRT node that performs DNN inference on the TensorRT backend, taking in an input NitrosTensorList and producing an output NitrosTensorList

  3. A Custom NITROS decoder node that converts the output NitrosTensorList into a segmented sensor_msgs/PointCloud2

Examples

Custom NITROS String Encoder and Decoder

The isaac_ros_managed_nitros_examples/custom_nitros_string package contains a minimal example that demonstrates how a pair of Custom NITROS encoder and decoder nodes can leverage Managed NITROS utilities.

The included launch test uses a minimal graph of just two nodes:

  1. StringEncoderNode that encodes a std_msgs/String message into a NitrosTensorList

  2. StringDecoderNode that decodes a NitrosTensorList back into a std_msgs/String

Note

Since there is no NITROS type designed specifically for strings, this example uses the NitrosTensorList type to encode string data. The flexibility of the NitrosTensorList format supports the encoding of arbitrary data.

The StringEncoderNode’s implementation demonstrates usage of the NITROS Builder utilities. The StringEncoderNode::InputCallback function begins by allocating a CUDA buffer to store the received std_msgs/String’s data. In this minimal example, the character bytes are copied over to the CUDA buffer without any need for additional encoding. Then, a NitrosTensorBuilder is used to build a NitrosTensor, which is then added to a NitrosTensorListBuilder with the name "input_tensor". Finally, the NitrosTensorListBuilder::Build method produces the output NitrosTensorList that is published via the StringEncoderNode’s Managed NITROS publisher.
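
A condensed sketch of that flow is shown below; it is not the full example source, and the builder method names and NitrosDataType enumerator are assumptions based on the package's code:

void StringEncoderNode::InputCallback(const std_msgs::msg::String::SharedPtr msg)
{
  using namespace nvidia::isaac_ros::nitros;

  // Allocate a CUDA buffer and copy the character bytes onto the GPU as-is
  void * gpu_buffer{nullptr};
  cudaMalloc(&gpu_buffer, msg->data.size());
  cudaMemcpy(gpu_buffer, msg->data.data(), msg->data.size(), cudaMemcpyHostToDevice);

  std_msgs::msg::Header header;
  header.stamp = this->now();

  // Wrap the buffer in a NitrosTensor and add it to the output tensor list
  NitrosTensorList tensor_list = NitrosTensorListBuilder()
    .WithHeader(header)
    .AddTensor("input_tensor",
      NitrosTensorBuilder()
        .WithShape({static_cast<int>(msg->data.size())})
        .WithDataType(NitrosDataType::kUnsigned8)
        .WithData(gpu_buffer)
        .Build())
    .Build();

  // Publish through the Managed NITROS publisher; the data stays on the GPU
  nitros_pub_->publish(tensor_list);
}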

The StringDecoderNode’s implementation demonstrates usage of the NITROS View utilities. The StringDecoderNode::InputCallback function begins by resizing a std::string buffer to store the data that will eventually be extracted from the NitrosTensorList. The Managed NITROS Subscriber receives a NitrosTensorListView, through which it extracts the NitrosTensorView corresponding to "input_tensor" using the NitrosTensorListView::GetNamedTensor method. Since the data in NitrosTensorView’s CUDA buffer was originally encoded in a trivial way, the decoding is similarly straightforward: the character bytes are copied over without any need for additional decoding. Finally, the std::string is copied to a std_msgs/String message and published with a standard publisher.
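
A condensed sketch of the decoding callback is shown below; the view accessor names (GetNamedTensor, GetTensorSize, GetBuffer) are assumptions based on the package's code:

void StringDecoderNode::InputCallback(const nvidia::isaac_ros::nitros::NitrosTensorListView & view)
{
  // Locate the named tensor inside the received tensor list
  auto tensor = view.GetNamedTensor("input_tensor");

  // Size a host-side string, then copy the character bytes back from the CUDA buffer
  std::string decoded;
  decoded.resize(tensor.GetTensorSize());
  cudaMemcpy(decoded.data(), tensor.GetBuffer(), tensor.GetTensorSize(), cudaMemcpyDeviceToHost);

  // Republish the decoded text as a standard CPU message
  std_msgs::msg::String out;
  out.data = decoded;
  string_pub_->publish(out);
}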

Custom NITROS Image Builder and Viewer

The isaac_ros_managed_nitros_examples/custom_nitros_image package contains an intermediate example of Managed NITROS functionality for passing images.

The included launch test uses a minimal graph of just two nodes:

  1. GpuImageBuilderNode that encodes a sensor_msgs/Image message into a NitrosImage

  2. GpuImageViewerNode that decodes a NitrosImage back into a sensor_msgs/Image

The GpuImageBuilderNode’s implementation demonstrates usage of the NITROS Builder utilities. The GpuImageBuilderNode::InputCallback function begins by allocating a CUDA buffer to store the received sensor_msgs/Image’s data. In this minimal example, the bytes are copied over to the CUDA buffer without any interleaving or other processing. Then, a NitrosImageBuilder is used to build a NitrosImage from the ROS header, appropriate sensor_msgs-standard image encoding, image dimensions, and CUDA buffer from the previous step. The NitrosImageBuilder::Build method produces the output NitrosImage that is published via the GpuImageBuilderNode’s Managed NITROS publisher.
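
A condensed sketch of that callback is shown below; the NitrosImageBuilder method names (WithHeader, WithEncoding, WithDimensions, WithGpuData) are assumptions based on the package's code:

void GpuImageBuilderNode::InputCallback(const sensor_msgs::msg::Image::SharedPtr msg)
{
  using namespace nvidia::isaac_ros::nitros;

  // Allocate a CUDA buffer and copy the image bytes to the GPU unchanged
  void * gpu_buffer{nullptr};
  cudaMalloc(&gpu_buffer, msg->data.size());
  cudaMemcpy(gpu_buffer, msg->data.data(), msg->data.size(), cudaMemcpyHostToDevice);

  // Assemble a NitrosImage from the ROS header, encoding, dimensions, and GPU buffer
  NitrosImage nitros_image = NitrosImageBuilder()
    .WithHeader(msg->header)
    .WithEncoding(msg->encoding)
    .WithDimensions(msg->height, msg->width)
    .WithGpuData(gpu_buffer)
    .Build();

  nitros_pub_->publish(nitros_image);
}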

The GpuImageViewerNode’s implementation demonstrates usage of the NITROS View utilities. The GpuImageViewerNode::InputCallback function begins by creating a sensor_msgs/Image message and populating its basic fields through the NitrosImageView object’s straightforward getters. Next, the output image’s data vector is resized to ensure it will appropriately fit the actual image data. Finally, the image’s data bytes are copied over without any additional processing, and the output message is published with a standard publisher.
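
A condensed sketch of that callback is shown below; the NitrosImageView getter names used here (GetFrameId, GetHeight, GetWidth, GetEncoding, GetSizeInBytes, GetGpuData) are assumptions based on the package's code:

void GpuImageViewerNode::InputCallback(const nvidia::isaac_ros::nitros::NitrosImageView & view)
{
  // Populate the basic image fields from the view's getters
  sensor_msgs::msg::Image out;
  out.header.frame_id = view.GetFrameId();
  out.height = view.GetHeight();
  out.width = view.GetWidth();
  out.encoding = view.GetEncoding();

  // Resize the output buffer, then copy the pixel bytes back from the GPU
  out.data.resize(view.GetSizeInBytes());
  cudaMemcpy(out.data.data(), view.GetGpuData(), view.GetSizeInBytes(), cudaMemcpyDeviceToHost);

  // Publish with a standard CPU publisher
  image_pub_->publish(out);
}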

Custom NITROS DNN Image Encoder

The isaac_ros_managed_nitros_examples/custom_nitros_dnn_image_encoder package contains an advanced example of Managed NITROS functionality for encoding images as tensors for DNN inference.

Note

This example is currently limited to x86_64 platforms.

Note

To build and run this example, complete the following steps:

  1. Delete the COLCON_IGNORE file in this example’s package

  2. Add the cvcuda suffix to the Isaac ROS Dev base image key used by run_dev.sh, as explained in this guide

As shown in the included launch file, this package’s ImageEncoderNode has identical functionality to the standard Isaac ROS DNN Image Encoder Node.

This example leverages NVIDIA’s CV-CUDA library to perform the core image normalization operations. CUDA with NITROS unlocks the full value of libraries like CV-CUDA for high-performance robotics applications.