Preparing Deep Learning Models for Isaac ROS
Obtaining a Pre-trained Model from NGC
NVIDIA GPU Cloud (NGC) hosts a catalog of pre-trained deep learning models that are available for your development.
Use the Search Bar to find a pre-trained model that you are interested in working with.
Click on the model’s card to view an expanded description, and then click on the File Browser tab along the navigation bar.
Using the File Browser, find a deployable .etlt file for the model you are interested in.

Note: The .etlt file extension indicates that this model has pre-trained but encrypted weights, which means one needs to use the tao-converter utility to decrypt the model, as described below.

Under the Actions heading, click on the … icon for the file you selected in the previous step, and then click Copy wget command.

Paste the copied command into a terminal to download the model into the current working directory.
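For example, for the PeopleSemSegnet ShuffleSeg model used later in this guide, the copied command looks like the following (the URL comes from that model's File Browser page; check NGC for the latest version):

wget https://api.ngc.nvidia.com/v2/models/nvidia/tao/peoplesemsegnet/versions/deployable_shuffleseg_unet_v1.0/files/peoplesemsegnet_shuffleseg_etlt.etlt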
Using tao-converter to Decrypt the Encrypted TLT Model (.etlt) Format
As discussed above, models distributed with the .etlt file extension are encrypted and must be decrypted before use via NVIDIA's tao-converter. tao-converter is already included in the Docker images available as part of the standard Isaac ROS Development Environment. The per-platform installation paths are described below:
Platform | Installation Path | Symlink Path
---|---|---
x86_64 | … | /opt/nvidia/tao/tao-converter
Jetson (aarch64) | … | /opt/nvidia/tao/tao-converter
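To confirm the utility is available inside the container, you can invoke it through the symlink path; assuming the tool follows the usual convention of printing its usage text for the -h flag, this is a quick sanity check:

/opt/nvidia/tao/tao-converter -h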
Converting .etlt to a TensorRT Engine Plan
Here are some examples of generating the TensorRT engine file using tao-converter. In this example, we will use the PeopleSemSegnet ShuffleSeg model.

Generate an engine file for the fp16 data type:
mkdir -p /workspaces/isaac_ros-dev/models && \
/opt/nvidia/tao/tao-converter -k tlt_encode \
    -d 3,544,960 \
    -p input_2:0,1x3x544x960,1x3x544x960,1x3x544x960 \
    -t fp16 \
    -e /workspaces/isaac_ros-dev/models/peoplesemsegnet_shuffleseg.engine \
    -o argmax_1 \
    peoplesemsegnet_shuffleseg_etlt.etlt
Note

The specific values used in the command above are retrieved from the PeopleSemSegnet page under the Overview tab. The model input node name and output node name can be found in peoplesemsegnet_shuffleseg_cache.txt from the File Browser. The output file is specified using the -e option. The tool needs write permission to the output directory.

A detailed explanation of the input parameters is available here.
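To confirm that the engine plan was written, you can list the output path passed to -e above:

ls -lh /workspaces/isaac_ros-dev/models/peoplesemsegnet_shuffleseg.engine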
Generate an engine file for the int8 data type:
Create the models directory:
mkdir -p /workspaces/isaac_ros-dev/models
Note

Check the model's page on NGC for the latest wget command.
Download the calibration cache file:

wget https://api.ngc.nvidia.com/v2/models/nvidia/tao/peoplesemsegnet/versions/deployable_shuffleseg_unet_v1.0/files/peoplesemsegnet_shuffleseg_cache.txt

Then generate the engine file:

/opt/nvidia/tao/tao-converter -k tlt_encode \
    -d 3,544,960 \
    -p input_2:0,1x3x544x960,1x3x544x960,1x3x544x960 \
    -t int8 \
    -c peoplesemsegnet_shuffleseg_cache.txt \
    -e /workspaces/isaac_ros-dev/models/peoplesemsegnet_shuffleseg.engine \
    -o argmax_1 \
    peoplesemsegnet_shuffleseg_etlt.etlt
Note

The calibration cache file (specified using the -c option) is required to generate the int8 engine file. This file is provided in the File Browser tab of the model's page on NGC.
Using trtexec to Convert an ONNX Model to a TensorRT Plan File
Assuming that a model called model.onnx is available, the conversion is performed using:
/usr/src/tensorrt/bin/trtexec --onnx=model.onnx --saveEngine=model.plan
Warning

Reading the documentation of trtexec is highly recommended to obtain the best performance. In particular, we recommend paying attention to the quantization of the model (e.g., fp32 vs. fp16 vs. int8).
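As a minimal sketch of such a precision choice, trtexec's standard --fp16 flag builds the engine with fp16 precision enabled (the model and output names here are carried over from the example above):

/usr/src/tensorrt/bin/trtexec --onnx=model.onnx --saveEngine=model.plan --fp16

The --int8 flag is also available, but it generally requires a calibration step to preserve accuracy.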
Inspecting the Input and Output Binding Names of a Model

Deep learning models have input_binding_names and output_binding_names. These correspond to the model's inputs and outputs, respectively, and are determined by the model itself during export. There are two ways to determine them; the recommended one is to inspect a TensorRT plan file.
Note

In addition, the TensorRTNode and TritonNode have parameters called input_tensor_names and output_tensor_names; these correspond to the expected tensor names within the ROS 2 TensorList message.
Using an ONNX Model File

If an ONNX model file is used, one can use Netron to visualize the ONNX model and note down the input and output names and dimensions.
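As a command-line alternative, the polygraphy tool covered in the next subsection can also inspect ONNX files directly; assuming it is installed as shown below, the equivalent inspection is:

polygraphy inspect model model.onnx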
Using a TensorRT Plan File

If a TensorRT plan file is used, one can use NVIDIA's Polygraphy tool to determine the binding names.
Install TensorRT's Python bindings and the Polygraphy tool:

pip install tensorrt tensorrt_bindings
pip install colored polygraphy --extra-index-url https://pypi.ngc.nvidia.com
Add /home/admin/.local/bin to your PATH to use polygraphy more conveniently:

export PATH="/home/admin/.local/bin:$PATH"
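You can optionally verify that the executable is now resolvable:

which polygraphy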
Obtain the desired model. In this case, we'll show how to get the PeopleSemSegnet ShuffleSeg network:

mkdir -p /tmp/models/peoplesemsegnet_shuffleseg/1 && \
    cd /tmp/models/peoplesemsegnet_shuffleseg && \
    wget https://api.ngc.nvidia.com/v2/models/nvidia/tao/peoplesemsegnet/versions/deployable_shuffleseg_unet_v1.0/files/peoplesemsegnet_shuffleseg_etlt.etlt && \
    wget https://api.ngc.nvidia.com/v2/models/nvidia/tao/peoplesemsegnet/versions/deployable_shuffleseg_unet_v1.0/files/peoplesemsegnet_shuffleseg_cache.txt
Convert the obtained model from an etlt file to a plan file (called model.plan):

/opt/nvidia/tao/tao-converter -k tlt_encode \
    -d 3,544,960 \
    -p input_2:0,1x3x544x960,1x3x544x960,1x3x544x960 \
    -t int8 \
    -c peoplesemsegnet_shuffleseg_cache.txt \
    -e /tmp/models/peoplesemsegnet_shuffleseg/1/model.plan \
    -o argmax_1 \
    peoplesemsegnet_shuffleseg_etlt.etlt
Now go to the directory where we obtained the PeopleSemSegnet ShuffleSeg model:

cd /tmp/models/peoplesemsegnet_shuffleseg/1
Now use polygraphy to inspect the names of the inputs and outputs of the model. In this case, the model we obtained is called model.plan:

polygraphy inspect model model.plan
The expected output should look like this:
[I] Loading bytes from /tmp/models/peoplesemsegnet_shuffleseg/1/model.plan
[I] ==== TensorRT Engine ====
    Name: Unnamed Network 0 | Explicit Batch Engine

    ---- 1 Engine Input(s) ----
    {input_2:0 [dtype=float32, shape=(1, 3, 544, 960)]}

    ---- 1 Engine Output(s) ----
    {argmax_1 [dtype=int32, shape=(1, 544, 960, 1)]}

    ---- Memory ----
    Device Memory: 21269504 bytes

    ---- 1 Profile(s) (2 Tensor(s) Each) ----
    - Profile: 0
        Tensor: input_2:0 (Input), Index: 0 | Shapes: min=(1, 3, 544, 960), opt=(1, 3, 544, 960), max=(1, 3, 544, 960)
        Tensor: argmax_1 (Output), Index: 1 | Shape: (1, 544, 960, 1)

    ---- 73 Layer(s) ----
In this case, the input_binding_names for this network is ['input_2:0'], whereas the output_binding_names is ['argmax_1']. The shape of each dimension can also be observed from this command.

These values can be used as the input_binding_names and output_binding_names for the TensorRTNode or TritonNode. If a model has multiple inputs or outputs, these must be passed in as a string list of all the values. Once again, ensure that the TensorRTNode or TritonNode's input_tensor_names and output_tensor_names parameters are correctly set according to the names of the ROS 2 TensorList message obtained from any upstream nodes or expected by any downstream nodes.