============================================
Preparing Deep Learning Models for Isaac ROS
============================================

Obtaining a Pre-trained Model from NGC
--------------------------------------

The NVIDIA GPU Cloud hosts a `catalog <https://catalog.ngc.nvidia.com/models>`__ of Deep Learning pre-trained models that are available for your development.

1. Use the **Search Bar** to find a pre-trained model that you are
   interested in working with.
2. Click on the model's card to view an expanded description, and then
   click on the **File Browser** tab along the navigation bar.
3. Using the **File Browser**, find a deployable ``.etlt`` file for the
   model you are interested in.

   **Note:** The ``.etlt`` file extension indicates that this model contains
   pre-trained but **encrypted** weights, which means that the
   ``tao-converter`` utility must be used to decrypt the model, as
   described :ref:`below <tao-converter>`.

4. Under the **Actions** heading, click on the **…** icon for the file you
   selected in the previous step, and then click **Copy** ``wget`` **command**.
5. **Paste** the copied command into a terminal to download the model
   into the current working directory.

.. _tao-converter:

Using ``tao-converter`` to decrypt the Encrypted TLT Model (``.etlt``) Format
-----------------------------------------------------------------------------

As discussed above, models distributed with the ``.etlt`` file extension are
encrypted and must be decrypted before use via NVIDIA's `tao-converter `__.

``tao-converter`` is already included in the Docker images available as part
of the standard :doc:`Isaac ROS Development Environment `. The per-platform
installation paths are described below:

==================== =============================================================== =================================
Platform             Installation Path                                               Symlink Path
==================== =============================================================== =================================
``x86_64``           ``/opt/nvidia/tao/tao-converter-x86-tensorrt8.0/tao-converter`` ``/opt/nvidia/tao/tao-converter``
Jetson (``aarch64``) ``/opt/nvidia/tao/jp5``                                         ``/opt/nvidia/tao/tao-converter``
==================== =============================================================== =================================

Converting ``.etlt`` to a TensorRT Engine Plan
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Here are some examples of generating a TensorRT engine file with
``tao-converter``. These examples use the
`PeopleSemSegnet Shuffleseg model `__:

Generate an engine file for the ``fp16`` data type
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code:: bash

   mkdir -p /workspaces/isaac_ros-dev/models && \
   /opt/nvidia/tao/tao-converter -k tlt_encode \
     -d 3,544,960 \
     -p input_2:0,1x3x544x960,1x3x544x960,1x3x544x960 \
     -t fp16 \
     -e /workspaces/isaac_ros-dev/models/peoplesemsegnet_shuffleseg.engine \
     -o argmax_1 \
     peoplesemsegnet_shuffleseg_etlt.etlt

.. note::

   The specific values used in the command above are retrieved from the
   **PeopleSemSegnet** page under the **Overview** tab. The model input node
   name and output node name can be found in
   ``peoplesemsegnet_shuffleseg_cache.txt`` from the **File Browser** tab.
   The output file is specified using the ``-e`` option. The tool needs
   write permission to the output directory.

   A detailed explanation of the input parameters is available `here `__.
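Before using the generated engine elsewhere, it can be worth confirming that the plan deserializes and exposes the expected bindings. Below is a minimal verification sketch using the TensorRT Python bindings (installable via ``pip install tensorrt tensorrt_bindings``, as in the Polygraphy steps later on this page). It assumes the TensorRT 8.x binding API, matching the ``tensorrt8.0`` converter used above:

.. code:: python

   import tensorrt as trt

   # Path produced by the -e option in the command above.
   ENGINE_PATH = '/workspaces/isaac_ros-dev/models/peoplesemsegnet_shuffleseg.engine'

   logger = trt.Logger(trt.Logger.WARNING)
   with open(ENGINE_PATH, 'rb') as f, trt.Runtime(logger) as runtime:
       engine = runtime.deserialize_cuda_engine(f.read())

   # Print each binding with its direction, dtype, and shape.
   # num_bindings/get_binding_* is the TensorRT 8.x API; newer releases
   # replace it with num_io_tensors/get_tensor_name.
   for i in range(engine.num_bindings):
       direction = 'input' if engine.binding_is_input(i) else 'output'
       print(f'{direction}: {engine.get_binding_name(i)} '
             f'dtype={engine.get_binding_dtype(i)} '
             f'shape={tuple(engine.get_binding_shape(i))}')

If the engine was built correctly, this prints one input (``input_2:0``) and one output (``argmax_1``) with the shapes requested via ``-d`` and ``-p``.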
Generate an engine file for the data type ``int8``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Create the models directory:

.. code:: bash

   mkdir -p /workspaces/isaac_ros-dev/models

Download the calibration cache file:

.. note::

   Check the model's page on NGC for the latest ``wget`` command.

.. code:: bash

   wget https://api.ngc.nvidia.com/v2/models/nvidia/tao/peoplesemsegnet/versions/deployable_shuffleseg_unet_v1.0/files/peoplesemsegnet_shuffleseg_cache.txt

Generate the engine file:

.. code:: bash

   /opt/nvidia/tao/tao-converter -k tlt_encode \
     -d 3,544,960 \
     -p input_2:0,1x3x544x960,1x3x544x960,1x3x544x960 \
     -t int8 \
     -c peoplesemsegnet_shuffleseg_cache.txt \
     -e /workspaces/isaac_ros-dev/models/peoplesemsegnet_shuffleseg.engine \
     -o argmax_1 \
     peoplesemsegnet_shuffleseg_etlt.etlt

.. note::

   The calibration cache file (specified using the ``-c`` option) is required
   to generate the ``int8`` engine file. This file is provided in the
   **File Browser** tab of the model's page on NGC.

Using ``trtexec`` to convert an ONNX model to a TensorRT Plan File
------------------------------------------------------------------

Assuming that a model called ``model.onnx`` is available, the conversion is performed using:

.. code:: bash

   /usr/src/tensorrt/bin/trtexec --onnx=model.onnx --saveEngine=model.plan

.. warning::

   Reading the documentation of `trtexec `__ is highly recommended to obtain
   the best performance. In particular, we recommend paying attention to the
   quantization of the model (e.g., ``fp32`` vs. ``fp16`` vs. ``int8``).

Inspecting the Input and Output Binding Names of a Model
--------------------------------------------------------

Deep learning models have ``input_binding_names`` and ``output_binding_names``,
which correspond to the model's inputs and outputs, respectively. These names
are determined by the model itself during export. There are two ways to
determine them, but the **recommended** way is to inspect a TensorRT plan file.

.. note::

   In addition, the ``TensorRTNode`` and ``TritonNode`` have parameters called
   ``input_tensor_names`` and ``output_tensor_names``, which correspond to the
   expected tensor names within the ROS 2 ``TensorList`` message.

Using an ONNX Model File
~~~~~~~~~~~~~~~~~~~~~~~~

If an ONNX model file is used, you can use `netron <https://github.com/lutzroeder/netron>`__
to visualize the model and note down the input and output names and dimensions.
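If you prefer a scriptable alternative to visual inspection, the ``onnx`` Python package can list the same information. This is a minimal sketch, assuming ``onnx`` has been installed with ``pip install onnx`` and that ``model.onnx`` is in the current directory:

.. code:: python

   import onnx

   model = onnx.load('model.onnx')

   # Graph inputs and outputs carry the names to use as binding names,
   # together with the declared tensor shapes (0 denotes a dynamic dimension).
   for tensor in model.graph.input:
       dims = [d.dim_value for d in tensor.type.tensor_type.shape.dim]
       print(f'input:  {tensor.name} shape={dims}')
   for tensor in model.graph.output:
       dims = [d.dim_value for d in tensor.type.tensor_type.shape.dim]
       print(f'output: {tensor.name} shape={dims}')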
Using a TensorRT Plan File
~~~~~~~~~~~~~~~~~~~~~~~~~~

If a TensorRT plan file is used, you can use NVIDIA's
`Polygraphy <https://github.com/NVIDIA/TensorRT/tree/main/tools/Polygraphy>`__
tool to determine the binding names.

1. Install TensorRT's Python bindings and the Polygraphy tool:

   .. code:: bash

      pip install tensorrt tensorrt_bindings
      pip install colored polygraphy --extra-index-url https://pypi.ngc.nvidia.com

2. Add ``/home/admin/.local/bin`` to your ``PATH`` to use ``polygraphy`` more conveniently:

   .. code:: bash

      export PATH="/home/admin/.local/bin:$PATH"

3. Obtain the desired model. In this case, we'll show how to get the ``PeopleSemSegnet ShuffleSeg`` network:

   .. code:: bash

      mkdir -p /tmp/models/peoplesemsegnet_shuffleseg/1 && \
      cd /tmp/models/peoplesemsegnet_shuffleseg && \
      wget https://api.ngc.nvidia.com/v2/models/nvidia/tao/peoplesemsegnet/versions/deployable_shuffleseg_unet_v1.0/files/peoplesemsegnet_shuffleseg_etlt.etlt && \
      wget https://api.ngc.nvidia.com/v2/models/nvidia/tao/peoplesemsegnet/versions/deployable_shuffleseg_unet_v1.0/files/peoplesemsegnet_shuffleseg_cache.txt

4. Convert the obtained model from an ``.etlt`` file to a plan file (called ``model.plan``):

   .. code:: bash

      /opt/nvidia/tao/tao-converter -k tlt_encode \
        -d 3,544,960 \
        -p input_2:0,1x3x544x960,1x3x544x960,1x3x544x960 \
        -t int8 \
        -c peoplesemsegnet_shuffleseg_cache.txt \
        -e /tmp/models/peoplesemsegnet_shuffleseg/1/model.plan \
        -o argmax_1 \
        peoplesemsegnet_shuffleseg_etlt.etlt

5. Go to the directory containing the converted ``PeopleSemSegNet ShuffleSeg`` model:

   .. code:: bash

      cd /tmp/models/peoplesemsegnet_shuffleseg/1

6. Use ``polygraphy`` to inspect the names of the inputs and outputs of the model. In this case, the converted model is called ``model.plan``:

   .. code:: bash

      polygraphy inspect model model.plan

   The expected output should look like this:

   .. code:: bash

      [I] Loading bytes from /tmp/models/peoplesemsegnet_shuffleseg/1/model.plan
      [I] ==== TensorRT Engine ====
          Name: Unnamed Network 0 | Explicit Batch Engine

          ---- 1 Engine Input(s) ----
          {input_2:0 [dtype=float32, shape=(1, 3, 544, 960)]}

          ---- 1 Engine Output(s) ----
          {argmax_1 [dtype=int32, shape=(1, 544, 960, 1)]}

          ---- Memory ----
          Device Memory: 21269504 bytes

          ---- 1 Profile(s) (2 Tensor(s) Each) ----
          - Profile: 0
              Tensor: input_2:0 (Input), Index: 0 | Shapes: min=(1, 3, 544, 960), opt=(1, 3, 544, 960), max=(1, 3, 544, 960)
              Tensor: argmax_1 (Output), Index: 1 | Shape: (1, 544, 960, 1)

          ---- 73 Layer(s) ----

In this case, the ``input_binding_names`` value for this network is
``['input_2:0']``, and the ``output_binding_names`` value is ``['argmax_1']``.
The shape of each tensor can also be read from this output.

These values can be used as the ``input_binding_names`` and
``output_binding_names`` parameters for the ``TensorRTNode`` or ``TritonNode``,
as shown in the sketch below. If a model has multiple inputs or outputs, they
must all be passed in as a string list.

Once again, ensure that the ``TensorRTNode`` or ``TritonNode``'s
``input_tensor_names`` and ``output_tensor_names`` parameters are correctly set
according to the names of the ROS 2 ``TensorList`` message obtained from any
upstream nodes or expected by any downstream nodes.
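For illustration, here is a minimal sketch of how these values might be wired into a ROS 2 launch file for a ``TensorRTNode``. The package name, plugin name, and tensor names below are assumptions made for this example; consult the documentation of your Isaac ROS release for the exact values:

.. code:: python

   import launch
   from launch_ros.actions import ComposableNodeContainer
   from launch_ros.descriptions import ComposableNode


   def generate_launch_description():
       # Assumed package/plugin names; verify against your Isaac ROS release.
       tensor_rt_node = ComposableNode(
           package='isaac_ros_tensor_rt',
           plugin='nvidia::isaac_ros::dnn_inference::TensorRtNode',
           name='tensor_rt',
           parameters=[{
               'engine_file_path': '/tmp/models/peoplesemsegnet_shuffleseg/1/model.plan',
               # Binding names discovered with polygraphy above.
               'input_binding_names': ['input_2:0'],
               'output_binding_names': ['argmax_1'],
               # Tensor names inside the ROS 2 TensorList messages; example
               # values that must match your upstream/downstream nodes.
               'input_tensor_names': ['input_tensor'],
               'output_tensor_names': ['output_tensor'],
           }],
       )
       container = ComposableNodeContainer(
           name='tensor_rt_container',
           namespace='',
           package='rclcpp_components',
           executable='component_container_mt',
           composable_node_descriptions=[tensor_rt_node],
       )
       return launch.LaunchDescription([container])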