============================================
Preparing Deep Learning Models for Isaac ROS
============================================

Obtaining a Pre-trained Model from NGC
--------------------------------------

The NVIDIA GPU Cloud hosts a `catalog <https://catalog.ngc.nvidia.com/models>`__ of Deep Learning pre-trained models that are available for your development.

1. Use the **Search Bar** to find a pre-trained model that you are
   interested in working with.
2. Click on the model's card to view an expanded description, and then
   click on the **File Browser** tab along the navigation bar.
3. Using the **File Browser**, find a deployable ``.etlt`` file for the
   model you are interested in.

   **Note:** The ``.etlt`` file extension indicates that this model contains
   pre-trained but **encrypted** weights, which means that the
   ``tao-converter`` utility must be used to decrypt the model, as
   described :ref:`below <tao-converter>`.

4. Under the **Actions** heading, click on the **…** icon for the file you
   selected in the previous step, and then click **Copy** ``wget`` **command**.
5. **Paste** the copied command into a terminal to download the model
   into the current working directory.

.. _tao-converter:

Using ``tao-converter`` to decrypt the Encrypted TLT Model (``.etlt``) Format
-----------------------------------------------------------------------------

As discussed above, models distributed with the ``.etlt`` file extension are
encrypted and must be decrypted before use via NVIDIA's `tao-converter `__.

``tao-converter`` is already included in the Docker images available as part
of the standard :doc:`Isaac ROS Development Environment `. The per-platform
installation paths are described below:

==================== =============================================================== =================================
Platform             Installation Path                                               Symlink Path
==================== =============================================================== =================================
``x86_64``           ``/opt/nvidia/tao/tao-converter-x86-tensorrt8.0/tao-converter`` ``/opt/nvidia/tao/tao-converter``
Jetson (``aarch64``) ``/opt/nvidia/tao/jp5``                                         ``/opt/nvidia/tao/tao-converter``
==================== =============================================================== =================================

Converting ``.etlt`` to a TensorRT Engine Plan
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Here are some examples of generating a TensorRT engine file with
``tao-converter``. These examples use the
`PeopleSemSegnet Shuffleseg model `__:

Generate an engine file for the ``fp16`` data type
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code:: bash

   mkdir -p /workspaces/isaac_ros-dev/models && \
   /opt/nvidia/tao/tao-converter -k tlt_encode \
     -d 3,544,960 \
     -p input_2:0,1x3x544x960,1x3x544x960,1x3x544x960 \
     -t fp16 \
     -e /workspaces/isaac_ros-dev/models/peoplesemsegnet_shuffleseg.engine \
     -o argmax_1 \
     peoplesemsegnet_shuffleseg_etlt.etlt

.. note::

   The specific values used in the command above are retrieved from the
   **PeopleSemSegnet** page under the **Overview** tab. The model input node
   name and output node name can be found in
   ``peoplesemsegnet_shuffleseg_cache.txt`` from the **File Browser** tab.
   The output file is specified using the ``-e`` option. The tool needs
   write permission to the output directory.

   A detailed explanation of the input parameters is available `here `__.
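Before using the generated engine elsewhere, it can be worth confirming that the plan deserializes and exposes the expected bindings. Below is a minimal verification sketch using the TensorRT Python bindings (installable via ``pip install tensorrt tensorrt_bindings``, as in the Polygraphy steps later on this page). It assumes the TensorRT 8.x binding API, matching the ``tensorrt8.0`` converter used above:

.. code:: python

   import tensorrt as trt

   # Path produced by the -e option in the command above.
   ENGINE_PATH = '/workspaces/isaac_ros-dev/models/peoplesemsegnet_shuffleseg.engine'

   logger = trt.Logger(trt.Logger.WARNING)
   with open(ENGINE_PATH, 'rb') as f, trt.Runtime(logger) as runtime:
       engine = runtime.deserialize_cuda_engine(f.read())

   # Print each binding with its direction, dtype, and shape.
   # num_bindings/get_binding_* is the TensorRT 8.x API; newer releases
   # replace it with num_io_tensors/get_tensor_name.
   for i in range(engine.num_bindings):
       direction = 'input' if engine.binding_is_input(i) else 'output'
       print(f'{direction}: {engine.get_binding_name(i)} '
             f'dtype={engine.get_binding_dtype(i)} '
             f'shape={tuple(engine.get_binding_shape(i))}')

If the engine was built correctly, this prints one input (``input_2:0``) and one output (``argmax_1``) with the shapes requested via ``-d`` and ``-p``.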
Generate an engine file for the data type ``int8``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Create the models directory:

.. code:: bash

   mkdir -p /workspaces/isaac_ros-dev/models

Download the calibration cache file:

.. note::

   Check the model's page on NGC for the latest ``wget`` command.

.. code:: bash

   wget https://api.ngc.nvidia.com/v2/models/nvidia/tao/peoplesemsegnet/versions/deployable_shuffleseg_unet_v1.0/files/peoplesemsegnet_shuffleseg_cache.txt

Generate the engine file:

.. code:: bash

   /opt/nvidia/tao/tao-converter -k tlt_encode \
     -d 3,544,960 \
     -p input_2:0,1x3x544x960,1x3x544x960,1x3x544x960 \
     -t int8 \
     -c peoplesemsegnet_shuffleseg_cache.txt \
     -e /workspaces/isaac_ros-dev/models/peoplesemsegnet_shuffleseg.engine \
     -o argmax_1 \
     peoplesemsegnet_shuffleseg_etlt.etlt

.. note::

   The calibration cache file (specified using the ``-c`` option) is required
   to generate the ``int8`` engine file. This file is provided in the
   **File Browser** tab of the model's page on NGC.

Using ``trtexec`` to convert an ONNX model to a TensorRT Plan File
------------------------------------------------------------------

Assuming that a model called ``model.onnx`` is available, the conversion is performed using:

.. code:: bash

   /usr/src/tensorrt/bin/trtexec --onnx=model.onnx --saveEngine=model.plan

.. warning::

   Reading the documentation of `trtexec `__ is highly recommended to obtain
   the best performance. In particular, we recommend paying attention to the
   quantization of the model (e.g., ``fp32`` vs. ``fp16`` vs. ``int8``).

Inspecting the Input and Output Binding Names of a Model
--------------------------------------------------------

Deep learning models have ``input_binding_names`` and ``output_binding_names``,
which correspond to the model's inputs and outputs, respectively. These names
are determined by the model itself during export. There are two ways to
determine them, but the **recommended** way is to inspect a TensorRT plan file.

.. note::

   In addition, the ``TensorRTNode`` and ``TritonNode`` have parameters called
   ``input_tensor_names`` and ``output_tensor_names``, which correspond to the
   expected tensor names within the ROS 2 ``TensorList`` message.

Using an ONNX Model File
~~~~~~~~~~~~~~~~~~~~~~~~

If an ONNX model file is used, you can use `netron <https://github.com/lutzroeder/netron>`__
to visualize the model and note down the input and output names and dimensions.
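If you prefer a scriptable alternative to visual inspection, the ``onnx`` Python package can list the same information. This is a minimal sketch, assuming ``onnx`` has been installed with ``pip install onnx`` and that ``model.onnx`` is in the current directory:

.. code:: python

   import onnx

   model = onnx.load('model.onnx')

   # Graph inputs and outputs carry the names to use as binding names,
   # together with the declared tensor shapes (0 denotes a dynamic dimension).
   for tensor in model.graph.input:
       dims = [d.dim_value for d in tensor.type.tensor_type.shape.dim]
       print(f'input:  {tensor.name} shape={dims}')
   for tensor in model.graph.output:
       dims = [d.dim_value for d in tensor.type.tensor_type.shape.dim]
       print(f'output: {tensor.name} shape={dims}')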
Using a TensorRT Plan File
~~~~~~~~~~~~~~~~~~~~~~~~~~

If a TensorRT plan file is used, you can use NVIDIA's
`Polygraphy <https://github.com/NVIDIA/TensorRT/tree/main/tools/Polygraphy>`__
tool to determine the binding names.

1. Install TensorRT's Python bindings and the Polygraphy tool:

   .. code:: bash

      pip install tensorrt tensorrt_bindings
      pip install colored polygraphy --extra-index-url https://pypi.ngc.nvidia.com

2. Add ``/home/admin/.local/bin`` to your ``PATH`` to use ``polygraphy`` more conveniently:

   .. code:: bash

      export PATH="/home/admin/.local/bin:$PATH"

3. Obtain the desired model. In this case, we'll show how to get the ``PeopleSemSegnet ShuffleSeg`` network:

   .. code:: bash

      mkdir -p /tmp/models/peoplesemsegnet_shuffleseg/1 && \
      cd /tmp/models/peoplesemsegnet_shuffleseg && \
      wget https://api.ngc.nvidia.com/v2/models/nvidia/tao/peoplesemsegnet/versions/deployable_shuffleseg_unet_v1.0/files/peoplesemsegnet_shuffleseg_etlt.etlt && \
      wget https://api.ngc.nvidia.com/v2/models/nvidia/tao/peoplesemsegnet/versions/deployable_shuffleseg_unet_v1.0/files/peoplesemsegnet_shuffleseg_cache.txt

4. Convert the obtained model from an ``.etlt`` file to a plan file (called ``model.plan``):

   .. code:: bash

      /opt/nvidia/tao/tao-converter -k tlt_encode \
        -d 3,544,960 \
        -p input_2:0,1x3x544x960,1x3x544x960,1x3x544x960 \
        -t int8 \
        -c peoplesemsegnet_shuffleseg_cache.txt \
        -e /tmp/models/peoplesemsegnet_shuffleseg/1/model.plan \
        -o argmax_1 \
        peoplesemsegnet_shuffleseg_etlt.etlt

5. Go to the directory containing the converted ``PeopleSemSegNet ShuffleSeg`` model:

   .. code:: bash

      cd /tmp/models/peoplesemsegnet_shuffleseg/1

6. Use ``polygraphy`` to inspect the names of the inputs and outputs of the model. In this case, the converted model is called ``model.plan``:

   .. code:: bash

      polygraphy inspect model model.plan

   The expected output should look like this:

   .. code:: bash

      [I] Loading bytes from /tmp/models/peoplesemsegnet_shuffleseg/1/model.plan
      [I] ==== TensorRT Engine ====
          Name: Unnamed Network 0 | Explicit Batch Engine

          ---- 1 Engine Input(s) ----
          {input_2:0 [dtype=float32, shape=(1, 3, 544, 960)]}

          ---- 1 Engine Output(s) ----
          {argmax_1 [dtype=int32, shape=(1, 544, 960, 1)]}

          ---- Memory ----
          Device Memory: 21269504 bytes

          ---- 1 Profile(s) (2 Tensor(s) Each) ----
          - Profile: 0
              Tensor: input_2:0 (Input), Index: 0 | Shapes: min=(1, 3, 544, 960), opt=(1, 3, 544, 960), max=(1, 3, 544, 960)
              Tensor: argmax_1 (Output), Index: 1 | Shape: (1, 544, 960, 1)

          ---- 73 Layer(s) ----

In this case, the ``input_binding_names`` value for this network is
``['input_2:0']``, and the ``output_binding_names`` value is ``['argmax_1']``.
The shape of each tensor can also be read from this output.

These values can be used as the ``input_binding_names`` and
``output_binding_names`` parameters for the ``TensorRTNode`` or ``TritonNode``,
as shown in the sketch below. If a model has multiple inputs or outputs, they
must all be passed in as a string list.

Once again, ensure that the ``TensorRTNode`` or ``TritonNode``'s
``input_tensor_names`` and ``output_tensor_names`` parameters are correctly set
according to the names of the ROS 2 ``TensorList`` message obtained from any
upstream nodes or expected by any downstream nodes.
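For illustration, here is a minimal sketch of how these values might be wired into a ROS 2 launch file for a ``TensorRTNode``. The package name, plugin name, and tensor names below are assumptions made for this example; consult the documentation of your Isaac ROS release for the exact values:

.. code:: python

   import launch
   from launch_ros.actions import ComposableNodeContainer
   from launch_ros.descriptions import ComposableNode


   def generate_launch_description():
       # Assumed package/plugin names; verify against your Isaac ROS release.
       tensor_rt_node = ComposableNode(
           package='isaac_ros_tensor_rt',
           plugin='nvidia::isaac_ros::dnn_inference::TensorRtNode',
           name='tensor_rt',
           parameters=[{
               'engine_file_path': '/tmp/models/peoplesemsegnet_shuffleseg/1/model.plan',
               # Binding names discovered with polygraphy above.
               'input_binding_names': ['input_2:0'],
               'output_binding_names': ['argmax_1'],
               # Tensor names inside the ROS 2 TensorList messages; example
               # values that must match your upstream/downstream nodes.
               'input_tensor_names': ['input_tensor'],
               'output_tensor_names': ['output_tensor'],
           }],
       )
       container = ComposableNodeContainer(
           name='tensor_rt_container',
           namespace='',
           package='rclcpp_components',
           executable='component_container_mt',
           composable_node_descriptions=[tensor_rt_node],
       )
       return launch.LaunchDescription([container])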