isaac_ros_grounding_dino#
Source code available on GitHub.
Quickstart#
Set Up Development Environment#
Set up your development environment by following the instructions in getting started.
(Optional) Install dependencies for any sensors you want to use by following the sensor-specific guides.
Note
We strongly recommend installing all sensor dependencies before starting any quickstarts. Some sensor dependencies require restarting the development environment during installation, which will interrupt the quickstart process.
Download Quickstart Assets#
Download quickstart data from NGC:
Make sure required libraries are installed.
sudo apt-get install -y curl jq tar
Then, run these commands to download the asset from NGC:
NGC_ORG="nvidia"
NGC_TEAM="isaac"
PACKAGE_NAME="isaac_ros_grounding_dino"
NGC_RESOURCE="isaac_ros_grounding_dino_assets"
NGC_FILENAME="quickstart.tar.gz"
MAJOR_VERSION=4
MINOR_VERSION=0
VERSION_REQ_URL="https://catalog.ngc.nvidia.com/api/resources/versions?orgName=$NGC_ORG&teamName=$NGC_TEAM&name=$NGC_RESOURCE&isPublic=true&pageNumber=0&pageSize=100&sortOrder=CREATED_DATE_DESC"
AVAILABLE_VERSIONS=$(curl -s \
    -H "Accept: application/json" "$VERSION_REQ_URL")
LATEST_VERSION_ID=$(echo $AVAILABLE_VERSIONS | jq -r "
    .recipeVersions[]
    | .versionId as \$v
    | \$v | select(test(\"^\\\\d+\\\\.\\\\d+\\\\.\\\\d+$\"))
    | split(\".\") | {major: .[0]|tonumber, minor: .[1]|tonumber, patch: .[2]|tonumber}
    | select(.major == $MAJOR_VERSION and .minor <= $MINOR_VERSION)
    | \$v
    " | sort -V | tail -n 1
)
if [ -z "$LATEST_VERSION_ID" ]; then
    echo "No corresponding version found for Isaac ROS $MAJOR_VERSION.$MINOR_VERSION"
    echo "Found versions:"
    echo $AVAILABLE_VERSIONS | jq -r '.recipeVersions[].versionId'
else
    mkdir -p ${ISAAC_ROS_WS}/isaac_ros_assets && \
    FILE_REQ_URL="https://api.ngc.nvidia.com/v2/resources/$NGC_ORG/$NGC_TEAM/$NGC_RESOURCE/\
versions/$LATEST_VERSION_ID/files/$NGC_FILENAME" && \
    curl -LO --request GET "${FILE_REQ_URL}" && \
    tar -xf ${NGC_FILENAME} -C ${ISAAC_ROS_WS}/isaac_ros_assets && \
    rm ${NGC_FILENAME}
fi
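The `jq` filter in the script above selects the newest asset version whose major version matches `MAJOR_VERSION` and whose minor version does not exceed `MINOR_VERSION`. A minimal sketch of that selection logic on sample data (not real NGC output), using `awk` and `sort -V` in place of `jq`:

```shell
# Illustrative version list standing in for NGC's response.
AVAILABLE="3.2.0
4.0.0
4.0.1
4.1.0"
MAJOR_VERSION=4
MINOR_VERSION=0
# Keep versions with major == 4 and minor <= 0, then take the newest.
LATEST_VERSION_ID=$(echo "$AVAILABLE" \
  | awk -F. -v M="$MAJOR_VERSION" -v m="$MINOR_VERSION" '$1 == M && $2 <= m' \
  | sort -V | tail -n 1)
echo "$LATEST_VERSION_ID"   # 4.0.1
```

With the sample data, `4.1.0` is rejected (minor too high) and `4.0.1` wins over `4.0.0` under version sort, mirroring what the quickstart script does against the real catalog.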
Build isaac_ros_grounding_dino#
Activate the Isaac ROS environment:
isaac-ros activate
Install the prebuilt Debian package:
sudo apt-get update
sudo apt-get install -y ros-jazzy-isaac-ros-grounding-dino && \
sudo apt-get install -y ros-jazzy-isaac-ros-grounding-dino-models-install
Download and set up (convert ONNX to TensorRT engine plan) the pre-trained Grounding DINO model:
sudo apt-get update
ros2 run isaac_ros_grounding_dino_models_install install_grounding_dino_models.sh --eula
Note
The time required to convert the ONNX model to a TensorRT engine plan varies across platforms. On Jetson AGX Thor, for example, the engine conversion can take 10 to 15 minutes to complete.
Install Git LFS:
sudo apt-get install -y git-lfs && git lfs install
Clone this repository under ${ISAAC_ROS_WS}/src:
cd ${ISAAC_ROS_WS}/src && \
git clone -b release-4.0 https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_object_detection.git isaac_ros_object_detection
Activate the Isaac ROS environment:
isaac-ros activate
Use rosdep to install the package's dependencies:
sudo apt-get update
rosdep update && rosdep install --from-paths ${ISAAC_ROS_WS}/src/isaac_ros_object_detection/isaac_ros_grounding_dino --ignore-src -y
Download and set up (convert ONNX to TensorRT engine plan) the pre-trained Grounding DINO model:
sudo apt-get install -y ros-jazzy-isaac-ros-grounding-dino-models-install && \
ros2 run isaac_ros_grounding_dino_models_install install_grounding_dino_models.sh --eula
Note
The time required to convert the ONNX model to a TensorRT engine plan varies across platforms. On Jetson AGX Thor, for example, the engine conversion can take 10 to 15 minutes to complete.
Build the package from source:
cd ${ISAAC_ROS_WS} && \
colcon build --symlink-install --packages-up-to isaac_ros_grounding_dino --base-paths ${ISAAC_ROS_WS}/src/isaac_ros_object_detection/isaac_ros_grounding_dino
Source the ROS workspace:
Note
Make sure to repeat this step in every terminal created inside the Isaac ROS environment.
Since this package was built from source, the enclosing workspace must be sourced for ROS to be able to find the package’s contents.
source install/setup.bash
Run Launch File#
Continuing inside the Isaac ROS environment, install the following dependencies:
sudo apt-get update
sudo apt-get install -y ros-jazzy-isaac-ros-examples
Run the following launch file to spin up a demo of this package using the quickstart rosbag:
ros2 launch isaac_ros_examples isaac_ros_examples.launch.py \
    launch_fragments:=grounding_dino \
    interface_specs_file:=${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_grounding_dino/quickstart_interface_specs.json \
    model_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/grounding_dino/grounding_dino_model.onnx \
    engine_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/grounding_dino/grounding_dino_model.plan
Open a second terminal inside the Isaac ROS environment:
isaac-ros activate
Run the rosbag file to simulate an image stream:
ros2 bag play -l ${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_grounding_dino/quickstart.bag
Ensure that you have already set up your RealSense camera using the RealSense setup tutorial. If you have not, please set up the sensor and then restart this quickstart from the beginning.
Continuing inside the Isaac ROS environment, install the following dependencies:
sudo apt-get update
sudo apt-get install -y ros-jazzy-isaac-ros-examples ros-jazzy-isaac-ros-realsense
Run the following launch file to spin up a demo of this package using a RealSense camera:
ros2 launch isaac_ros_examples isaac_ros_examples.launch.py \
    launch_fragments:=realsense_mono_rect,grounding_dino \
    model_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/grounding_dino/grounding_dino_model.onnx \
    engine_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/grounding_dino/grounding_dino_model.plan
Ensure that you have already set up your ZED camera using the ZED setup tutorial.
Continuing inside the Isaac ROS environment, install dependencies:
sudo apt-get update
sudo apt-get install -y ros-jazzy-isaac-ros-examples ros-jazzy-isaac-ros-image-proc ros-jazzy-isaac-ros-zed
Run the following launch file to spin up a demo of this package using a ZED Camera:
ros2 launch isaac_ros_examples isaac_ros_examples.launch.py \
    launch_fragments:=zed_mono_rect,grounding_dino \
    model_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/grounding_dino/grounding_dino_model.onnx \
    engine_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/grounding_dino/grounding_dino_model.plan \
    interface_specs_file:=${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_grounding_dino/zed2_quickstart_interface_specs.json
Note
If you are using the ZED X series, replace zed2_quickstart_interface_specs.json with zedx_quickstart_interface_specs.json in the above command.
Visualize Results#
Open a new terminal inside the Isaac ROS environment:
isaac-ros activate
Run the Grounding DINO visualization script:
ros2 run isaac_ros_grounding_dino isaac_ros_grounding_dino_visualizer.py
Open another terminal inside the Isaac ROS environment:
isaac-ros activate
Visualize and validate the output of the package with rqt_image_view:
ros2 run rqt_image_view rqt_image_view /grounding_dino_processed_image
Your output should look like this:
Open another terminal inside the Isaac ROS environment:
isaac-ros activate
Dynamically change the prompt using the set_prompt service:
ros2 service call /set_prompt isaac_ros_grounding_dino_interfaces/srv/SetPrompt "{prompt: 'shopping bag.person.'}"
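Note that the prompt above is period-delimited: each class name ends with a period, as in 'shopping bag.person.'. A small sketch of assembling such a prompt string from individual class names (the class list here is illustrative, not part of the quickstart):

```shell
# printf repeats its format once per argument, appending a period to each class.
PROMPT=$(printf '%s.' "shopping bag" "person" "chair")
echo "$PROMPT"   # shopping bag.person.chair.
# The resulting string can then be passed to the service shown above, e.g.:
# ros2 service call /set_prompt isaac_ros_grounding_dino_interfaces/srv/SetPrompt "{prompt: '$PROMPT'}"
```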
Note
Grounding DINO is designed to perform object detection on previously unseen objects without model retraining. As a result, inference requires significant GPU compute. Refer to performance of the model for your target platform to determine which model to use.
More powerful discrete GPUs can outperform all other platforms for this task and should be preferred if higher performance is required. Interleaving object detection with other tasks rather than running continuously can be a more effective solution as well. Finally, if runtime performance is critical and offline training resources are available, developers can train RT-DETR for their own target objects using synthetic data generation and/or real-world data for faster object detection.
Troubleshooting#
Isaac ROS Troubleshooting#
For solutions to problems with Isaac ROS, see troubleshooting.
Deep Learning Troubleshooting#
For solutions to problems with using DNN models, see troubleshooting deeplearning.
API#
Usage#
ros2 launch isaac_ros_grounding_dino isaac_ros_grounding_dino.launch.py \
    model_file_path:=<path to .onnx> \
    engine_file_path:=<path to .plan> \
    input_image_width:=<input image width> \
    input_image_height:=<input image height> \
    network_image_width:=<network image width> \
    network_image_height:=<network image height> \
    input_tensor_names:=<input tensor names> \
    input_binding_names:=<input binding names> \
    output_tensor_names:=<output tensor names> \
    output_binding_names:=<output binding names> \
    tensorrt_verbose:=<TensorRT verbosity> \
    force_engine_update:=<force TensorRT update> \
    confidence_threshold:=<confidence threshold>
GroundingDinoPreprocessorNode#
ROS Parameters#
| ROS Parameter | Type | Default | Description |
|---|---|---|---|
|  |  |  | The default text prompt to use for object detection. |
|  |  |  | The timeout (in seconds) for the GetTextTokens service call. |
|  |  |  | The timeout (in seconds) for the GetTextTokens service discovery. |
ROS Topics Subscribed#
| ROS Topic | Interface | Description |
|---|---|---|
|  |  | The tensor that contains the encoded image data. |
ROS Topics Published#
| ROS Topic | Interface | Description |
|---|---|---|
|  |  | Tensor list containing encoded image and text tensors. |
ROS Services Advertised#
| ROS Service | Interface | Description |
|---|---|---|
|  |  | Service to set a new text prompt for object detection. |
ROS Services Requested#
| ROS Service | Interface | Description |
|---|---|---|
|  |  | Service to tokenize text prompts and generate positive maps. |
|  |  | Service to synchronize class IDs and positive maps between the pre-processor and decoder nodes. |
GroundingDinoTextTokenizer#
ROS Services Advertised#
| ROS Service | Interface | Description |
|---|---|---|
|  |  | Service to tokenize text prompts into BERT tokens and generate positive maps for label-to-token mapping. |
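The "positive map" produced here associates each class in the prompt with the positions of its tokens in the tokenized text. The node uses BERT tokenization; as a rough illustration of the idea only, the sketch below approximates tokens with whitespace-split words (this is not the node's implementation):

```shell
# Split the period-delimited prompt into classes, then record each class's
# token span, using word splitting as a crude stand-in for BERT tokenization.
POSITIVE_MAP=$(echo "shopping bag.person." | awk -F. '{
  pos = 1
  for (i = 1; i <= NF; i++) {
    if ($i == "") continue          # skip the empty field after the final period
    n = split($i, words, " ")       # token count for this class (approximate)
    printf "class \"%s\" -> tokens %d-%d\n", $i, pos, pos + n - 1
    pos += n
  }
}')
echo "$POSITIVE_MAP"
# class "shopping bag" -> tokens 1-2
# class "person" -> tokens 3-3
```

The real service performs the same label-to-token bookkeeping over BERT token IDs so the decoder can attribute detection scores back to the correct class name.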
GroundingDinoDecoderNode#
ROS Parameters#
| ROS Parameter | Type | Default | Description |
|---|---|---|---|
|  |  |  | The name of the boxes tensor binding in the input tensor list. |
|  |  |  | The name of the scores tensor binding in the input tensor list. |
|  |  |  | The minimum score required for a particular bounding box to be published. |
|  |  |  | The width of the resized image, for scaling the final bounding box to match the original dimensions. |
|  |  |  | The height of the resized image, for scaling the final bounding box to match the original dimensions. |
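The two image-dimension parameters above let the decoder map detections from network space back to the original image. As an illustration of that scaling arithmetic only (the normalized center-format box below is a made-up example, not the node's actual tensor layout):

```shell
# Scale a normalized (cx, cy, w, h) box to pixel corners for a 640x480 image.
BOX=$(awk 'BEGIN {
  W = 640; H = 480                       # example width/height parameter values
  cx = 0.5; cy = 0.5; w = 0.25; h = 0.5  # hypothetical normalized detection
  printf "x_min=%d y_min=%d x_max=%d y_max=%d", \
         (cx - w/2) * W, (cy - h/2) * H, (cx + w/2) * W, (cy + h/2) * H
}')
echo "$BOX"   # x_min=240 y_min=120 x_max=400 y_max=360
```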
ROS Topics Subscribed#
| ROS Topic | Interface | Description |
|---|---|---|
|  |  | The tensor that represents the inferred aligned bounding boxes and scores. |
ROS Topics Published#
| ROS Topic | Interface | Description |
|---|---|---|
|  |  | Aligned image bounding boxes with detection class. |
ROS Services Advertised#
| ROS Service | Interface | Description |
|---|---|---|
|  |  | Service to receive and synchronize class IDs and positive maps from the pre-processor node. |