isaac_ros_foundationpose

Source code on GitHub.

Quickstart

Set Up Development Environment

  1. Set up your development environment by following the instructions in getting started.

  2. Clone isaac_ros_common under ${ISAAC_ROS_WS}/src.

    cd ${ISAAC_ROS_WS}/src && \
       git clone https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_common.git
    
  3. (Optional) Install dependencies for any sensors you want to use by following the sensor-specific guides.

    Warning

    We strongly recommend installing all sensor dependencies before starting any quickstarts. Some sensor dependencies require restarting the Isaac ROS Dev container during installation, which will interrupt the quickstart process.

Download Quickstart Assets

  1. Download quickstart data from NGC:

    Make sure required libraries are installed.

    sudo apt-get install -y curl tar
    

    Then, run these commands to download the asset from NGC.

    NGC_ORG="nvidia"
    NGC_TEAM="isaac"
    NGC_RESOURCE="isaac_ros_assets"
    NGC_VERSION="isaac_ros_foundationpose"
    NGC_FILENAME="quickstart.tar.gz"
    
    REQ_URL="https://api.ngc.nvidia.com/v2/resources/$NGC_ORG/$NGC_TEAM/$NGC_RESOURCE/versions/$NGC_VERSION/files/$NGC_FILENAME"
    
    mkdir -p ${ISAAC_ROS_WS}/isaac_ros_assets/${NGC_VERSION} && \
        curl -LO --request GET "${REQ_URL}" && \
        tar -xf ${NGC_FILENAME} -C ${ISAAC_ROS_WS}/isaac_ros_assets/${NGC_VERSION} && \
        rm ${NGC_FILENAME}
    
  2. Download the pre-trained FoundationPose models from NGC:

    wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/isaac/foundationpose/versions/1.0.0/zip -O foundationpose_1.0.0.zip
    
  3. Copy the models to the required directory:

    mkdir -p ${ISAAC_ROS_WS}/isaac_ros_assets/models/foundationpose && \
       unzip foundationpose_1.0.0.zip -d ${ISAAC_ROS_WS}/isaac_ros_assets/models/foundationpose
    
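Before building, it can help to confirm that the assets and models landed where the later steps expect them. A minimal sketch; the `check_dir` helper is our own addition, not part of Isaac ROS:

```shell
# Optional sanity check: the directories below are where the quickstart
# commands above extract the assets and models. check_dir prints OK or
# MISSING and returns a matching exit status.
check_dir() {
  if [ -d "$1" ]; then echo "OK: $1"; else echo "MISSING: $1"; return 1; fi
}

# Skip silently if ISAAC_ROS_WS is not set (e.g. outside the dev container).
if [ -n "${ISAAC_ROS_WS:-}" ]; then
  check_dir "${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_foundationpose"
  check_dir "${ISAAC_ROS_WS}/isaac_ros_assets/models/foundationpose"
fi
```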

Build isaac_ros_foundationpose

  1. Launch the Docker container using the run_dev.sh script:

    cd ${ISAAC_ROS_WS}/src/isaac_ros_common && \
    ./scripts/run_dev.sh
    
  2. Install the prebuilt Debian package:

    sudo apt-get install -y ros-humble-isaac-ros-foundationpose
    

Run Launch File

  1. Inside the container, convert models from .etlt to TensorRT engine plans:

    Convert the refine model:

    /opt/nvidia/tao/tao-converter -k foundationpose -t fp16 -e ${ISAAC_ROS_WS}/isaac_ros_assets/models/foundationpose/refine_trt_engine.plan -p input1,1x160x160x6,1x160x160x6,252x160x160x6 -p input2,1x160x160x6,1x160x160x6,252x160x160x6 -o output1,output2 ${ISAAC_ROS_WS}/isaac_ros_assets/models/foundationpose/refine_model.etlt
    

    Convert the score model:

    /opt/nvidia/tao/tao-converter -k foundationpose -t fp16 -e ${ISAAC_ROS_WS}/isaac_ros_assets/models/foundationpose/score_trt_engine.plan -p input1,1x160x160x6,1x160x160x6,252x160x160x6 -p input2,1x160x160x6,1x160x160x6,252x160x160x6 -o output1 ${ISAAC_ROS_WS}/isaac_ros_assets/models/foundationpose/score_model.etlt
    

    Note

    The model conversion time varies across different platforms. On Jetson AGX Orin, the engine conversion process takes ~10-15 minutes to complete.
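    After conversion, a quick check that both engine plans exist and are nonempty can save a failed launch later. A minimal sketch; the `plan_ok` helper is our own addition, not part of the toolchain:

    ```shell
    # plan_ok succeeds only for an existing, nonempty file.
    plan_ok() { [ -s "$1" ]; }

    # The plan paths match the -e arguments passed to tao-converter above.
    MODEL_DIR="${ISAAC_ROS_WS:-}/isaac_ros_assets/models/foundationpose"
    for plan in refine_trt_engine.plan score_trt_engine.plan; do
      if plan_ok "${MODEL_DIR}/${plan}"; then
        echo "Found: ${plan}"
      else
        echo "Missing or empty: ${plan} (re-run tao-converter)" >&2
      fi
    done
    ```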

  2. Complete the Isaac ROS RT-DETR tutorial.

  3. Continuing inside the container, install the following dependencies:

    sudo apt-get install -y ros-humble-isaac-ros-examples
    
  4. Run the following launch file to spin up a demo of this package:

    Launch isaac_ros_foundationpose:

    ros2 launch isaac_ros_examples isaac_ros_examples.launch.py launch_fragments:=foundationpose interface_specs_file:=${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_foundationpose/quickstart_interface_specs.json mesh_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_foundationpose/Mustard/textured_simple.obj texture_path:=${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_foundationpose/Mustard/texture_map.png score_engine_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/foundationpose/score_trt_engine.plan refine_engine_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/foundationpose/refine_trt_engine.plan rt_detr_engine_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/synthetica_detr/sdetr_grasp.plan
    

    Then open another terminal, and enter the Docker container again:

    cd ${ISAAC_ROS_WS}/src/isaac_ros_common && \
       ./scripts/run_dev.sh
    

    Then, play the ROS bag:

    ros2 bag play -l ${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_foundationpose/quickstart.bag
    
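While the bag plays, you can confirm from another container terminal that pose estimates are flowing. A minimal sketch; the topic name follows the node's published-topics table in the API section and may be remapped by the launch file, and the `have_cmd` guard is our own addition so the snippet is a no-op outside the container:

```shell
# have_cmd: true only if the given command is on PATH.
have_cmd() { command -v "$1" >/dev/null 2>&1; }

if have_cmd ros2; then
  # Print one pose estimate, then monitor the publish rate (Ctrl-C to stop).
  ros2 topic echo /pose_estimation/output --once
  ros2 topic hz /pose_estimation/output
else
  echo "ros2 not found; run this inside the Isaac ROS Dev container" >&2
fi
```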

Visualize Results

  1. Open a new terminal inside the Docker container:

    cd ${ISAAC_ROS_WS}/src/isaac_ros_common && \
       ./scripts/run_dev.sh
    
    1. Launch RViz2 to visualize the output:

      rviz2 -d $(ros2 pkg prefix isaac_ros_foundationpose --share)/rviz/foundationpose.rviz
      
    2. You should see an RViz2 window open, as shown below, with the 3D bounding box overlaid on the input image:

    https://media.githubusercontent.com/media/NVIDIA-ISAAC-ROS/.github/main/resources/isaac_ros_docs/repositories_and_packages/isaac_ros_pose_estimation/isaac_ros_foundationpose/foundation_pose_rviz2.png

Note

FoundationPose is designed to perform pose estimation on previously unseen objects without model retraining. As a result, the first detection requires significant GPU compute. Once an initial pose estimate has been determined, however, tracking can be significantly faster. Refer to the model's performance on your target platform to determine which model to use.

More powerful discrete GPUs outperform other platforms at this task and should be preferred when higher performance is required. Interleaving pose estimation with other tasks, rather than running it continuously, can also be an effective strategy. Finally, if runtime performance is critical and offline training resources are available, developers can train CenterPose on their own target objects, using synthetic data generation and/or real-world data, for faster pose estimation.

Try More Examples

To continue your exploration, check out the following suggested examples:

Note

FoundationPose expects the origin frame to be at the center of the mesh.

Troubleshooting

Object Detections are Offset from the Actual Object

Symptom

The object detections are offset from the actual object.

Solution

This may happen because the vertices/faces in the .obj file are not centered on the object; they may instead sit at a corner or edge of the object. To fix this, follow this tutorial.

Isaac ROS Troubleshooting

For solutions to problems with Isaac ROS, please check here.

Deep Learning Troubleshooting

For solutions to problems with using DNN models, please check here.

API

Usage

ros2 launch isaac_ros_foundationpose isaac_ros_foundationpose.launch.py refine_model_file_path:=<path to refine onnx model> refine_engine_file_path:=<path to refine model .plan> score_model_file_path:=<path to score onnx model> score_engine_file_path:=<path to score model .plan> mesh_file_path:=<path to object mesh file> texture_path:=<path to texture map> launch_rviz:=<enable rviz> launch_bbox_to_mask:=<enable bbox to mask converter> mask_height:=<converted mask height> mask_width:=<converted mask width>
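Filled in with the quickstart asset layout, an invocation might look like the following. The paths are illustrative and assume the assets and engine plans were generated as in the quickstart; the command is assembled as a string so the pieces are easy to adapt:

```shell
# Illustrative only: substitute your own workspace and asset paths.
ASSETS="${ISAAC_ROS_WS:-/workspaces/isaac_ros-dev}/isaac_ros_assets"

CMD="ros2 launch isaac_ros_foundationpose isaac_ros_foundationpose.launch.py"
CMD="$CMD mesh_file_path:=${ASSETS}/isaac_ros_foundationpose/Mustard/textured_simple.obj"
CMD="$CMD texture_path:=${ASSETS}/isaac_ros_foundationpose/Mustard/texture_map.png"
CMD="$CMD refine_engine_file_path:=${ASSETS}/models/foundationpose/refine_trt_engine.plan"
CMD="$CMD score_engine_file_path:=${ASSETS}/models/foundationpose/score_trt_engine.plan"
CMD="$CMD launch_rviz:=True"

echo "$CMD"   # inspect the command, then run it with: eval "$CMD"
```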

FoundationPose Node

ROS Parameters

ROS Parameter | Type | Default | Description
--- | --- | --- | ---
mesh_file_path | string | textured_simple.obj | The absolute path to the target object mesh file.
texture_path | string | textured_map.png | The absolute path to the target object texture file.
min_depth | float | 0.1 | Minimum allowed Z-axis value of the point cloud.
max_depth | float | 0.4 | Maximum allowed X-, Y-, and Z-axis values of the point cloud to threshold.
refine_iterations | int | 1 | The number of refinement iterations. More iterations can enhance accuracy at the cost of processing time.
refine_model_file_path | string | /tmp/refine_model.onnx | The absolute path to the refinement model file in the local file system.
refine_engine_file_path | string | /tmp/refine_trt_engine.plan | The absolute path to either where you want the refinement TensorRT engine plan to be generated or where a pre-generated engine plan file is located.
score_model_file_path | string | /tmp/score_model.onnx | The absolute path to the score model file in the local file system.
score_engine_file_path | string | /tmp/score_trt_engine.plan | The absolute path to either where you want the score TensorRT engine plan to be generated or where a pre-generated engine plan file is located.
refine_input_tensor_names | string | [''] | A list of tensor names of the refinement model to be bound to the specified input binding names. Bindings occur in sequential order.
refine_input_binding_names | string | [''] | A list of input tensor binding names specified by the refinement model.
score_input_tensor_names | string | [''] | A list of tensor names of the score model to be bound to the specified input binding names. Bindings occur in sequential order.
score_input_binding_names | string | [''] | A list of input tensor binding names specified by the score model.
refine_output_tensor_names | string | [''] | A list of tensor names to be bound to the specified output binding names for the refinement model.
refine_output_binding_names | string | [''] | A list of output tensor binding names specified by the refinement model.
score_output_tensor_names | string | [''] | A list of tensor names to be bound to the specified output binding names for the score model.
score_output_binding_names | string | [''] | A list of output tensor binding names specified by the score model.
tf_frame_name | string | fp_object | Name of the frame used when publishing the pose to the TF tree.
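The parameters above can also be supplied from a ROS 2 parameters YAML file. A minimal sketch; the node name foundationpose_node and the /tmp paths are illustrative (the actual node name depends on how your launch file composes the node):

```yaml
# Illustrative parameter file; pass with: --ros-args --params-file <file>
foundationpose_node:
  ros__parameters:
    mesh_file_path: /tmp/textured_simple.obj
    texture_path: /tmp/texture_map.png
    min_depth: 0.1
    max_depth: 0.4
    refine_iterations: 1
    refine_engine_file_path: /tmp/refine_trt_engine.plan
    score_engine_file_path: /tmp/score_trt_engine.plan
    tf_frame_name: fp_object
```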

ROS Topics Subscribed

ROS Topic | Interface | Description
--- | --- | ---
pose_estimation/depth_image | sensor_msgs/Image | The input depth image.
pose_estimation/segmentation | sensor_msgs/Image | The input segmentation mask.
pose_estimation/image | sensor_msgs/Image | The input color image (rectified).
pose_estimation/camera_info | sensor_msgs/CameraInfo | The input image camera_info.

ROS Topics Published

ROS Topic | Interface | Description
--- | --- | ---
pose_estimation/output | vision_msgs/Detection3DArray | The output pose estimate.
pose_estimation/pose_matrix_output | isaac_ros_tensor_list_interfaces/TensorList | The output pose matrix, used as the input for next-frame tracking.

FoundationPose Tracking Node

ROS Parameters

ROS Parameter | Type | Default | Description
--- | --- | --- | ---
mesh_file_path | string | textured_simple.obj | The absolute path to the target object mesh file.
texture_path | string | textured_map.png | The absolute path to the target object texture file.
min_depth | float | 0.1 | Minimum allowed Z-axis value of the point cloud.
max_depth | float | 0.4 | Maximum allowed X-, Y-, and Z-axis values of the point cloud to threshold.
refine_model_file_path | string | /tmp/refine_model.onnx | The absolute path to the refinement model file in the local file system.
refine_engine_file_path | string | /tmp/refine_trt_engine.plan | The absolute path to either where you want the refinement TensorRT engine plan to be generated or where a pre-generated engine plan file is located.
refine_input_tensor_names | string | [''] | A list of tensor names of the refinement model to be bound to the specified input binding names. Bindings occur in sequential order.
refine_input_binding_names | string | [''] | A list of input tensor binding names specified by the refinement model.
refine_output_tensor_names | string | [''] | A list of tensor names to be bound to the specified output binding names for the refinement model.
refine_output_binding_names | string | [''] | A list of output tensor binding names specified by the refinement model.
tf_frame_name | string | fp_object | Name of the frame used when publishing the pose to the TF tree.

ROS Topics Subscribed

ROS Topic | Interface | Description
--- | --- | ---
tracking/depth_image | sensor_msgs/Image | The input depth image.
tracking/pose_input | isaac_ros_tensor_list_interfaces/TensorList | The input pose estimation matrix from the last frame.
tracking/image | sensor_msgs/Image | The input color image (rectified).
tracking/camera_info | sensor_msgs/CameraInfo | The input image camera_info.

ROS Topics Published

ROS Topic | Interface | Description
--- | --- | ---
tracking/output | vision_msgs/Detection3DArray | The output pose estimate.
tracking/pose_matrix_output | isaac_ros_tensor_list_interfaces/TensorList | The output pose matrix, used as the input for next-frame tracking.

FoundationPose Selector Node

ROS Parameters

ROS Parameter | Type | Default
--- | --- | ---
reset_period | int | 20000

ROS Topics Subscribed

ROS Topic | Interface | Description
--- | --- | ---
depth_image | sensor_msgs/Image | The input depth image.
segmentation | sensor_msgs/Image | The input segmentation mask.
image | sensor_msgs/Image | The input color image (rectified).
camera_info | sensor_msgs/CameraInfo | The input image camera_info.
tracking/pose_matrix_output | isaac_ros_tensor_list_interfaces/TensorList | The input pose matrix from last-frame tracking.
pose_estimation/pose_matrix_output | isaac_ros_tensor_list_interfaces/TensorList | The input pose matrix from pose estimation.

ROS Topics Published

ROS Topic | Interface | Description
--- | --- | ---
pose_estimation/depth_image | sensor_msgs/Image | The output depth image for pose estimation.
pose_estimation/segmentation | sensor_msgs/Image | The output segmentation mask for pose estimation.
pose_estimation/image | sensor_msgs/Image | The output color image (rectified) for pose estimation.
pose_estimation/camera_info | sensor_msgs/CameraInfo | The output image camera_info for pose estimation.
tracking/depth_image | sensor_msgs/Image | The output depth image for tracking.
tracking/image | sensor_msgs/Image | The output color image (rectified) for tracking.
tracking/camera_info | sensor_msgs/CameraInfo | The output image camera_info for tracking.
tracking/pose_input | isaac_ros_tensor_list_interfaces/TensorList | The output pose matrix, used as the input for next-frame tracking.

FoundationPose Detection2DToMask Node

ROS Parameters

ROS Parameter | Type | Default | Description
--- | --- | --- | ---
mask_width | int | 640 | The output mask width. FoundationPose expects the mask to have the same dimensions as the RGB image.
mask_height | int | 480 | The output mask height. FoundationPose expects the mask to have the same dimensions as the RGB image.

ROS Topics Subscribed

ROS Topic | Interface | Description
--- | --- | ---
detection2_d | vision_msgs/Detection2D | The input Detection2D message.
detection2_d_array | vision_msgs/Detection2DArray | The input Detection2DArray message. The mask is created from the highest-scoring detection in the array.

ROS Topics Published

ROS Topic | Interface | Description
--- | --- | ---
segmentation | sensor_msgs/Image | The output binary segmentation mask.