isaac_ros_segment_anything

Source code on GitHub.

Quickstart

Set Up Development Environment

  1. Set up your development environment by following the instructions in getting started.

  2. Clone isaac_ros_common under ${ISAAC_ROS_WS}/src.

    cd ${ISAAC_ROS_WS}/src && \
       git clone https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_common.git
    
  3. (Optional) Install dependencies for any sensors you want to use by following the sensor-specific guides.

    Warning

    We strongly recommend installing all sensor dependencies before starting any quickstarts. Some sensor dependencies require restarting the Isaac ROS Dev container during installation, which will interrupt the quickstart process.

Download Quickstart Assets

  1. Download quickstart data from NGC:

    Make sure required libraries are installed.

    sudo apt-get install -y curl tar
    

    Then, run these commands to download the asset from NGC.

    NGC_ORG="nvidia"
    NGC_TEAM="isaac"
    NGC_RESOURCE="isaac_ros_assets"
    NGC_VERSION="isaac_ros_segment_anything"
    NGC_FILENAME="quickstart.tar.gz"
    
    REQ_URL="https://api.ngc.nvidia.com/v2/resources/$NGC_ORG/$NGC_TEAM/$NGC_RESOURCE/versions/$NGC_VERSION/files/$NGC_FILENAME"
    
    mkdir -p ${ISAAC_ROS_WS}/isaac_ros_assets/${NGC_VERSION} && \
        curl -LO --request GET "${REQ_URL}" && \
        tar -xf ${NGC_FILENAME} -C ${ISAAC_ROS_WS}/isaac_ros_assets/${NGC_VERSION} && \
        rm ${NGC_FILENAME}
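
    You can confirm that the assets extracted to the expected location (the path follows from the variables above):

    ls ${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_segment_anything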
    

Build isaac_ros_segment_anything

  1. Launch the Docker container using the run_dev.sh script:

    cd ${ISAAC_ROS_WS}/src/isaac_ros_common && \
    ./scripts/run_dev.sh
    
  2. Install the prebuilt Debian package:

    sudo apt-get install -y ros-humble-isaac-ros-segment-anything
    

Prepare Segment Anything ONNX Model

There are two Segment Anything models to select from in the steps below: SAM and Mobile SAM. SAM provides the full accuracy of Segment Anything but can consume over 8 GB of GPU memory and requires more computation for inference. Mobile SAM has been tuned to operate with a much smaller memory footprint and less computation overhead, at the cost of some degradation in quality. Choose the model that is most effective for your use case.
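
If you are unsure whether your GPU has enough free memory for the model you choose, you can check before starting:

    nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv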

  1. Make a directory to place models (inside the Docker container):

    mkdir -p ${ISAAC_ROS_WS}/isaac_ros_assets/models/segment_anything/1/
    
  2. Create the ONNX file for SAM or for Mobile SAM.

Note

SAM is only supported with Triton using the ONNX backend. Perform the following step on x86_64 only. If you want to run the package on a Jetson device, copy the generated ONNX file to the Jetson.

Note

It is recommended to run SAM with at least 12 GB of GPU memory. Mobile SAM is a smaller version of SAM that can be used if GPU VRAM is not sufficient to run SAM.

Steps to use SAM

To create the ONNX file for SAM, download the PyTorch weights from the Segment Anything official repo, then make the weights available inside the container as described below. The model is expected to be downloaded to ~/Downloads outside the Docker container.

Move this file to ${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_segment_anything/ using a console outside of the container, then continue inside the container.

For example, if the model vit_b.pth was downloaded to ~/Downloads:

On x86_64:

# Outside container
mv ~/Downloads/vit_b.pth ${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_segment_anything/vit_b.pth
# Inside container
cd ${ISAAC_ROS_WS}
pip install git+https://github.com/facebookresearch/segment-anything.git
ros2 run isaac_ros_segment_anything torch_to_onnx.py \
    --checkpoint ${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_segment_anything/vit_b.pth \
    --output ${ISAAC_ROS_WS}/isaac_ros_assets/models/segment_anything/1/model.onnx \
    --model-type vit_b --sam-type SAM
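
Optionally, sanity-check the exported model before wiring it into Triton. This is a quick sketch that assumes the onnx Python package is available inside the container (install it with pip install onnx if it is not):

python3 - <<'EOF'
import os
import onnx

# Path chosen in the export step above
path = os.path.join(os.environ["ISAAC_ROS_WS"],
                    "isaac_ros_assets/models/segment_anything/1/model.onnx")
model = onnx.load(path)
onnx.checker.check_model(model)  # raises if the graph is malformed
print("Inputs:", [i.name for i in model.graph.input])
EOF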
  1. Copy the config file:

    cp ${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_segment_anything/sam_config_onnx.pbtxt ${ISAAC_ROS_WS}/isaac_ros_assets/models/segment_anything/config.pbtxt
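
    You can confirm the resulting Triton model repository layout (the paths follow from the steps above):

    find ${ISAAC_ROS_WS}/isaac_ros_assets/models/segment_anything
    # Expected:
    #   .../models/segment_anything/config.pbtxt
    #   .../models/segment_anything/1/model.onnx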
    

Run Launch File

Segment Anything requires a prompt to indicate what in the image we want to segment. In this example, we use YOLOv8 object detection to determine the image-space bounding box of an object of interest, which serves as the prompt for Segment Anything to create the image segmentation mask.

  1. Continuing inside the Docker container, install the following dependencies:

    sudo apt-get install -y ros-humble-isaac-ros-examples
    
  2. Run the following launch file to spin up a demo of this package:

    cd ${ISAAC_ROS_WS} && \
       ros2 launch isaac_ros_examples isaac_ros_examples.launch.py launch_fragments:=segment_anything interface_specs_file:=${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_segment_anything/quickstart_interface_specs.json sam_model_repository_paths:=[${ISAAC_ROS_WS}/isaac_ros_assets/models]
    
  3. Then open another terminal, and enter the Docker container again:

    cd ${ISAAC_ROS_WS}/src/isaac_ros_common && \
       ./scripts/run_dev.sh
    
  4. Then, play the ROS bag:

    ros2 bag play -l ${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_segment_anything/segment_anything_sample_data/
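
    If playback fails, you can verify that the bag downloaded intact:

    ros2 bag info ${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_segment_anything/segment_anything_sample_data/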
    

Visualize Results

  1. Open a new terminal inside the Docker container:

    cd ${ISAAC_ROS_WS}/src/isaac_ros_common && \
       ./scripts/run_dev.sh
    
  2. Run the Python script to generate a colored segmentation mask from the raw mask:

    ros2 run isaac_ros_segment_anything visualize_mask.py
    
  3. Visualize and validate the output of the package by launching rqt_image_view. In another terminal, enter the Docker container:

    cd ${ISAAC_ROS_WS}/src/isaac_ros_common && \
       ./scripts/run_dev.sh
    

    Then launch rqt_image_view:

    ros2 run rqt_image_view rqt_image_view
    

    Inside the rqt_image_view GUI, change the topic to /segment_anything/colored_segmentation_mask to view a colorized segmentation mask.

    [Image: colorized segmentation mask shown in rqt_image_view: https://media.githubusercontent.com/media/NVIDIA-ISAAC-ROS/.github/main/resources/isaac_ros_docs/repositories_and_packages/isaac_ros_image_segmentation/isaac_ros_segment_anything/sam_output_rqt.png]

    Note

    The raw segmentation mask is also published to /segment_anything/raw_segmentation_mask. However, the raw pixels correspond to the class labels and so the output is unsuitable for human visual inspection.
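
    You can still confirm that masks are being produced on the raw topic without a visualizer by hiding the large data arrays:

    ros2 topic echo /segment_anything/raw_segmentation_mask --no-arr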

Note

Segment Anything is designed to perform image segmentation on previously unseen objects without model retraining. As a result, inference requires significant GPU compute for this task. Refer to the performance of the model for your target platform to determine which model variant to use.

More powerful discrete GPUs can outperform all other platforms for this task and should be preferred if higher performance is required. Interleaving image segmentation with other tasks, rather than running continuously, can also be a more effective solution. Finally, if runtime performance is critical and offline training resources are available, developers can train U-Net for their own target objects using synthetic data generation and/or real-world data for faster image segmentation.

Try More Examples

To continue your exploration, check out the suggested examples in the Isaac ROS documentation.

Troubleshooting

Isaac ROS Troubleshooting

For solutions to problems with Isaac ROS, see here.

Deep Learning Troubleshooting

For solutions to problems with using DNN models, see here.

API

Usage

A single launch file is provided for this package. The launch file launches isaac_ros_triton with the ONNX backend. Only the ONNX backend with Triton is supported for this model.

Warning

For your specific application, these launch files may need to be modified. Please consult the available components to see the configurable parameters.
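
For reference, a direct invocation might look like the sketch below. The launch argument name model_repository_paths is an assumption (mirroring the Triton model repository path used in the quickstart), so consult the launch file for the exact arguments it exposes:

    ros2 launch isaac_ros_segment_anything isaac_ros_segment_anything_triton.launch.py \
        model_repository_paths:=[${ISAAC_ROS_WS}/isaac_ros_assets/models]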

Launch File: isaac_ros_segment_anything_triton.launch.py

Components Used: ResizeNode, PadNode, ImageFormatConverterNode, ImageToTensorNode, ImageTensorNormalizeNode, InterleavedToPlanarNode, ReshapeNode, TritonNode, SegmentAnythingDecoderNode, SegmentAnythingDataEncoderNode, DummyMaskPublisher

SegmentAnythingDecoderNode

ROS Parameters

mask_width (int16_t, default: 960)
    The width of the segmentation mask.

mask_height (int16_t, default: 544)
    The height of the segmentation mask.

max_batch_size (int16_t, default: 20)
    Maximum number of prompt inputs for each frame.

Warning

  • If a frame does not have any input prompt, no mask is generated for that frame.

  • If the number of input prompts for a frame exceeds max_batch_size, only the first max_batch_size prompts are considered.

ROS Topics Subscribed

tensor_sub (isaac_ros_tensor_list_interfaces/TensorList)
    List of output tensors from the SAM model.

Warning

All input images are required to have height and width that are both an even number of pixels.

ROS Topics Published

/segment_anything/raw_segmentation_mask (isaac_ros_tensor_list_interfaces/TensorList)
    The raw segmentation mask: a tensor with shape [batch_size, 1, orig_img_height, orig_img_width], where batch_size is the number of prompt inputs for that frame.

SegmentAnythingDataEncoderNode

ROS Parameters

prompt_input_type (string, default: bbox)
    Type of prompt input. Supported types: bbox and point.

has_input_mask (bool, default: false)
    Whether there is any input mask for SAM.

max_batch_size (int16_t, default: 20)
    Maximum number of prompt inputs for each frame.

orig_img_dims (double list, default: [632, 1200])
    Dimensions of the original image in [H, W] format. Each bbox/point is expected to be in the coordinate system of an image with these dimensions.

ROS Topics Subscribed

/prompts (vision_msgs/Detection2DArray)
    Input Detection2DArray. If the prompt type is point, the center of each bbox is used as the point of interest.

/tensor_pub (isaac_ros_tensor_list_interfaces/TensorList)
    The preprocessed and encoded image tensor.

/mask (isaac_ros_tensor_list_interfaces/TensorList)
    Input mask for SAM.
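
For testing without a detector, you can publish a bounding-box prompt by hand. This is a minimal sketch: the topic may be namespaced differently by your launch file, the pixel coordinates are arbitrary (they must be in the orig_img_dims coordinate system), and the field layout assumes the Humble-era vision_msgs where BoundingBox2D.center contains a position:

    ros2 topic pub --once /prompts vision_msgs/msg/Detection2DArray \
      "{header: {frame_id: 'camera'},
        detections: [{bbox: {center: {position: {x: 480.0, y: 270.0}},
                             size_x: 200.0, size_y: 150.0}}]}"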

ROS Topics Published

/tensor (isaac_ros_tensor_list_interfaces/TensorList)
    Tensor list that has all the required tensors for SAM inference.

DummyMaskPublisher

This node can be used to publish a dummy mask to SAM when there is no mask input to use.

ROS Parameters

tensor_name (string, default: input_mask)
    Name of the mask tensor.

ROS Topics Subscribed

/tensor_pub (isaac_ros_tensor_list_interfaces/TensorList)
    The data from this topic is used to timestamp the dummy mask.

ROS Topics Published

/mask (isaac_ros_tensor_list_interfaces/TensorList)
    The dummy mask is published on this topic, to be ingested by SegmentAnythingDataEncoderNode.