cuVSLAM

All SLAM-related operations run in parallel with visual odometry in a separate thread. Input images are copied to GPU memory, and then cuVSLAM starts tracking.

cuVSLAM uses the following:

  • 2D features on the input images

  • Landmarks: Patches of pixels from the source images, stored together with the coordinates of the feature and the pose of the camera from which that feature is visible.

  • PoseGraph: A graph that tracks the poses of the camera from which the landmarks are viewed.

Landmarks and the PoseGraph are stored in a data structure whose size does not grow when the same landmark is visited more than once.
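
This bookkeeping can be pictured with a minimal sketch (illustrative only, not cuVSLAM's internal implementation); keying the landmarks by ID is what keeps the structure from growing when the same landmark is revisited:

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Landmark:
    """A tracked feature: its image patch, 3D position, and observing camera pose."""
    landmark_id: int
    patch: np.ndarray          # small pixel patch around the 2D feature
    position: np.ndarray       # 3D coordinates of the feature
    camera_pose: np.ndarray    # 4x4 pose of the camera that observed it

@dataclass
class PoseGraph:
    """Camera poses (nodes) connected by relative-motion constraints (edges)."""
    nodes: dict[int, np.ndarray] = field(default_factory=dict)              # pose_id -> 4x4 pose
    edges: list[tuple[int, int, np.ndarray]] = field(default_factory=list)  # (i, j, relative pose)

class LandmarkMap:
    """Landmarks keyed by ID: revisiting a known landmark does not grow the map."""
    def __init__(self):
        self.landmarks: dict[int, Landmark] = {}

    def add_observation(self, lm: Landmark) -> None:
        # Re-observing an existing landmark only updates its entry; no new entry is created.
        self.landmarks[lm.landmark_id] = lm
```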

Imagine a robot moving around and returning to the same place. Because odometry always accumulates some drift, the estimated trajectory deviates from the ground truth. SLAM can be used to solve this problem.

During the robot’s motion, SLAM stores all landmarks and continuously checks whether each observed landmark has been seen before. When cuVSLAM recognizes several previously stored landmarks, it retrieves their positions from the data structure.

At the moment of recognition, a connection is added to the PoseGraph that closes a loop in the previously open chain of poses. This event is called a ‘loop closure’. Immediately afterwards, cuVSLAM performs a graph optimization that corrects the current odometric pose and all previous poses in the graph.
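
The effect can be illustrated with a toy example (a hypothetical sketch, not cuVSLAM's optimizer): once a loop-closure edge ties the current pose back to an earlier one, the accumulated drift is redistributed over the chain of poses.

```python
import numpy as np

# Odometry-only 2D positions of a robot driving a loop and returning to its start.
# Accumulated drift makes the last pose land at (0.4, 0.3) instead of (0, 0).
poses = np.array([[0.0, 0.0], [1.0, 0.0], [1.1, 1.0], [0.2, 1.1], [0.4, 0.3]])

# Loop closure: the last pose is recognized as the same place as the first one.
drift = poses[-1] - poses[0]

# A (very) simplified "graph optimization": spread the correction linearly
# along the trajectory so earlier poses move less than later ones.
weights = np.linspace(0.0, 1.0, len(poses)).reshape(-1, 1)
corrected = poses - weights * drift

print(corrected[-1])   # ~[0, 0]: the loop is closed and all previous poses are adjusted
```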

The procedure for adding landmarks is designed such that if a landmark is not found in its expected location, it is marked for eventual deletion. This allows you to use cuVSLAM over changing terrain.

Along with visual data, cuVSLAM can use Inertial Measurement Unit (IMU) measurements. It automatically switches to the IMU when visual odometry is unable to estimate a pose, for example, under poor lighting or when long solid surfaces fill the camera’s view. Using an IMU typically leads to a significant improvement in pose tracking under poor visual conditions.

For severe degradation of the image input (such as the lights being turned off, or dramatic motion blur from a bump while driving), the following motion estimation algorithms can maintain acceptable pose tracking quality:

  • The IMU readings integrator provides acceptable pose tracking quality for up to approximately 1 second.

  • If the IMU also fails, the constant velocity integrator continues to apply the last linear and angular velocities reported by stereo VIO before the failure. This provides acceptable pose tracking quality for approximately 0.5 seconds (see the sketch after this list).
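
As an illustration of the constant velocity fallback (a hypothetical sketch, not the cuVSLAM implementation), the last reported velocities can simply be integrated forward for the short outage:

```python
import numpy as np

def extrapolate_pose(position, yaw, linear_vel, angular_vel, dt):
    """Constant-velocity extrapolation of a planar pose during a short sensing outage."""
    yaw_new = yaw + angular_vel * dt
    # Move along the current heading with the last known linear velocity.
    position_new = position + linear_vel * dt * np.array([np.cos(yaw), np.sin(yaw)])
    return position_new, yaw_new

# Last velocities reported by stereo VIO before the failure (hypothetical values).
pos, yaw = np.array([2.0, 1.0]), 0.1
pos, yaw = extrapolate_pose(pos, yaw, linear_vel=0.5, angular_vel=0.05, dt=0.1)
```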

List of Useful Visualizations

  • visual_slam/vis/observations_cloud - Point cloud for 2D Features

  • visual_slam/vis/landmarks_cloud - All mapped landmarks

  • visual_slam/vis/loop_closure_cloud - Landmarks visible in last loop closure

  • visual_slam/vis/pose_graph_nodes - Pose graph nodes

  • visual_slam/vis/pose_graph_edges - Pose graph edges
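
For example, a minimal rclpy subscriber for one of the topics above could look like the following; the message type (sensor_msgs/PointCloud2) and absolute topic name are assumptions, so verify them with ros2 topic info:

```python
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import PointCloud2  # assumed message type; check with `ros2 topic info`

class LandmarksListener(Node):
    def __init__(self):
        super().__init__('landmarks_listener')
        self.subscription = self.create_subscription(
            PointCloud2,
            '/visual_slam/vis/landmarks_cloud',  # assumed absolute topic name
            self.callback,
            10)

    def callback(self, msg):
        # Report how many mapped landmarks are in the latest cloud.
        self.get_logger().info(f'Received {msg.width * msg.height} landmarks')

def main():
    rclpy.init()
    rclpy.spin(LandmarksListener())

if __name__ == '__main__':
    main()
```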

Saving the Map

To save the stored landmarks and pose graph as a map, use the ROS 2 SaveMap action, which writes the map to disk.
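
A minimal sketch of calling the action from Python follows; the action name and goal field are assumptions, so check the SaveMap definition shipped with isaac_ros_visual_slam_interfaces for your release:

```python
import rclpy
from rclpy.action import ActionClient
from rclpy.node import Node
# Package and field names below are assumptions; verify them against the
# SaveMap action definition in isaac_ros_visual_slam_interfaces.
from isaac_ros_visual_slam_interfaces.action import SaveMap

class SaveMapClient(Node):
    def __init__(self):
        super().__init__('save_map_client')
        self._client = ActionClient(self, SaveMap, 'visual_slam/save_map')  # assumed action name

    def send_goal(self, map_path: str):
        goal = SaveMap.Goal()
        goal.map_url = map_path          # assumed goal field name
        self._client.wait_for_server()
        return self._client.send_goal_async(goal)

def main():
    rclpy.init()
    node = SaveMapClient()
    future = node.send_goal('/tmp/cuvslam_map')
    rclpy.spin_until_future_complete(node, future)

if __name__ == '__main__':
    main()
```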

Loading and Localization in the Map

After the map has been saved to disk, it can be used to localize the robot. To load the map into memory, use the ROS 2 LoadMapAndLocalize action. It requires a map file path and a prior pose, which is an initial guess of where the robot is in the map. Given the prior pose and the current set of camera frames, cuVSLAM tries to match the landmarks it currently observes against the landmarks stored in the requested map. If localization is successful, cuVSLAM loads the map into memory. Otherwise, it continues building a new map.

Both SaveMap and LoadMapAndLocalize can take some time to complete. Hence, they are designed to be asynchronous to avoid interfering with odometry calculations.
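
A sketch of calling LoadMapAndLocalize asynchronously is shown below; the action name and goal field names are assumptions, so verify them against the action definition in isaac_ros_visual_slam_interfaces:

```python
import rclpy
from rclpy.action import ActionClient
from rclpy.node import Node
from geometry_msgs.msg import Point
# Action name and goal fields are assumptions; check the LoadMapAndLocalize
# definition in isaac_ros_visual_slam_interfaces before use.
from isaac_ros_visual_slam_interfaces.action import LoadMapAndLocalize

def main():
    rclpy.init()
    node = Node('load_map_client')
    client = ActionClient(node, LoadMapAndLocalize, 'visual_slam/load_map_and_localize')

    goal = LoadMapAndLocalize.Goal()
    goal.map_url = '/tmp/cuvslam_map'                      # assumed field name
    goal.localize_near_point = Point(x=1.0, y=0.5, z=0.0)  # prior pose guess (assumed field name)

    client.wait_for_server()
    future = client.send_goal_async(goal)   # asynchronous: odometry keeps running meanwhile
    rclpy.spin_until_future_complete(node, future)

if __name__ == '__main__':
    main()
```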

Coordinate Frames

This section describes the coordinate frames involved in the VisualSlamNode (a minimal parameter sketch follows the list):

  1. input_base_frame: The name of the frame used to calculate the transformation between the base link and the left camera. The default value is empty (‘’), which means the value of base_frame will be used. If input_base_frame and base_frame are both empty, the left camera is assumed to be in the robot’s center.

  2. input_left_camera_frame: The frame associated with the left eye of the stereo camera. Note that this is not the same as the optical frame. The default value is empty (‘’), which means the left camera is in the robot’s center and left_pose_right will be calculated from the CameraInfo message.

  3. input_right_camera_frame: The frame associated with the right eye of the stereo camera. Note that this is not the same as the optical frame. The default value is empty (‘’), which means left and right cameras have identity rotation and are horizontally aligned, so left_pose_right will be calculated from CameraInfo.

  4. input_imu_frame: The frame associated with the IMU sensor (if available). It is used to calculate left_pose_imu.
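
A minimal launch sketch showing how these frame parameters might be set is shown below; the package, plugin, and frame names are assumptions based on typical Isaac ROS usage and should be adapted to your setup:

```python
from launch import LaunchDescription
from launch_ros.actions import ComposableNodeContainer
from launch_ros.descriptions import ComposableNode

def generate_launch_description():
    # Package and plugin names are assumptions; verify against your Isaac ROS release.
    visual_slam_node = ComposableNode(
        package='isaac_ros_visual_slam',
        plugin='nvidia::isaac_ros::visual_slam::VisualSlamNode',
        parameters=[{
            # Frame names below are example values for a hypothetical robot.
            'input_base_frame': 'base_link',
            'input_left_camera_frame': 'camera_left',
            'input_right_camera_frame': 'camera_right',
            'input_imu_frame': 'imu',
        }])

    container = ComposableNodeContainer(
        name='visual_slam_container',
        namespace='',
        package='rclcpp_components',
        executable='component_container',
        composable_node_descriptions=[visual_slam_node])

    return LaunchDescription([container])
```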

Repositories and Packages

The Isaac ROS implementations of this technology are available here: