NVIDIA-ISAAC-ROS/isaac_ros_dnn_stereo_depth

Isaac ROS DNN Stereo Depth

NVIDIA-accelerated, deep learned stereo disparity estimation

Webinar Available

Learn how to use this package by watching our on-demand webinar: Using ML Models in ROS 2 to Robustly Estimate Distance to Obstacles


Overview

Vision-based depth perception is useful in many areas of robotics: estimating the pose of a robotic arm during an object manipulation task, estimating the distance to static or moving targets in autonomous robot navigation, tracking targets for delivery robots, and so on. Isaac ROS DNN Stereo Depth targets two Isaac applications, Isaac Manipulator and Isaac Perceptor. In Isaac Manipulator, ESS, the DNN stereo disparity model used by this package, is deployed in the Isaac ROS cuMotion package as a plug-in node that provides depth maps for robot arm motion planning and control. In this scenario, stereo streams from multiple cameras observing industrial robot arms performing a tabletop task are passed to ESS to obtain the corresponding depth streams. The depth streams are used to estimate the relative distance between the robot arms and the corresponding objects on the table, providing signals for collision avoidance and fine-grained control. Similarly, Isaac Perceptor uses several Isaac ROS packages: Isaac ROS Nova, Isaac ROS Visual SLAM, Isaac ROS Stereo Depth (ESS), Isaac ROS Nvblox, and Isaac ROS Image Pipeline.

ESS is deployed in Isaac Perceptor to enable Nvblox to create a 3D voxelized representation of the robot's surroundings. Specifically, the Nova developer suite provides three stereo camera streams to Isaac Perceptor, one each from the front, left, and right cameras. In both Isaac Manipulator and Isaac Perceptor, a camera-specific image processing pipeline of GPU-accelerated operations rectifies and undistorts the input stereo images. The image pairs of every stereo stream are time-synchronized before being passed to ESS. The ESS node outputs a depth map for each of the three preprocessed image streams, and these depth images are combined with motion signals provided by the cuVSLAM module. The combined depth and motion signals are fed to the Nvblox module to produce a dense 3D volumetric reconstruction of the surrounding scene.
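To illustrate the time-synchronization step described above, here is a minimal sketch that pairs left and right frames by capture timestamp. It is not the Isaac ROS implementation: the `Frame` type, the `pair_stereo_frames` function, and the tolerance value are hypothetical names chosen for illustration, and the matching is a simple greedy pass over two timestamp-sorted lists.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Frame:
    """Hypothetical stand-in for a camera image message."""
    stamp_ns: int  # capture timestamp in nanoseconds
    name: str      # placeholder for the image payload


def pair_stereo_frames(
    left: List[Frame], right: List[Frame], tolerance_ns: int
) -> List[Tuple[Frame, Frame]]:
    """Greedily pair frames whose timestamps differ by at most tolerance_ns.

    Both input lists must be sorted by stamp_ns. Frames with no partner
    inside the tolerance window are dropped.
    """
    pairs = []
    i = j = 0
    while i < len(left) and j < len(right):
        delta = left[i].stamp_ns - right[j].stamp_ns
        if abs(delta) <= tolerance_ns:
            pairs.append((left[i], right[j]))
            i += 1
            j += 1
        elif delta < 0:
            i += 1  # left frame is older than any remaining right frame
        else:
            j += 1  # right frame is older than any remaining left frame
    return pairs


# Example with an assumed 5 ms tolerance: the middle frames are 7 ms apart
# and therefore stay unpaired.
left = [Frame(0, "l0"), Frame(33_000_000, "l1"), Frame(66_000_000, "l2")]
right = [Frame(1_000_000, "r0"), Frame(40_000_000, "r1"), Frame(67_000_000, "r2")]
synced = pair_stereo_frames(left, right, tolerance_ns=5_000_000)
```

Dropping unmatched frames rather than pairing them with the nearest neighbor is a design choice made here for simplicity: a stereo pair captured at different instants would produce inconsistent disparity for moving scenes.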

[Image: node graph with rectify, resize, and ESS nodes]

In the graph above, the ESS node predicts disparity from a pair of left and right stereo images. The rectify and resize nodes pre-process the left and right frames to the appropriate resolution. Maintaining the aspect ratio of the images when resizing is recommended, to avoid degrading the quality of the depth output. The DNN encode, DNN inference, and DNN decode stages are all included in the ESS node. Inference is performed with TensorRT, as the ESS DNN model is designed with optimizations that TensorRT supports.
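The interaction between resizing and disparity can be sketched as follows. This example computes an aspect-ratio-preserving scale factor and then converts disparity to depth with the standard stereo relation Z = f * B / d, using the focal length scaled by the same factor as the image. All numbers are assumptions for illustration: the 1920x1200 source size, the 960x576 target size, the focal length, and the baseline are hypothetical, and the function names are not part of this package.

```python
from typing import List, Tuple


def fit_preserving_aspect(
    src_w: int, src_h: int, dst_w: int, dst_h: int
) -> Tuple[float, int, int]:
    """Return (scale, new_w, new_h) so the source fits inside the target
    without changing its aspect ratio."""
    scale = min(dst_w / src_w, dst_h / src_h)
    return scale, round(src_w * scale), round(src_h * scale)


def disparity_to_depth(
    disparity_px: List[List[float]], focal_px: float, baseline_m: float
) -> List[List[float]]:
    """Convert disparities in pixels to depths in meters via Z = f * B / d.

    Non-positive disparities carry no depth information and map to infinity.
    """
    return [
        [focal_px * baseline_m / d if d > 0 else float("inf") for d in row]
        for row in disparity_px
    ]


# Hypothetical camera: 1920x1200 images, 960 px focal length, 0.15 m baseline.
scale, new_w, new_h = fit_preserving_aspect(1920, 1200, 960, 576)

# Disparity is measured in pixels of the resized image, so the focal length
# must be scaled by the same factor before converting to depth.
focal_resized_px = 960.0 * scale
depth_m = disparity_to_depth([[46.08, 0.0]], focal_resized_px, baseline_m=0.15)
```

Using a single scale factor for both axes is what keeps the aspect ratio intact. Scaling width and height independently would distort the image geometry that the disparity estimate relies on.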

Isaac ROS NITROS Acceleration

This package is powered by NVIDIA Isaac Transport for ROS (NITROS), which leverages type adaptation and negotiation to optimize message formats and dramatically accelerate communication between participating nodes.

Performance

| Sample Graph | Input Size | AGX Orin | Orin NX | x86_64 w/ RTX 4090 |
| --- | --- | --- | --- | --- |
| DNN Stereo Disparity Node (Full) | 576p | 103 fps, 12 ms @ 30Hz | 42.1 fps, 26 ms @ 30Hz | 350 fps, 2.3 ms @ 30Hz |
| DNN Stereo Disparity Node (Light) | 288p | 306 fps, 5.6 ms @ 30Hz | 143 fps, 9.4 ms @ 30Hz | 350 fps, 1.6 ms @ 30Hz |
| DNN Stereo Disparity Graph (Full) | 576p | 33.5 fps, 25 ms @ 30Hz | 35.2 fps, 34 ms @ 30Hz | 350 fps, 5.6 ms @ 30Hz |
| DNN Stereo Disparity Graph (Light) | 288p | 179 fps, 14 ms @ 30Hz | 126 fps, 15 ms @ 30Hz | 350 fps, 4.4 ms @ 30Hz |
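One way to read these figures is to compare them against the frame period of the camera. The sketch below assumes that "ms @ 30Hz" means latency measured with a 30 Hz input stream. The helper names are hypothetical, and the calculation is plain arithmetic rather than part of any benchmarking tool.

```python
import math


def frame_period_ms(rate_hz: float) -> float:
    """Time between consecutive input frames, in milliseconds."""
    return 1000.0 / rate_hz


def throughput_headroom(throughput_fps: float, rate_hz: float) -> float:
    """Ratio of measured throughput to the input rate (>1 means it keeps up)."""
    return throughput_fps / rate_hz


def frames_in_flight(latency_ms: float, rate_hz: float) -> int:
    """Number of frames that overlap in the pipeline at the given input rate."""
    return math.ceil(latency_ms / frame_period_ms(rate_hz))


period = frame_period_ms(30.0)                  # about 33.3 ms per frame
headroom = throughput_headroom(103.0, 30.0)     # Full node on AGX Orin
overlap_fast = frames_in_flight(12.0, 30.0)     # latency below one period
overlap_slow = frames_in_flight(34.0, 30.0)     # latency above one period
```

A latency above the frame period does not by itself mean frames are dropped. It means that a new frame arrives before the previous one has finished, so the pipeline must process frames concurrently to sustain the input rate.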


Documentation

Please visit the Isaac ROS Documentation to learn how to use this repository.


Packages

Latest

Update 2024-09-26: Updated for ESS 4.1 trained on additional samples