Performance benchmarking for NVIDIA-accelerated Isaac ROS packages.
Isaac ROS Benchmark builds upon ros2_benchmark to provide configurations for benchmarking Isaac ROS graphs. Performance results that measure the throughput, latency, and utilization of Isaac ROS enable robotics developers to make informed decisions when designing real-time robotics applications. The Isaac ROS performance results can be independently verified, because the method, configuration, and input data used for benchmarking are provided.
A ros2_benchmark playback node plug-in for type adaptation and negotiation is provided for NITROS, which reduces message transport costs through RCL for GPU-accelerated graphs of nodes.
The benchmark datasets are not downloaded by default. To download the standardized benchmark datasets, refer to the ros2_benchmark Dataset section.
Please visit the Isaac ROS Documentation to learn how to use this repository.
Update 2024-09-26: Updated for Isaac ROS 3.1
| Node | Input Size | AGX Orin | Orin NX | Orin Nano 8GB | x86_64 w/ RTX 4060 Ti | x86_64 w/ RTX 4090 |
|---|---|---|---|---|---|---|
| AprilTag Node | 720p | 244 fps 7.3 ms @ 30Hz | 114 fps 12 ms @ 30Hz | 79.2 fps 18 ms @ 30Hz | 596 fps 2.4 ms @ 30Hz | 596 fps 2.1 ms @ 30Hz |
| Freespace Segmentation Node | 576p | 2190 fps 1.4 ms @ 30Hz | 1850 fps 1.8 ms @ 30Hz | 1220 fps 2.6 ms @ 30Hz | 3500 fps 0.50 ms @ 30Hz | 3500 fps 0.48 ms @ 30Hz |
| Depth Segmentation Node | 576p | 45.9 fps 76 ms @ 30Hz | 28.8 fps 92 ms @ 30Hz | – | 87.9 fps 35 ms @ 30Hz | 104 fps 35 ms @ 30Hz |
| FoundationPose Pose Estimation Node | 720p | 1.72 fps 690 ms @ 30Hz | – | – | – | 7.02 fps 170 ms @ 30Hz |
| DNN Stereo Disparity Node Full | 576p | 96.5 fps 13 ms @ 30Hz | 41.2 fps 27 ms @ 30Hz | – | 224 fps 5.5 ms @ 30Hz | 350 fps 2.4 ms @ 30Hz |
| DNN Stereo Disparity Node Light | 288p | 276 fps 5.9 ms @ 30Hz | 134 fps 10 ms @ 30Hz | – | 350 fps 2.4 ms @ 30Hz | 350 fps 1.7 ms @ 30Hz |
| Stereo Disparity Node | 1080p | 168 fps 7.5 ms @ 30Hz | 75.4 fps 15 ms @ 30Hz | 51.5 fps 22 ms @ 30Hz | 350 fps 3.4 ms @ 30Hz | 814 fps 1.8 ms @ 30Hz |
| Rectify Node | 1080p | 983 fps 2.5 ms @ 30Hz | 569 fps 3.5 ms @ 30Hz | 394 fps 5.2 ms @ 30Hz | 2500 fps 0.88 ms @ 30Hz | 2500 fps 0.66 ms @ 30Hz |
| TensorRT Node DOPE | VGA | 48.1 fps 24 ms @ 30Hz | 17.9 fps 56 ms @ 30Hz | 13.1 fps 82 ms @ 30Hz | 98.3 fps 13 ms @ 30Hz | 296 fps 5.1 ms @ 30Hz |
| Triton Node DOPE | VGA | 47.2 fps 23 ms @ 30Hz | 20.4 fps 540 ms @ 30Hz | 14.4 fps 790 ms @ 30Hz | 94.2 fps 12 ms @ 30Hz | 254 fps 4.6 ms @ 30Hz |
| TensorRT Node PeopleSemSegNet | 544p | 460 fps 4.1 ms @ 30Hz | 348 fps 6.1 ms @ 30Hz | 238 fps 7.0 ms @ 30Hz | 685 fps 2.9 ms @ 30Hz | 675 fps 3.0 ms @ 30Hz |
| Triton Node PeopleSemSegNet | 544p | 304 fps 4.8 ms @ 30Hz | 206 fps 6.5 ms @ 30Hz | – | 677 fps 2.2 ms @ 30Hz | 619 fps 1.9 ms @ 30Hz |
| DNN Image Encoder Node | VGA | 522 fps 12 ms @ 30Hz | 330 fps 12 ms @ 30Hz | – | 811 fps 6.6 ms @ 30Hz | 822 fps 6.4 ms @ 30Hz |
| Occupancy Grid Localizer Node | ~50 sq. m | 19.5 fps 57 ms @ 30Hz | 8.34 fps 130 ms @ 30Hz | 5.75 fps 190 ms @ 30Hz | 50.1 fps 21 ms @ 30Hz | 50.1 fps 12 ms @ 30Hz |
| H.264 Decoder Node | 1080p | 198 fps 8.1 ms @ 30Hz | – | – | 596 fps 3.8 ms @ 30Hz | 596 fps 4.3 ms @ 30Hz |
| H.264 Encoder Node I-frame Support | 1080p | 406 fps 12 ms @ 30Hz | – | – | 425 fps 3.3 ms @ 30Hz | 409 fps 3.2 ms @ 30Hz |
| H.264 Encoder Node P-frame Support | 1080p | 473 fps 9.1 ms @ 30Hz | – | – | 596 fps 2.3 ms @ 30Hz | 596 fps 2.1 ms @ 30Hz |
| Nvblox Node | – | 4.90 fps 77.1 ms | 4.97 fps 151 ms | 4.92 fps 91.2 ms | 4.97 fps 85.3 ms | 4.94 fps 64.2 ms |

| Graph | Input Size | AGX Orin | Orin NX | Orin Nano 8GB | x86_64 w/ RTX 4060 Ti | x86_64 w/ RTX 4090 |
|---|---|---|---|---|---|---|
| AprilTag Graph | 720p | 241 fps 9.5 ms @ 30Hz | 109 fps 15 ms @ 30Hz | 74.3 fps 21 ms @ 30Hz | 596 fps 3.4 ms @ 30Hz | 596 fps 2.9 ms @ 30Hz |
| Freespace Segmentation Graph | 576p | 42.1 fps 77 ms @ 30Hz | 27.4 fps 99 ms @ 30Hz | 21.4 fps 100 ms @ 30Hz | 88.6 fps 32 ms @ 30Hz | 105 fps 38 ms @ 30Hz |
| Centerpose Pose Estimation Graph | VGA | 36.3 fps 4.8 ms @ 30Hz | 19.7 fps 4.9 ms @ 30Hz | 13.8 fps 7.4 ms @ 30Hz | 50.2 fps 23 ms @ 30Hz | 50.2 fps 20 ms @ 30Hz |
| DOPE Pose Estimation Graph | VGA | 41.3 fps 42 ms @ 30Hz | 17.5 fps 76 ms @ 30Hz | – | 85.2 fps 24 ms @ 30Hz | 199 fps 14 ms @ 30Hz |
| DNN Stereo Disparity Graph Full | 576p | 89.4 fps 5.4 ms @ 30Hz | 36.8 fps 36 ms @ 30Hz | – | 215 fps 3.7 ms @ 30Hz | 350 fps 5.7 ms @ 30Hz |
| DNN Stereo Disparity Graph Light | 288p | 247 fps 5.9 ms @ 30Hz | 122 fps 8.5 ms @ 30Hz | – | 350 fps 6.1 ms @ 30Hz | 350 fps 5.6 ms @ 30Hz |
| Stereo Disparity Graph | 1080p | 157 fps 12 ms @ 30Hz | 72.2 fps 20 ms @ 30Hz | 49.6 fps 28 ms @ 30Hz | 349 fps 2.5 ms @ 30Hz | 791 fps 2.7 ms @ 30Hz |
| DetectNet Object Detection Graph | 544p | 165 fps 20 ms @ 30Hz | 115 fps 26 ms @ 30Hz | 63.2 fps 36 ms @ 30Hz | 488 fps 10 ms @ 30Hz | 589 fps 10 ms @ 30Hz |
| RT-DETR Object Detection Graph SyntheticaDETR | 720p | 71.9 fps 24 ms @ 30Hz | 30.8 fps 41 ms @ 30Hz | 21.3 fps 61 ms @ 30Hz | 205 fps 8.7 ms @ 30Hz | 400 fps 6.3 ms @ 30Hz |
| TensorRT Graph PeopleSemSegNet | 544p | 371 fps 19 ms @ 30Hz | 250 fps 20 ms @ 30Hz | 163 fps 23 ms @ 30Hz | 670 fps 11 ms @ 30Hz | 688 fps 9.3 ms @ 30Hz |
| SAM Image Segmentation Graph Full SAM | 720p | 2.22 fps 470 ms @ 30Hz | – | – | – | 14.6 fps 79 ms @ 30Hz |
| SAM Image Segmentation Graph Mobile SAM | 720p | 10.8 fps 880 ms @ 30Hz | 5.13 fps 1500 ms @ 30Hz | 2.22 fps 360 ms @ 30Hz | 27.0 fps 62 ms @ 30Hz | 60.3 fps 27 ms @ 30Hz |

| Live Graph | Input Size | Nova Carter |
|---|---|---|
| Data Recorder Live Graph 4 Hawk Cameras | 1200p | 30.3 fps (per stream avg) 0 dropped frames (avg) |
| Multicam Visual SLAM Live Graph 4 Hawk Cameras | 1200p | 30.0 fps |
| DNN Stereo Disparity Live Graph 3 Hawk Cameras 1x Full ESS and 2x Throttled Light ESS | 1200p | Full: 30.2 fps Light: 15.2 fps (avg) |
| Perceptor Graph 3 Hawk Cameras | 1200p | Nvblox ESDF: 9.46 fps Nvblox Mesh: 9.77 fps Visual Odometry: 30.1 fps |
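
When comparing platforms, it can help to turn a result cell from the node and graph tables into numbers. The sketch below is illustrative and not part of Isaac ROS Benchmark or ros2_benchmark. It assumes each cell reads as a throughput in fps followed by a latency in ms, optionally qualified by the input rate at which that latency was recorded (for example `@ 30Hz`). The names `Cell`, `parse_cell`, `throughput_headroom`, and `latency_in_frame_periods` are hypothetical helpers introduced only for this example.

```python
import re
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    """One parsed result cell (hypothetical helper type, not a ros2_benchmark type)."""
    fps: float                # throughput in frames per second
    latency_ms: float         # latency in milliseconds
    rate_hz: Optional[float]  # input rate for the latency figure, if the cell states one


# Matches cells such as "244 fps 7.3 ms @ 30Hz" or "4.90 fps 77.1 ms".
_CELL_RE = re.compile(
    r"(?P<fps>[\d.]+)\s*fps\s+(?P<lat>[\d.]+)\s*ms(?:\s*@\s*(?P<hz>[\d.]+)\s*Hz)?"
)


def parse_cell(text: str) -> Optional[Cell]:
    """Parse one table cell; return None for cells without a result (e.g. "–")."""
    match = _CELL_RE.search(text)
    if match is None:
        return None
    hz = match.group("hz")
    return Cell(
        fps=float(match.group("fps")),
        latency_ms=float(match.group("lat")),
        rate_hz=float(hz) if hz else None,
    )


def throughput_headroom(cell: Cell, rate_hz: float = 30.0) -> float:
    """Ratio of measured throughput to a target input rate (values above 1 leave headroom)."""
    return cell.fps / rate_hz


def latency_in_frame_periods(cell: Cell, rate_hz: float = 30.0) -> float:
    """Latency expressed in frame periods at the target rate (one period = 1000 / rate_hz ms)."""
    return cell.latency_ms / (1000.0 / rate_hz)


if __name__ == "__main__":
    # AprilTag Node on AGX Orin, copied from the node table.
    cell = parse_cell("244 fps 7.3 ms @ 30Hz")
    print(cell)
    print(f"headroom at 30 Hz: {throughput_headroom(cell):.2f}x")
    print(f"latency: {latency_in_frame_periods(cell):.2f} frame periods")
```

Both ratios are plain arithmetic on the published figures. Whether a given headroom or latency is acceptable depends on the application and on what else shares the GPU, so treat them as a starting point for comparison rather than a pass/fail criterion.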