Skip to content

Introducing a breakthrough real-time 3D reconstruction method for 360° action cams on small UAVs, enhancing StellaVSLAM with PatchMatch-Stereo for equirectangular video, optimized for low latency, and delivering superior accuracy on consumer-grade laptops with mobile GPUs.

License

Unknown and 2 other licenses found

Licenses found

Unknown
LICENSE
BSD-2-Clause
LICENSE.fork
Unknown
LICENSE.original
Notifications You must be signed in to change notification settings

RoblabWh/stella_vslam_dense

 
 

Repository files navigation

stella_vslam_dense

Innovate your UAV-based USAR missions with our real-time dense 3D reconstruction method tailored for common 360° action cams. Building upon StellaVSLAM and ORB-SLAM, our approach introduces a PatchMatch-Stereo thread for dense correspondences, extending robust long-term localization on equirectangular video input. This GitHub repository hosts our novel massively parallel variant of the PatchMatch-Stereo algorithm, optimized for low latency and supporting the equirectangular camera model. Experience improved accuracy and completeness on consumer-grade laptops with recent mobile GPUs, surpassing traditional offline Multi-View-Stereo solutions in real-time dense 3D reconstruction tasks.

Website: https://roblabwh.github.io/stella_vslam_dense/
Original Paper: PatchMatch-Stereo-Panorama, a fast dense reconstruction from 360° video images

Limitations of Dense Feature

  • Only monocular equirectangular input is supported.
  • Only msgpack format is supported.
  • Visualization is limited to using socket_publisher exclusively.

Dependencies

Installation

git clone --recursive https://github.com/RoblabWh/stella_vslam_dense.git
cd stella_vslam_dense
docker build -t stella_vslam_dense -f Dockerfile.socket . --build-arg NUM_THREADS=$(nproc)

Running

Start VSLAM container

docker run -it --rm --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -p 3001:3001 --name=stella_vslam_dense -v ${HOST_DATA_PATH:?Set path to directory on the host system to keep input and output data.}:/data stella_vslam_dense

Open the viewer in a browser: http://localhost:3001

Run VSLAM in container

./run_video_slam -v /data/orb_vocab.fbow -c "/data/${PATH_TO_CONFIG:?}" -m "/data/${PATH_TO_VIDEO:?}" --mask "/data/${PATH_TO_MASK:?}"

More options are available

./run_video_slam -h

Running our examples

We present two illustrative examples to evaluate our method. The first utilizes a high-definition (HD) equirectangular video and boasts real-time processing capabilities. In contrast, the second example employs a 5.7k video, yielding a point cloud of higher quality. To execute the first example seamlessly, a computer equipped with 16GB of RAM and an NVIDIA graphics card is required. However, for the second example, a more robust system with 32GB of RAM (in addition to an NVIDIA graphics card) is necessary.

HD / Real time example

Prepare dataset in container

mkdir -p /data/example_inspection_flight_drz_keyframes_hd

curl -L "https://github.com/stella-cv/FBoW_orb_vocab/raw/main/orb_vocab.fbow" -o /data/orb_vocab.fbow

curl -u "dF2wQbFNW2zuCpK:fire" "https://w-hs.sciebo.de/public.php/webdav/stella_vslam_dense/example_inspection_flight_drz_hd.mp4" -o /data/example_inspection_flight_drz_hd.mp4

Run VSLAM in container

./run_video_slam -v /data/orb_vocab.fbow -c /stella_vslam/example/dense/dense_hd.yaml -m /data/example_inspection_flight_drz_hd.mp4 --frame-skip 3 -o /data/example_inspection_flight_drz_hd.db -p /data/example_inspection_flight_drz_hd.ply -k /data/example_inspection_flight_drz_keyframes_hd/

High quality example

Prepare dataset in container

mkdir -p /data/example_inspection_flight_drz_keyframes

curl -L "https://github.com/stella-cv/FBoW_orb_vocab/raw/main/orb_vocab.fbow" -o /data/orb_vocab.fbow

curl -u "dF2wQbFNW2zuCpK:fire" "https://w-hs.sciebo.de/public.php/webdav/stella_vslam_dense/example_inspection_flight_drz.mp4" -o /data/example_inspection_flight_drz.mp4

Run VSLAM in container

./run_video_slam -v /data/orb_vocab.fbow -c /stella_vslam/example/dense/dense.yaml -m /data/example_inspection_flight_drz.mp4 --frame-skip 3 -o /data/example_inspection_flight_drz.db -p /data/example_inspection_flight_drz.ply -k /data/example_inspection_flight_drz_keyframes/

NeRF example

Prepare dataset in container

curl -L "https://github.com/stella-cv/FBoW_orb_vocab/raw/main/orb_vocab.fbow" -o /data/orb_vocab.fbow

curl -u "dF2wQbFNW2zuCpK:fire" "https://w-hs.sciebo.de/public.php/webdav/stella_vslam_dense/example_inspection_flight_drz.mp4" -o /data/example_inspection_flight_drz.mp4

Run VSLAM and export script in container

./run_video_slam -v /data/orb_vocab.fbow -c /stella_vslam/example/nerf/nerf.yaml -m /data/example_inspection_flight_drz.mp4 --frame-skip 10 -o /data/example_inspection_flight_drz_nerf.db --no-sleep --auto-term --viewer none

/stella_vslam/scripts/export_sqlite3_to_nerf.py /data/example_inspection_flight_drz_nerf.db /data/example_inspection_flight_drz_nerf/

Run nerfstudio on the host or in another container

Additional export scripts

Exporting the project from sqlite3 for use with nerfstudio

/stella_vslam/scripts/export_sqlite3_to_nerf.py ${PATH_TO_DB:?} ${PATH_TO_OUTPUT:?}

Exporting the project from msgpack for use with nerfstudio

/stella_vslam/scripts/export_msgpack_to_nerf.py ${PATH_TO_MSG:?} ${PATH_TO_OUTPUT:?}

Exporting point cloud from msgpack to ply

/stella_vslam/scripts/export_dense_msg_to_ply.py -i ${PATH_TO_MSG:?} -o ${PATH_TO_PLY:?}

Citation of original PatchMatch integration for OpenVSLAM

@INPROCEEDINGS{surmann2022,
  author={Surmann, Hartmut and Thurow, Marc and Slomma, Dominik},
  booktitle={2022 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR)},
  title={PatchMatch-Stereo-Panorama, a fast dense reconstruction from 360° video images},
  year={2022},
  volume={},
  number={},
  pages={366-372},
  doi={10.1109/SSRR56537.2022.10018698}}

The paper can be found here.
The code can be found here.

Further reading

This part is from the original StellaVSLAM readme, which remains fully functional for non-dense operations. Notably, the documentation related to the configuration YAML file is entirely applicable. It's important to emphasize that when utilizing the dense feature, only the socket_publisher, msgpack format, and monocular equirectangular inputs are supported.

stella_vslam

CI Documentation Status License


NOTE: This is a community fork of xdspacelab/openvslam. It was created to continue active development of OpenVSLAM on Jan 31, 2021. The original repository is no longer available. Please read the official statement of termination carefully and understand it before using this. The similarities with ORB_SLAM2 in the original version have been removed by #252. If you find any other issues with the license, please point them out. See #37 and #249 for discussion so far.

Versions earlier than 0.3 are deprecated. If you use them, please use them as a derivative of ORB_SLAM2 under the GPL license.


Overview

[PrePrint]

stella_vslam is a monocular, stereo, and RGBD visual SLAM system.

Features

The notable features are:

  • It is compatible with various type of camera models and can be easily customized for other camera models.
  • Created maps can be stored and loaded, then stella_vslam can localize new images based on the prebuilt maps.
  • The system is fully modular. It is designed by encapsulating several functions in separated components with easy-to-understand APIs.
  • We provided some code snippets to understand the core functionalities of this system.

One of the noteworthy features of stella_vslam is that the system can deal with various type of camera models, such as perspective, fisheye, and equirectangular. If needed, users can implement extra camera models (e.g. dual fisheye, catadioptric) with ease. For example, visual SLAM algorithm using equirectangular camera models (e.g. RICOH THETA series, insta360 series, etc) is shown above.

We provided documentation for installation and tutorial. The repository for the ROS wrapper is stella_vslam_ros.

Acknowledgements

OpenVSLAM is based on an indirect SLAM algorithm with sparse features, such as ORB-SLAM/ORB-SLAM2, ProSLAM, and UcoSLAM. The core architecture is based on ORB-SLAM/ORB-SLAM2 and the code has been redesigned and written from scratch to improve scalability, readability, performance, etc. UcoSLAM has implemented the parallelization of feature extraction, map storage and loading earlier. ProSLAM has implemented a highly modular and easily understood system earlier.

Examples

Some code snippets to understand the core functionalities of the system are provided. You can employ these snippets for in your own programs. Please see the *.cc files in ./example directory or check Simple Tutorial and Example.

Installation

Please see Installation chapter in the documentation.

The instructions for Docker users are also provided.

Tutorial

Please see Simple Tutorial chapter in the documentation.

A sample ORB vocabulary file can be downloaded from here. Sample datasets are also provided at here.

If you would like to run visual SLAM with standard benchmarking datasets (e.g. KITTI Odometry dataset), please see SLAM with standard datasets section in the documentation.

Community

Please contact us via GitHub Discussions if you have any questions or notice any bugs about the software.

Currently working on

  • Refactoring
  • Algorithm changes and parameter additions to improve performance
  • Add tests
  • Marker integration
  • Implementation of extra camera models
  • Python bindings
  • IMU integration

The higher up the list, the higher the priority. Feedbacks, feature requests, and contribution are welcome!

License

2-clause BSD license (see LICENSE)

The following files are derived from third-party libraries.

Please use g2o as the dynamic link library because csparse_extension module of g2o is LGPLv3+.

Authors of the original version of OpenVSLAM

Citation of original version of OpenVSLAM

OpenVSLAM won first place at ACM Multimedia 2019 Open Source Software Competition.

If OpenVSLAM helps your research, please cite the paper for OpenVSLAM. Here is a BibTeX entry:

@inproceedings{openvslam2019,
  author = {Sumikura, Shinya and Shibuya, Mikiya and Sakurada, Ken},
  title = {{OpenVSLAM: A Versatile Visual SLAM Framework}},
  booktitle = {Proceedings of the 27th ACM International Conference on Multimedia},
  series = {MM '19},
  year = {2019},
  isbn = {978-1-4503-6889-6},
  location = {Nice, France},
  pages = {2292--2295},
  numpages = {4},
  url = {http://doi.acm.org/10.1145/3343031.3350539},
  doi = {10.1145/3343031.3350539},
  acmid = {3350539},
  publisher = {ACM},
  address = {New York, NY, USA}
}

The preprint can be found here.

Reference

  • Raúl Mur-Artal, J. M. M. Montiel, and Juan D. Tardós. 2015. ORB-SLAM: a Versatile and Accurate Monocular SLAM System. IEEE Transactions on Robotics 31, 5 (2015), 1147–1163.
  • Raúl Mur-Artal and Juan D. Tardós. 2017. ORB-SLAM2: an Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras. IEEE Transactions on Robotics 33, 5 (2017), 1255–1262.
  • Dominik Schlegel, Mirco Colosi, and Giorgio Grisetti. 2018. ProSLAM: Graph SLAM from a Programmer’s Perspective. In Proceedings of IEEE International Conference on Robotics and Automation (ICRA). 1–9.
  • Rafael Muñoz-Salinas and Rafael Medina Carnicer. 2019. UcoSLAM: Simultaneous Localization and Mapping by Fusion of KeyPoints and Squared Planar Markers. arXiv:1902.03729.
  • Mapillary AB. 2019. OpenSfM. https://github.com/mapillary/OpenSfM.
  • Giorgio Grisetti, Rainer Kümmerle, Cyrill Stachniss, and Wolfram Burgard. 2010. A Tutorial on Graph-Based SLAM. IEEE Transactions on Intelligent Transportation SystemsMagazine 2, 4 (2010), 31–43.
  • Rainer Kümmerle, Giorgio Grisetti, Hauke Strasdat, Kurt Konolige, and Wolfram Burgard. 2011. g2o: A general framework for graph optimization. In Proceedings of IEEE International Conference on Robotics and Automation (ICRA). 3607–3613.

About

Introducing a breakthrough real-time 3D reconstruction method for 360° action cams on small UAVs, enhancing StellaVSLAM with PatchMatch-Stereo for equirectangular video, optimized for low latency, and delivering superior accuracy on consumer-grade laptops with mobile GPUs.

Resources

License

Unknown and 2 other licenses found

Licenses found

Unknown
LICENSE
BSD-2-Clause
LICENSE.fork
Unknown
LICENSE.original

Stars

Watchers

Forks

Packages

No packages published

Languages

  • C++ 93.6%
  • CMake 4.1%
  • Python 1.1%
  • Dockerfile 0.6%
  • C 0.3%
  • Shell 0.2%
  • Batchfile 0.1%