📣 📣 📣 The paper is now available.
Point cloud registration is a prerequisite for many applications in computer vision and robotics. Most existing methods focus on the registration of point clouds with high overlap. While some learning-based methods address low overlap cases, they struggle in out-of-distribution scenarios with extremely low overlap ratios. This paper introduces a novel framework dubbed L-PR, designed to register unordered low overlap multiview point clouds leveraging LiDAR fiducial markers. We refer to them as LiDAR fiducial markers, but they are the same as the popular AprilTag and ArUco markers, thin sheets of paper that do not affect the 3D geometry of the environment. We first propose an improved adaptive threshold marker detection method to provide robust detection results when the viewpoints among point clouds change dramatically. Then, we formulate the unordered multiview point cloud registration problem as a maximum a posteriori (MAP) problem and develop a framework consisting of two levels of graphs to address it. The first-level graph, constructed as a weighted graph, is designed to efficiently and optimally infer initial values of scan poses from the unordered set. The second-level graph is constructed as a factor graph. By globally optimizing the variables on the graph, including scan poses, marker poses, and marker corner positions, we tackle the MAP problem. We perform both qualitative and quantitative experiments to demonstrate that the proposed method outperforms previous state-of-the-art (SOTA) methods in addressing challenging low overlap cases. Specifically, the proposed method can serve as a convenient, efficient, and low-cost tool for applications such as 3D asset collection from sparse scans, training data collection in unseen scenes, reconstruction of degraded scenes, and merging large-scale low overlap 3D maps, which existing methods struggle with.
We also collect a new dataset named Livox-3DMatch using L-PR and incorporate it into the training of the SOTA learning-based methods, which brings evident improvements for them across various benchmarks.
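As an illustration of how a weighted graph can order an unordered set of scans, the sketch below runs a shortest-path search over a small scan-marker graph. Everything in it is an assumption made for illustration: the node names, the edge costs, and the idea that an edge links a scan to a marker it observes are hypothetical and are not taken from the L-PR implementation.

```python
import heapq


def shortest_paths(edges, source):
    """Dijkstra over an undirected weighted graph.

    edges: iterable of (node_a, node_b, cost) with non-negative cost.
    Returns {node: (total_cost, [source, ..., node])} for reachable nodes.
    """
    adjacency = {}
    for a, b, cost in edges:
        adjacency.setdefault(a, []).append((b, cost))
        adjacency.setdefault(b, []).append((a, cost))

    best = {source: (0.0, [source])}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best[node][0]:
            continue  # stale queue entry, a cheaper route was already found
        for neighbour, step in adjacency.get(node, []):
            candidate = cost + step
            if neighbour not in best or candidate < best[neighbour][0]:
                best[neighbour] = (candidate, best[node][1] + [neighbour])
                heapq.heappush(queue, (candidate, neighbour))
    return best


# Hypothetical observations: (scan, marker, cost of that detection).
observations = [
    ("scan1", "tag0", 0.2),
    ("scan2", "tag0", 0.3),
    ("scan2", "tag1", 0.1),
    ("scan3", "tag1", 0.4),
    ("scan1", "tag2", 0.9),
    ("scan3", "tag2", 0.9),
]
paths = shortest_paths(observations, "scan1")
print(paths["scan3"][1])
```

Under these assumptions, the path from the reference scan to any other scan gives a chain of scan-marker observations along which relative poses could be composed into an initial estimate. Since networkx is among the required packages, `networkx.dijkstra_path` would be the equivalent library call.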
We develop an improved adaptive threshold marker detection method to provide robust detection results when the viewpoints among point clouds change dramatically.
Given that the existing point cloud registration benchmark lacks fiducial markers in the scenes, we construct a new test dataset, as shown in the following figure, with the Livox MID-40. The competitors are MDGD (RA-L'24), SGHR (CVPR'23), SE3ET (RA-L'24), GeoTrans (TPAMI'23), and Teaser++ (T-RO'20).
- Ground Truth (Mercedes-Benz GLB).
In the following, the point cloud is normalized into a unit sphere (i.e. [-1,1]). The first metric is the Chamfer Distance (CD). Given two point sets, the CD is the sum of the squared distance of each point to the nearest point in the other point set:
That is, a smaller CD value indicates a higher fidelity.
The second metric is the recall of the ground truth points from the reconstructed shape, which is defined as:
A higher Recall indicates a higher fidelity.
- Ours.
CD: 0.0030. Recall: 96.22%
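The two metrics can be sketched in a few lines of NumPy. The Chamfer Distance follows the verbal definition above (a sum of squared nearest-neighbour distances in both directions). The recall function is an assumption: it counts the fraction of ground truth points whose nearest reconstructed point lies within a threshold `tau`, and both that form and the name `tau` are hypothetical, since the exact definition is not reproduced in this text. Inputs are assumed to be already normalized, and the brute-force distance matrix is only practical for small point sets.

```python
import numpy as np


def nearest_sq_dists(src, dst):
    """Squared distance from each point in src to its nearest point in dst."""
    diff = src[:, None, :] - dst[None, :, :]  # shape (len(src), len(dst), 3)
    return np.min(np.sum(diff ** 2, axis=2), axis=1)


def chamfer_distance(a, b):
    """Sum of squared nearest-neighbour distances, taken in both directions."""
    return float(nearest_sq_dists(a, b).sum() + nearest_sq_dists(b, a).sum())


def recall(ground_truth, reconstruction, tau):
    """Assumed form: share of ground truth points within tau of the reconstruction."""
    dists = np.sqrt(nearest_sq_dists(ground_truth, reconstruction))
    return float(np.mean(dists <= tau))


# Tiny hypothetical example: the second point is displaced by 0.1 along z.
gt = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
rec = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.1]])
cd = chamfer_distance(gt, rec)
rc = recall(gt, rec, tau=0.05)
```

Some implementations average over the points instead of summing; this sketch sums because that is how the metric is described here. For full-size scans, a KD-tree nearest-neighbour query would replace the dense distance matrix.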
We collect a new training dataset called Livox-3DMatch using the proposed L-PR. Livox-3DMatch augments the original 3DMatch training data from 14,400 pairs to 17,700 pairs (a 22.91% increase). By training on this augmented dataset, the performance of SGHR is improved by 2.90% on 3DMatch, 4.29% on ETH, and 22.72% (translation) / 11.19% (rotation) on ScanNet.
To reproduce the results in the above table, we recommend reproducing SGHR first, including its training and testing. Then, download our Livox3DMatch dataset (it is already mixed with the original 3DMatch_train) and extract the folder into ./data. Rename it to 3DMatch_train if you do not want to make any modifications to SGHR. Before running Train.py, remember to replace the dataset.py of SGHR with our dataset.py. Finally, train and test the model. Note that the SGHR weight (trained on 3DMatch_train only) is already downloaded when you git clone SGHR. The model trained on Livox3DMatch, named yoho_my, is available in this repository. You can simply replace the original weights with ours for a quick test.
The competitor is An Efficient Visual SfM Framework Using Planar Markers (SfM-M). This scenario has repetitive structures and weak geometric features. We attach thirteen 16.4 cm x 16.4 cm AprilTags to the wall. The LiDAR scans the scene from 11 viewpoints. We also captured 72 images with an iPhone 13 to use as input for SfM-M. The ground truth trajectories are given by an OptiTrack Motion Capture system. The proposed approach achieves better localization accuracy, which is expected given that LiDAR is a ranging sensor.
You need to apply this algorithm to localize fiducials on a 3D map.
- GTSAM
First, please ensure you can run the Python demos provided by GTSAM.
We found that GTSAM cannot be installed properly in a conda environment, so a conda environment is not recommended. Otherwise, errors will be reported when you use BearingRangeFactor3D or BetweenFactorPose3.
- IILFM
A light and insertable version of IILFM is included in the files. To build and run it, you need to install basic tools like cmake.
If you'd like to give it a quick try, just follow the commands below. Note, however, that the default settings correspond to the test that reconstructs a lab scene from 11 frames with added AprilTags (i.e., the default detector is an AprilTag detector and the marker size is 16.4 cm × 16.4 cm). If you want to try the ArUco demo (marker size: 69.2 cm × 69.2 cm), which involves reconstructing a vehicle, you will need to adjust the detector and marker size settings as described in the readme in iilfm. Also, do not forget to change the marker size in main.py.
- Python Packages
The following packages are required.
AprilTag
opencv-python
networkx
open3d
scipy
matplotlib
opencv-python (cv2) has a built-in ArUco detector. Please ensure you can run the Python demos of ArUco detection. Again, a conda environment is not recommended.
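Before building anything, it can help to confirm that the listed dependencies are visible to the interpreter you plan to use. The sketch below is one way to do that with the standard library only. The import names in `REQUIRED` are assumptions, because pip package names and import names can differ (opencv-python is imported as cv2, and the import name for the AprilTag package is assumed to be `apriltag`).

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Assumed import names for the dependencies listed above.
REQUIRED = ["gtsam", "apriltag", "cv2", "networkx", "open3d", "scipy", "matplotlib"]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

A module being found does not guarantee that it works, so running the GTSAM and ArUco demos mentioned above remains the real check.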
- The extracted point clouds of vehicles and the script to run the quantitative evaluation are available here.
- All the LiDAR scans used in this work are available here.
- You need to download the LiDAR scans, rename the folder you are interested in to 'pc', and place it in the same directory as 'main.py'. Please read the instructions of IILFM carefully to ensure that you are using the correct scripts, as some scans contain AprilTag markers while others contain ArUco markers.
- Raw rosbag of the instance reconstruction evaluation.
- Raw rosbag collected in the degraded scene.
git clone https://github.com/yorklyb/LiDAR-SFM.git
cd L-PR
cd lpr
cd iilfm
mkdir build
cd build
cmake ..
make
mv tag_detection ../../
Ensure that the point clouds are named '1.pcd', '2.pcd', etc., and are placed in a folder named 'pc'. The 'pc' folder should be located in the same directory as the 'main.py' file.
python3 main.py
After processing all the point clouds, you will see a graph plot. Close it by pressing 'q' or using the close button. Finally, an output file named 'out.pcd' will be generated.
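The naming requirement for the 'pc' folder can be checked before running main.py. The helper below is a minimal sketch and not part of the repository: the function name `check_pc_folder` is hypothetical, and it assumes the scans must be numbered consecutively from 1, which is one reading of "'1.pcd', '2.pcd', etc.".

```python
from pathlib import Path


def check_pc_folder(folder):
    """Return the sorted scan indices if the files are named 1.pcd ... N.pcd."""
    indices = []
    for path in Path(folder).glob("*.pcd"):
        if not path.stem.isdigit():
            raise ValueError(f"unexpected file name: {path.name}")
        indices.append(int(path.stem))
    if not indices:
        raise ValueError(f"no .pcd files found in {folder}")
    indices.sort()
    # Assumption: numbering is consecutive and starts at 1.
    if indices != list(range(1, len(indices) + 1)):
        raise ValueError(f"expected 1.pcd to {len(indices)}.pcd, found {indices}")
    return indices


if __name__ == "__main__":
    if Path("pc").is_dir():
        print("Scans found:", check_pc_folder("pc"))
```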
First, you need to record the point cloud rostopic as rosbags. If you are using a Livox MID-40, run rosbag record /livox/lidar in the terminal while the Livox-ros-driver is running. Then, assume that you placed the LiDAR at N viewpoints and obtained N rosbags. You need to put all of them in a folder named 'data'. Check this out as an example.
Suppose that you have already run:
git clone https://github.com/yorklyb/LiDAR-SFM.git
cd L-PR
cd merge
mkdir build
cd build
cmake ..
make
Put process.py into build. Also, put 'data' into build.
Open a terminal and run roscore.
Open a new terminal and run python3 process.py. You will find that the rosbags have been converted into pcd files in the folder 'processed'. Rename the folder to 'pc'.
If you are using another LiDAR model, you need to change the rostopic name when recording rosbags using rosbag record your_topic_name and also update the topic in process.py accordingly.
We would like to express our gratitude to Shiqi Li for helping us reproduce MDGD, Haiping Wang for helping us reproduce SGHR, Xin Zheng for helping us reproduce Traj LO, and Jie Li and Hao Fan for helping us reproduce An Efficient Visual SfM Framework Using Planar Markers (SfM-M). We also thank Honpei Yin and Jiahe Cui for helping us reproduce LOAM Livox, and Hao Wang, Yida Zang, Hunter Schofield, and Hassan Alkomy for their assistance in experiments. Additionally, we are grateful to Han Wang, Binbin Xu, Yuan Ren, Jianping Li, Yicong Fu, and Brian Lynch for constructive discussions.