The code for our paper for RAL/IROS 2022:
OverlapTransformer: An Efficient and Yaw-Angle-Invariant Transformer Network for LiDAR-Based Place Recognition. [paper]
OverlapTransformer (OT) is a novel lightweight neural network that exploits LiDAR range images to estimate LiDAR similarity quickly: less than 4 ms per frame in Python and less than 2 ms per frame in C++. It is the successor to our previous OverlapNet and is faster and more accurate in LiDAR-based loop closure detection and place recognition.
Developed by Junyi Ma, Xieyuanli Chen and Jun Zhang.
OverlapTransformer is not a sophisticated model, but it has natural mathematical properties, in a lightweight form, for surround-view observations. You are welcome to post results in the issues if you have tried other input types (e.g., RGBD camera, Livox, 16/32-beam LiDAR).
[2023-09] The multi-view extension of OT, CVTNet, is accepted by IEEE Transactions on Industrial Informatics (TII)! A better long-term recognition performance is available ⭐
[2022-12] SeqOT is accepted by IEEE Transactions on Industrial Electronics (TIE)!
[2022-09] We further develop a sequence-enhanced version of OT named SeqOT, which can be found here.
Fig. 1 An online demo for finding the top-1 candidate with OverlapTransformer on sequence 1-1 (database) and 1-3 (query) of the Haomo Dataset.
Fig. 2 The Haomo Dataset, collected by HAOMO.AI.
More details of the Haomo Dataset can be found in the dataset description (link).
- Introduction and Haomo Dataset
- Publication
- Dependencies
- How to Use
- Datasets Used by OT
- Related Work
- License
If you use the code or the Haomo dataset in your academic work, please cite our paper (PDF):
@ARTICLE{ma2022ral,
author={Ma, Junyi and Zhang, Jun and Xu, Jintao and Ai, Rui and Gu, Weihao and Chen, Xieyuanli},
journal={IEEE Robotics and Automation Letters},
title={OverlapTransformer: An Efficient and Yaw-Angle-Invariant Transformer Network for LiDAR-Based Place Recognition},
year={2022},
volume={7},
number={3},
pages={6958-6965},
doi={10.1109/LRA.2022.3178797}}
We use pytorch-gpu for neural networks.
An NVIDIA GPU is needed for faster retrieval. OverlapTransformer is also fast enough when running the neural network on a CPU.
To use a GPU, first install the NVIDIA driver and CUDA.
- CUDA Installation guide: link
  We use CUDA 11.3 in our work. Other versions of CUDA are also supported, but you should choose the corresponding torch version in the Torch dependencies below.
- System dependencies:
  sudo apt-get update
  sudo apt-get install -y python3-pip python3-tk
  sudo -H pip3 install --upgrade pip
- Torch dependencies:
  Following this link, you can install the Torch dependencies by pip:
  pip3 install torch==1.10.2+cu113 torchvision==0.11.3+cu113 torchaudio==0.10.2+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
  or by conda:
  conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
- Other Python dependencies (may also work with versions other than those listed in the requirements file):
  sudo -H pip3 install -r requirements.txt
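After installing the dependencies, it can help to confirm that PyTorch is importable and that it sees the GPU. The following is a minimal sketch, not part of this repository; the function name `describe_torch_env` is a hypothetical helper introduced only for illustration.

```python
import importlib.util


def describe_torch_env():
    """Report whether PyTorch is importable and whether it can see a CUDA device."""
    info = {
        "torch_installed": False,
        "torch_version": None,
        "cuda_build": None,
        "cuda_available": False,
    }
    if importlib.util.find_spec("torch") is None:
        return info  # PyTorch is not installed in this interpreter

    import torch

    info["torch_installed"] = True
    info["torch_version"] = torch.__version__
    # CUDA version the wheel was built against; None for CPU-only builds.
    info["cuda_build"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_torch_env().items():
        print(f"{key}: {value}")
```

If `cuda_available` is False while `cuda_build` is set, the driver or CUDA installation is the likely place to look; the network can still run on the CPU.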
We provide training and test tutorials for KITTI sequences in this repository. The tutorials for the Haomo dataset will be released together with the complete Haomo dataset.
We recommend that you follow our code and data structure, shown below.
├── config
│   ├── config_haomo.yml
│   └── config.yml
├── modules
│   ├── loss.py
│   ├── netvlad.py
│   ├── overlap_transformer_haomo.py
│   └── overlap_transformer.py
├── test
│   ├── test_haomo_topn_prepare.py
│   ├── test_haomo_topn.py
│   ├── test_kitti00_prepare.py
│   ├── test_kitti00_PR.py
│   ├── test_kitti00_topN.py
│   ├── test_results_haomo
│   │   └── predicted_des_L2_dis_bet_traj_forward.npz (to be generated)
│   └── test_results_kitti
│       └── predicted_des_L2_dis.npz (to be generated)
├── tools
│   ├── read_all_sets.py
│   ├── read_samples_haomo.py
│   ├── read_samples.py
│   └── utils
│       ├── gen_depth_data.py
│       ├── split_train_val.py
│       └── utils.py
├── train
│   ├── training_overlap_transformer_haomo.py
│   └── training_overlap_transformer_kitti.py
├── valid
│   └── valid_seq.py
├── visualize
│   ├── des_list.npy
│   └── viz_haomo.py
└── weights
    ├── pretrained_overlap_transformer_haomo.pth.tar
    └── pretrained_overlap_transformer.pth.tar
In the file config.yaml, the parameters of data_root are described as follows:
data_root_folder (KITTI sequences root) follows:
├── 00
│   ├── depth_map
│   │   ├── 000000.png
│   │   ├── 000001.png
│   │   ├── 000002.png
│   │   └── ...
│   └── overlaps
│       └── train_set.npz
├── 01
├── 02
├── ...
├── 10
└── loop_gt_seq00_0.3overlap_inactive.npz
valid_scan_folder (KITTI sequence 02 velodyne) contains:
├── 000000.bin
├── 000001.bin
...
gt_valid_folder (KITTI sequence 02 computed overlaps) contains:
├── 02
│ ├── overlap_0.npy
│ ├── overlap_10.npy
...
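Before training, the data_root_folder layout described above can be checked with a few lines of Python. This is a sketch under one assumption: that every sequence you list follows the same layout as 00 (a depth_map folder with .png range images and overlaps/train_set.npz). The function name `check_data_root` is hypothetical and not part of this repository.

```python
from pathlib import Path


def check_data_root(data_root, sequences):
    """Return a list of human-readable problems found under data_root (empty if none)."""
    root = Path(data_root)
    problems = []
    for seq in sequences:
        depth_dir = root / seq / "depth_map"
        overlaps_file = root / seq / "overlaps" / "train_set.npz"
        if not depth_dir.is_dir():
            problems.append(f"{seq}: missing folder {depth_dir}")
        elif not any(depth_dir.glob("*.png")):
            problems.append(f"{seq}: no .png range images in {depth_dir}")
        if not overlaps_file.is_file():
            problems.append(f"{seq}: missing file {overlaps_file}")
    return problems
```

For example, `check_data_root("/path/to/data_root_folder", ["00", "02"])` returns an empty list when both sequences are complete.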
You need to download or generate the following files and put them in the right places in the structure above:
- You can find the ground truth for KITTI 00 here: loop_gt_seq00_0.3overlap_inactive.npz
- You can find gt_valid_folder for sequence 02 here.
- Since the full KITTI sequences are very large, we do not provide the range images. We recommend that you generate range images such as 00/depth_map/000000.png with the preprocessing from Overlap_Localization or its C++ version. Please note that OverlapTransformer uses .png images instead of the .npy files saved by Overlap_Localization.
- More directly, you can generate .png range images with the script from OverlapNet, updated by us.
- The overlaps folder of each sequence below data_root_folder is provided by the authors of OverlapNet here. You should rename the files to train_set.npz.
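For readers unfamiliar with range images, the sketch below shows the usual spherical projection of a LiDAR scan onto a 2D image. It is an illustration only, not the preprocessing script referred to above. The image size (64 x 900), the vertical field of view (+3 to -25 degrees) and the 50 m range cut-off are assumptions chosen for a 64-beam sensor; check the actual script and config for the values used in this work. Writing the result to a .png file is left out.

```python
import numpy as np


def load_kitti_scan(bin_path):
    """Read a KITTI velodyne .bin file as an (N, 4) float32 array: x, y, z, remission."""
    return np.fromfile(bin_path, dtype=np.float32).reshape(-1, 4)


def range_projection(points, height=64, width=900,
                     fov_up_deg=3.0, fov_down_deg=-25.0, max_range=50.0):
    """Project (N, 3+) points to a (height, width) range image; empty pixels are -1."""
    xyz = points[:, :3]
    depth = np.linalg.norm(xyz, axis=1)
    keep = (depth > 0) & (depth < max_range)
    xyz, depth = xyz[keep], depth[keep]

    fov_down = np.deg2rad(fov_down_deg)
    fov = np.deg2rad(fov_up_deg) - fov_down

    yaw = -np.arctan2(xyz[:, 1], xyz[:, 0])   # horizontal angle
    pitch = np.arcsin(xyz[:, 2] / depth)      # vertical angle

    # Normalise both angles to [0, 1] and scale to pixel coordinates.
    col = 0.5 * (yaw / np.pi + 1.0) * width
    row = (1.0 - (pitch - fov_down) / fov) * height
    col = np.clip(np.floor(col), 0, width - 1).astype(np.int64)
    row = np.clip(np.floor(row), 0, height - 1).astype(np.int64)

    # Keep the nearest point when several points fall into the same pixel.
    image = np.full((height, width), np.inf, dtype=np.float32)
    np.minimum.at(image, (row, col), depth.astype(np.float32))
    image[np.isinf(image)] = -1.0
    return image
```

A scan would then be projected with `range_projection(load_kitti_scan("000000.bin"))`. In this representation a pure yaw rotation of the sensor corresponds to a circular shift of the image columns.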
For quick use, you can download our model pretrained on KITTI. The following two files should also be downloaded:
- calib_file: calibration file from KITTI 00.
- poses_file: pose file from KITTI 00.
Then you should modify demo1_config in the file config.yaml.
Run the demo by:
cd demo
python ./demo_compute_overlap_sim.py
You can see a query scan (000000.bin of KITTI 00) with a reprojected positive sample (000005.bin of KITTI 00) and a reprojected negative sample (000015.bin of KITTI 00), and the corresponding similarity.
Fig. 3 Demo for calculating overlap and similarity with our approach.
In the file config.yaml, training_seqs specifies the KITTI sequences used for training.
You can start the training with
cd train
python ./training_overlap_transformer_kitti.py
You can also resume training from our pretrained model here.
Once a model has been trained, the performance of the network can be evaluated. Before testing, the following parameters should be set in config.yaml:
- test_seqs: sequence number for evaluation, which is "00" in our work.
- test_weights: path of the pretrained model.
- gt_file: path of the ground truth file provided by the author of OverlapNet, which can be downloaded here.
Then you can run the testing scripts as follows:
cd test
mkdir test_results_kitti
python test_kitti00_prepare.py
python test_kitti00_PR.py
python test_kitti00_topN.py
After you run test_kitti00_prepare.py, a file named predicted_des_L2_dis.npz is generated in test_results_kitti. It is used by test_kitti00_PR.py to calculate the PR curve and F1max, and by test_kitti00_topN.py to calculate top-N recall.
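To make the two metrics concrete, here is a small self-contained sketch of how F1max and top-N recall can be computed from descriptor distances. It is not the code of test_kitti00_PR.py or test_kitti00_topN.py: the function names, the input formats and the tie handling are assumptions made for illustration, and the repository scripts may filter candidates or choose thresholds differently.

```python
import math


def f1_max(distances, is_loop):
    """Maximum F1 over all thresholds; a pair is predicted a loop if distance <= threshold."""
    pairs = sorted(zip(distances, is_loop))
    total_pos = sum(1 for _, label in pairs if label)
    best, tp, fp = 0.0, 0, 0
    for i, (dist, label) in enumerate(pairs):
        if label:
            tp += 1
        else:
            fp += 1
        # Evaluate only once per distinct distance value.
        if i + 1 < len(pairs) and pairs[i + 1][0] == dist:
            continue
        precision = tp / (tp + fp)
        recall = tp / total_pos if total_pos else 0.0
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


def top_n_recall(query_des, db_des, gt_matches, n=1):
    """Fraction of queries whose n nearest database descriptors (L2) contain a true match.

    gt_matches maps a query index to the set of database indices that count as correct.
    Queries without any ground-truth match are skipped.
    """
    hits, counted = 0, 0
    for q_idx, q in enumerate(query_des):
        truth = gt_matches.get(q_idx, set())
        if not truth:
            continue
        ranked = sorted(range(len(db_des)), key=lambda j: math.dist(q, db_des[j]))
        counted += 1
        if truth.intersection(ranked[:n]):
            hits += 1
    return hits / counted if counted else 0.0
```

Both functions use only the standard library (`math.dist` needs Python 3.8 or newer), so they are convenient for sanity checks on a handful of descriptors; for full sequences a vectorized implementation is preferable.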
For a quick test of the training and testing procedures, you could use our pretrained model.
First, to visualize the evaluation on KITTI 00 with search space, the following three files should be downloaded:
- calib_file: calibration file from KITTI 00.
- poses_file: pose file from KITTI 00.
- cov_file: covariance file from SuMa++ on KITTI 00.
Then modify the paths in the file config.yaml and run:
cd visualize
python viz_kitti.py
Fig. 4 Evaluation on KITTI 00 with search space from SuMa++ (a semantic LiDAR SLAM method).
We also provide a visualization demo for the Haomo dataset (Fig. 1), to be used once the Haomo dataset is released. Please first download the descriptors of the database (sequence 1-1 of the Haomo dataset) and then run:
cd visualize
python viz_haomo.py
We provide a C++ implementation of OverlapTransformer with libtorch for faster retrieval.
- Please download .pt and put it in the OT_libtorch folder.
- Before building, make sure that PCL exists in your environment.
- Here we use LibTorch for CUDA 11.3 (Pre-cxx11 ABI). Please modify the path of Torch_DIR in CMakeLists.txt.
- For more details of LibTorch installation, please check this website.
Then you can generate a descriptor of 000000.bin of KITTI 00 by
cd OT_libtorch/ws
mkdir build
cd build/
cmake ..
make -j6
./fast_ot
You will find that our C++ OT generates a descriptor in less than 2 ms per frame.
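If you want to compare this with the Python model on your own hardware, a simple timing helper is enough. The sketch below is generic and not taken from this repository: `mean_ms_per_call` is a hypothetical name, `fn` stands for whatever callable produces a descriptor from one frame, and `sync` is an optional callable to wait for pending GPU work.

```python
import time


def mean_ms_per_call(fn, frames, warmup=5, sync=None):
    """Average wall-clock milliseconds that fn(frame) takes over the given frames."""
    frames = list(frames)
    if not frames:
        raise ValueError("frames must not be empty")
    for frame in frames[:warmup]:
        fn(frame)  # warm-up calls are not timed
    if sync is not None:
        sync()
    start = time.perf_counter()
    for frame in frames:
        fn(frame)
    if sync is not None:
        sync()  # wait for queued GPU work before stopping the clock
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(frames)
```

With PyTorch on a GPU, passing `sync=torch.cuda.synchronize` avoids measuring only the kernel launch time, since CUDA calls return before the work has finished.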
In this section, we list the files of the different datasets used by OT for quick reference.
KITTI is used to validate the place recognition performance in our paper. Currently we have released all the necessary files for evaluation on KITTI.
- Pretrained model: pretrained_overlap_transformer.pth.tar
- Overlaps folder of each sequence for training: train_set_from_overlapnet.zip.
- Validation folder from sequence 02: computed_overlap_02.zip.
- The ground truth for sequence 00 for testing: loop_gt_seq00_0.3overlap_inactive.npz (You can follow this issue to generate this file yourself.)
- Calibration file from the original benchmark (00): calib.txt
- Pose file from the original benchmark (00): 00.txt
- Covariance file from SuMa++ (00): covariance_2nd.txt
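The .npz files listed above can be inspected without knowing their internal array names in advance. The sketch below lists every array in an archive with its shape and dtype; `summarize_npz` is a hypothetical helper, and nothing here assumes how the ground truth is organized inside the file.

```python
import numpy as np


def summarize_npz(path):
    """Return {array_name: (shape, dtype)} for every array stored in an .npz archive."""
    summary = {}
    # allow_pickle=True is required for object arrays; only use it on files you trust.
    with np.load(path, allow_pickle=True) as archive:
        for name in archive.files:
            array = archive[name]
            summary[name] = (array.shape, str(array.dtype))
    return summary
```

Calling `summarize_npz("loop_gt_seq00_0.3overlap_inactive.npz")` after downloading the file shows which keys to use when loading it in your own scripts.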
Ford is used to validate the generalization ability with zero-shot transfer in our paper. Currently we have released all the necessary preprocessed Ford files, but not the evaluation code, which is similar to that for KITTI. You just need to follow our existing scripts.
- The overlap-based ground truth for sequence 00 for testing: loop_gt_seq00_0.3overlap_inactive.npz
- The distance-based ground truth for sequence 00 for testing: loop_gt_seq00_10distance_inactive.npz
- Calibration file from the original benchmark (00): calib.txt
- Pose file from the original benchmark (00): poses.txt
- Covariance file from SuMa++ (00): covariance_2nd.txt
You can find the detailed description of Haomo dataset here.
You can find our more recent LiDAR place recognition approaches below, which have better performance on larger time gaps.
- SeqOT: spatial-temporal network using sequential LiDAR data (IEEE TIE 2022)
@ARTICLE{ma2022tie,
author={Ma, Junyi and Chen, Xieyuanli and Xu, Jingyi and Xiong, Guangming},
journal={IEEE Transactions on Industrial Electronics},
title={SeqOT: A Spatial-Temporal Transformer Network for Place Recognition Using Sequential LiDAR Data},
year={2022},
doi={10.1109/TIE.2022.3229385}}
- CVTNet: cross-view Transformer network using RIVs and BEVs (IEEE TII 2023)
@ARTICLE{10273716,
author={Ma, Junyi and Xiong, Guangming and Xu, Jingyi and Chen, Xieyuanli},
journal={IEEE Transactions on Industrial Informatics},
title={CVTNet: A Cross-View Transformer Network for LiDAR-Based Place Recognition in Autonomous Driving Environments},
year={2023},
doi={10.1109/TII.2023.3313635}}
Copyright 2022, Junyi Ma, Xieyuanli Chen, Jun Zhang, HAOMO.AI Technology Co., Ltd., China.
This project is free software made available under the GPL v3.0 License. For details see the LICENSE file.