This repository contains the code for the paper "Causal Transformer for Fusion and Pose Estimation in Deep Visual Inertial Odometry".
Make sure you have installed the correct PyTorch version for your development environment.
The other requirements can be installed via pip.
# clone project
git clone https://github.com/ybkurt/VIFT
cd VIFT
# [OPTIONAL] create conda environment
conda create -n vift python=3.9
conda activate vift
# install pytorch according to instructions
# https://pytorch.org/get-started/
# install requirements
pip install -r requirements.txt
cd data
sh data_prep.sh
This script will put the KITTI dataset under the data/kitti_data folder.
You need to use configs/data/kitti_vio.yaml for dataloading.
You need to use configs/model/vio.yaml to train models with the KITTI dataset.
You can change the net field in the config to target your own nn.Module object. The requirements for proper functionality are as follows:
- See src/models/components/vio_simple_dense_net.py for an example nn.Module object consisting of multiple linear layers and ReLU activations.
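As an illustration of what "targeting" a class from a config can mean, the sketch below resolves a dotted import path to a class and instantiates it with the remaining keys. It assumes the config follows Hydra's `_target_` key convention, which this README does not state explicitly. The helpers `resolve_target` and `instantiate` are hypothetical names written for this example, and a standard-library class stands in for the network so the snippet runs without the repository.

```python
import importlib


def resolve_target(dotted_path):
    """Return the object named by a dotted path such as 'package.module.ClassName'."""
    module_path, _, attr_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"expected 'module.attribute', got {dotted_path!r}")
    module = importlib.import_module(module_path)
    return getattr(module, attr_name)


def instantiate(config):
    """Build an object from a dict holding a '_target_' path plus keyword arguments."""
    kwargs = {key: value for key, value in config.items() if key != "_target_"}
    return resolve_target(config["_target_"])(**kwargs)


# Stand-in for a 'net' entry; a real entry would point at your own nn.Module class.
net_config = {"_target_": "fractions.Fraction", "numerator": 3, "denominator": 4}
obj = instantiate(net_config)
print(type(obj).__name__, obj)
```

In the actual project the path would point at a class such as the one defined in src/models/components/vio_simple_dense_net.py, and the other keys would be that class's constructor arguments.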
We use the pretrained image and IMU encoders of the Visual-Selective-VIO model. Download the model weights from its repository and put them under the pretrained_models directory.
We use the pretrained visual and inertial encoders from Visual_Selective_VIO to compute and save the latent vectors for the KITTI dataset.
cd data
python latent_caching.py
python latent_val_caching.py
These scripts will put the latent KITTI dataset under the data/kitti_latent_data folder.
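Before training, it can help to confirm that the preparation steps produced the folders named in this README. The following is a minimal sketch, not part of the repository: the function `missing_dirs` is a hypothetical helper, and it only checks that the directories exist, not that their contents are complete.

```python
from pathlib import Path

# Directory names are taken from this README; the check itself is illustrative.
EXPECTED_DIRS = (
    "data/kitti_data",         # written by data_prep.sh
    "data/kitti_latent_data",  # written by the latent caching scripts
    "pretrained_models",       # holds the downloaded encoder weights
)


def missing_dirs(repo_root="."):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


# Run from the repository root to list anything that still needs preparing.
for name in missing_dirs("."):
    print(f"missing: {name}")
```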
You need to use configs/data/kitti_vio.yaml for dataloading.
You need to use configs/model/vio.yaml to train models with the KITTI dataset.
You can change the net field in the config to target your own nn.Module object.
- See src/models/components/vio_simple_dense_net.py for an example nn.Module object consisting of multiple linear layers and ReLU activations.
After saving the latents for the KITTI dataset, you can run the following command to reproduce the experiments in the paper.
sh scripts/schedule_paper.sh
Train model with default configuration
# train on GPU
python src/train.py trainer=gpu
Train the model with a chosen experiment configuration from configs/experiment/
python src/train.py experiment=experiment_name.yaml trainer=gpu
You can override any parameter from the command line like this
python src/train.py trainer.max_epochs=20 data.batch_size=64
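To make the dotted `key=value` syntax concrete, the sketch below shows how such overrides map onto a nested configuration. It is a plain-Python illustration of the semantics only, not the configuration library's implementation. The function `apply_overrides` and the default values are hypothetical and chosen for this example.

```python
import ast


def apply_overrides(config, overrides):
    """Apply 'a.b.c=value' strings to a nested dict, creating levels as needed."""
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        try:
            # Interpret numbers, booleans, lists, and quoted strings as Python literals.
            value = ast.literal_eval(raw_value)
        except (ValueError, SyntaxError):
            # Anything else is kept as a plain string.
            value = raw_value
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return config


# Hypothetical defaults, for illustration only.
config = {"trainer": {"max_epochs": 10}, "data": {"batch_size": 32}}
apply_overrides(config, ["trainer.max_epochs=20", "data.batch_size=64"])
print(config)
```

Each dot descends one level in the configuration tree, so `trainer.max_epochs=20` changes only the `max_epochs` entry under `trainer` and leaves every other setting untouched.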
This project makes use of code from the following open-source projects:
- RPMG: License: CC BY-NC 4.0
- Visual Selective VIO
We are grateful to the authors and contributors of these projects for their valuable work.
If you find our work useful in your research, please consider citing:
@article{kurt2024vift,
title={Causal Transformer for Fusion and Pose Estimation in Deep Visual Inertial Odometry},
author={Kurt, Yunus Bilge and Akman, Ahmet and Alatan, Ayd{\i}n},
journal={arXiv preprint arXiv:2409.08769},
year={2024}
}