Rohit Lal, Saketh Bachu, Yash Garg, Arindam Dutta, Calvin-Khang Ta, Dripta S. Raychaudhuri, Hannah Dela Cruz, M. Salman Asif, Amit K. Roy-Chowdhury
Accurately estimating 3D human poses under severe occlusions is crucial for tasks like action recognition, gait analysis, and AR/VR. Current models struggle with heavy occlusions due to limited temporal context or prolonged occlusions across frames. To address this, we introduce STRIDE (Single-video TempoRally contInuous occlusion-robust 3D Pose Estimation), a novel Test-Time Training (TTT) approach that refines noisy initial pose estimates into accurate, temporally coherent predictions. STRIDE is model-agnostic and enhances robustness and temporal consistency using any off-the-shelf 3D pose estimator. Experiments on challenging datasets show STRIDE significantly outperforms single-image and video-based methods, especially under substantial occlusions.
To run only the demo, follow these steps:
- Step 1. Register on the SMPL-X website.
- Step 2. Register on the MANO website.
- Step 3. Register on the BEDLAM website.
- Step 4. After setting up the environment as described below, run the fetch_demo_data.sh script to fetch the demo data. The script needs the usernames and passwords created in the steps above.
Create a conda environment from environment.yml, activate it, and install the remaining requirements from requirements.txt:
conda env create -f environment.yml
conda activate stride
pip install -r requirements.txt
bash fetch_demo_data.sh
Download the files below. The checkpoint must end up at stride/checkpoint/latest_epoch.bin; utils.zip is extracted in the current directory and then deleted:
mkdir -p stride/checkpoint/
gdown --id 1k3UxjfzfDSs8ts1Fff_fEgcXIDaZP-Ik
mv latest_epoch.bin stride/checkpoint/latest_epoch.bin
gdown --id 1OmaBCC3oBjii9Eewdhdgeo8VTgV3plcN
unzip utils.zip
rm utils.zip
If a download above fails, download the files directly from the Google Drive link and place them in their respective folders.
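Before running the demo, it can help to confirm that the checkpoint actually landed where the steps above put it. The snippet below is a minimal sketch: `check_file` is a hypothetical helper, not part of the STRIDE repository, and it only tests that a path exists and is non-empty. It does not validate the file's contents.

```shell
# Hypothetical helper (not part of the STRIDE repository): report whether a
# required file exists and is non-empty.
check_file() {
    if [ -s "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or empty: $1" >&2
        return 1
    fi
}

# Usage, run from the repository root after the download steps:
#   check_file stride/checkpoint/latest_epoch.bin
```

A non-zero return status signals that the download or the `mv` step should be repeated, or that the file should be fetched manually.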
Run the demo on a sample video:
sh scripts/demo_stride.sh
If the demo fails because of conflicting shared libraries, running export LD_LIBRARY_PATH="" before the command above may solve the issue.
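The workaround above changes LD_LIBRARY_PATH for the rest of the shell session. As a scoped alternative, the sketch below clears the variable for a single command only. `run_with_clean_ld_path` is a hypothetical helper name, not something the repository provides, and whether an empty LD_LIBRARY_PATH resolves a given conflict depends on the local setup.

```shell
# Hypothetical helper (not part of the STRIDE repository): run one command
# with LD_LIBRARY_PATH set to the empty string, leaving the calling shell's
# own value untouched.
run_with_clean_ld_path() {
    env LD_LIBRARY_PATH= "$@"
}

# Usage, run from the repository root:
#   run_with_clean_ld_path sh scripts/demo_stride.sh
```

Because `env` applies the override only to the child process, other tools started from the same terminal still see the original library path.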
@misc{lal2024stridesinglevideobasedtemporally,
title={STRIDE: Single-video based Temporally Continuous Occlusion-Robust 3D Pose Estimation},
author={Rohit Lal and Saketh Bachu and Yash Garg and Arindam Dutta and Calvin-Khang Ta and Dripta S. Raychaudhuri and Hannah Dela Cruz and M. Salman Asif and Amit K. Roy-Chowdhury},
year={2024},
eprint={2312.16221},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2312.16221},
}