Zeqi Xiao1
Wenqi Ouyang1
Yifan Zhou1
Shuai Yang2
Lei Yang3
Jianlou Si3
Xingang Pan1
1S-Lab, Nanyang Technological University,
2Wangxuan Institute of Computer Technology, Peking University,
3SenseTime Research
Demo video: `demo.mp4`
- Create a Conda Environment
This codebase is tested with PyTorch 1.13.1 (CUDA 11.7).

```shell
conda create -n trajattn python=3.10
conda activate trajattn
pip install -r requirements.txt
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
```
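A quick way to confirm that the pinned CUDA build is the one active in the environment (a minimal sanity check, not part of the repository):

```python
# Sanity check: verify the pinned PyTorch/CUDA build is active.
import torch
import torchvision

print(torch.__version__)          # expected: 1.13.1+cu117
print(torchvision.__version__)    # expected: 0.14.1+cu117
print(torch.cuda.is_available())  # expected: True on a CUDA 11.7-capable machine
```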
- Download model weights from Hugging Face (see the sketch below for a programmatic option).
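One way to fetch the weights is with `huggingface_hub`. The `repo_id` below is a placeholder, since this README does not name the model repository; substitute the actual one:

```python
# Sketch: download the released weights into checkpoints/ via huggingface_hub.
# NOTE: repo_id is a PLACEHOLDER, not confirmed by this README.
from huggingface_hub import snapshot_download

snapshot_download(repo_id="<org>/<model-repo>", local_dir="checkpoints")
```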
- Clone Relevant Repositories and Download Checkpoints
```shell
# Clone the Depth-Anything-V2 repository
git clone https://github.com/DepthAnything/Depth-Anything-V2
# Download the Depth-Anything-V2-Large checkpoint into checkpoints/
wget -O checkpoints/depth_anything_v2_vitl.pth "https://huggingface.co/depth-anything/Depth-Anything-V2-Large/resolve/main/depth_anything_v2_vitl.pth?download=true"
```
Save the checkpoints to the `checkpoints/` directory. You can also modify the checkpoint path in the running scripts if needed.
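To verify that the depth checkpoint loads, here is a minimal sketch following the usage shown in the Depth-Anything-V2 README (its API, not this repository's; `example.jpg` is a placeholder path):

```python
# Sketch: load the ViT-L checkpoint with the Depth-Anything-V2 API and
# run it on one image. Constructor arguments follow the Depth-Anything-V2
# README for the 'vitl' encoder; 'example.jpg' is a placeholder image path.
import cv2
import torch
from depth_anything_v2.dpt import DepthAnythingV2

model = DepthAnythingV2(encoder='vitl', features=256,
                        out_channels=[256, 512, 1024, 1024])
model.load_state_dict(torch.load('checkpoints/depth_anything_v2_vitl.pth',
                                 map_location='cpu'))
model.eval()

depth = model.infer_image(cv2.imread('example.jpg'))  # H x W raw depth map
```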
To control camera motion on images, execute the following script:

```shell
sh image_control.sh
```
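For intuition about what the script controls: camera motion over a single image can be expressed as per-pixel point trajectories by back-projecting pixels with their estimated depth and re-projecting them under the moving camera. The toy sketch below illustrates this for a horizontal pan with a pinhole model; it is an illustrative assumption, not the repository's actual pipeline:

```python
# Toy sketch (NOT this repo's code): depth + camera translation -> pixel trajectories.
import numpy as np

def pan_trajectories(depth, fx, fy, cx, cy, dx_per_frame, num_frames):
    """Back-project pixels with depth, pan the camera along x, re-project.

    Returns pixel positions of shape (num_frames, H, W, 2).
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    # Back-project each pixel to a camera-space 3D point.
    X = (u - cx) / fx * depth
    Y = (v - cy) / fy * depth
    Z = depth
    traj = np.empty((num_frames, H, W, 2), dtype=np.float32)
    for t in range(num_frames):
        # Moving the camera by +dx shifts scene points by -dx in the camera frame.
        Xt = X - t * dx_per_frame
        traj[t, ..., 0] = fx * Xt / Z + cx  # new u
        traj[t, ..., 1] = fy * Y / Z + cy   # new v (unchanged for a pure x-pan)
    return traj

# Example: a flat scene at depth 2.0, panned over 14 frames.
traj = pan_trajectories(np.full((64, 64), 2.0), 60.0, 60.0, 32.0, 32.0, 0.05, 14)
```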
TODO list:

- Release models and weights;
- Release pipelines for single-image camera motion control;
- Release pipelines for video camera motion control;
- Release pipelines for video editing;
- Release the training pipeline.
If you find our work helpful, please cite:
```bibtex
@misc{xiao2024trajectoryattentionfinegrainedvideo,
  title={Trajectory Attention for Fine-grained Video Motion Control},
  author={Zeqi Xiao and Wenqi Ouyang and Yifan Zhou and Shuai Yang and Lei Yang and Jianlou Si and Xingang Pan},
  year={2024},
  eprint={2411.19324},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2411.19324},
}
```
Our work builds on the following projects:

- SVD: Our model is fine-tuned from SVD.
- MiraData: We use the data collected by MiraData.
- Depth-Anything-V2: We estimate depth maps with Depth-Anything-V2.
- Unimatch: We estimate optical flow with Unimatch.
- CoTracker: We estimate point trajectories with CoTracker.
- NVS_Solver: Our camera rendering code is based on NVS_Solver.