[ICLR 2024] Official implementation of the paper "Toss: High-quality text-guided novel view synthesis from a single image"

IDEA-Research/TOSS

TOSS: High-quality Text-guided Novel View Synthesis from a Single Image (ICLR2024)

Yukai Shi, Jianan Wang, He Cao, Boshi Tang, Xianbiao Qi, Tianyu Yang, Yukun Huang, Shilong Liu, Lei Zhang, Heung-Yeung Shum

Official implementation for TOSS: High-quality Text-guided Novel View Synthesis from a Single Image.

TOSS introduces text as high-level semantic information to constrain the novel view synthesis (NVS) solution space, yielding more controllable and more plausible results.

3d_generation_video.mp4

Install

Create environment

conda create -n toss python=3.9
conda activate toss

Install packages

pip install torch==2.0.0 torchvision==0.15.1 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
git clone https://github.com/openai/CLIP.git
pip install -e CLIP/
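
After running the commands above, a quick sanity check can confirm that the core packages are importable in the toss environment. The following is a minimal sketch, not part of the repository: the helper `missing_modules` is hypothetical, and the module names are assumed from the install commands (`clip` is the import name provided by the OpenAI CLIP repository).

```python
import importlib.util

# Import names assumed from the install commands above; "clip" is the module
# installed by the OpenAI CLIP repository. Packages from requirements.txt
# are not checked here.
REQUIRED_MODULES = ("torch", "torchvision", "torchaudio", "clip")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def main():
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
        return
    # Import torch only after confirming it is installed.
    import torch
    print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())


if __name__ == "__main__":
    main()
```

If CUDA is reported as unavailable, the cu118 wheels may not match the local driver; that would need to be resolved before running the demo on a GPU.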

Weights

Download the pretrained weights from this link and place them in the ./ckpt subdirectory.
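
Before launching the demo, it may help to confirm that the weights actually landed in ./ckpt. The sketch below is an illustration only: the helper `find_checkpoints` is hypothetical, and the file extensions are assumptions, since this README does not list the checkpoint file names.

```python
from pathlib import Path

# Assumed extensions; the actual checkpoint file names are not given in this README.
CHECKPOINT_SUFFIXES = {".ckpt", ".pt", ".pth", ".safetensors"}


def find_checkpoints(ckpt_dir="./ckpt"):
    """Return a sorted list of checkpoint-like files under ckpt_dir (empty if it is missing)."""
    root = Path(ckpt_dir)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix in CHECKPOINT_SUFFIXES
    )


if __name__ == "__main__":
    found = find_checkpoints()
    if found:
        for path in found:
            print("found:", path)
    else:
        print("No checkpoint files found under ./ckpt")
```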

Inference

We recommend the Gradio demo for interactive, visual inference. The demo has been tested on a single RTX 3090.

python app.py

Todo List

  • Release inference code.
  • Release pretrained models.
  • Upload 3D generation code.
  • Upload training code.

Acknowledgement

Citation

@article{shi2023toss,
  title={Toss: High-quality text-guided novel view synthesis from a single image},
  author={Shi, Yukai and Wang, Jianan and Cao, He and Tang, Boshi and Qi, Xianbiao and Yang, Tianyu and Huang, Yukun and Liu, Shilong and Zhang, Lei and Shum, Heung-Yeung},
  journal={arXiv preprint arXiv:2310.10644},
  year={2023}
}
