Modeling Image Composition for Complex Scene Generation

Official PyTorch implementation of TwFA.
Modeling Image Composition for Complex Scene Generation (CVPR2022)
Zuopeng Yang, Daqing Liu, Chaoyue Wang, Jie Yang, Dacheng Tao

arXiv | BibTeX

Overview

Overview of the proposed Transformer with Focal Attention (TwFA) framework.

Illustration of different attention mechanisms with their connectivity matrices.

Requirements

A suitable conda environment named twfa can be created and activated with:

conda env create -f environment.yaml
conda activate twfa
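
As a quick sanity check after activation, the following minimal sketch reports whether the active conda environment is named twfa and whether PyTorch can be found. It assumes conda has set the CONDA_DEFAULT_ENV variable (which it does on activation) and that environment.yaml installs torch; the function name environment_report is hypothetical and not part of this repository.

```python
import importlib.util
import os


def environment_report(env=None):
    """Summarize whether the active conda env is `twfa` and PyTorch is importable."""
    env = os.environ if env is None else env
    name = env.get("CONDA_DEFAULT_ENV")  # set by `conda activate`
    return {
        "conda_env": name,
        "is_twfa": name == "twfa",
        # find_spec returns None when the package is not installed.
        "torch_found": importlib.util.find_spec("torch") is not None,
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```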

Data Preparation

COCO

Create a symlink data/coco containing the images from the 2017 split in train2017 and val2017, and their annotations in annotations. Files can be obtained from the COCO webpage.
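
The layout described above can be checked before sampling or training. This is a minimal sketch that only looks for the three subdirectories named in this section; it makes no assumption about the file names inside them, and the helper name missing_coco_subdirs is hypothetical.

```python
from pathlib import Path

# Subdirectories named in the README; their contents are not inspected.
EXPECTED_COCO_SUBDIRS = ("train2017", "val2017", "annotations")


def missing_coco_subdirs(root="data/coco"):
    """Return the expected subdirectories that are absent under `root`."""
    root = Path(root)
    # Path.is_dir() follows symlinks, so a symlinked data/coco works too.
    return [name for name in EXPECTED_COCO_SUBDIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_coco_subdirs()
    if missing:
        print("Missing under data/coco:", ", ".join(missing))
    else:
        print("data/coco contains train2017, val2017 and annotations")
```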

VG

Create a symlink data/vg containing the images from Visual Genome. Files can be obtained from the VG webpage. Unzip the other annotations for VG into the directory data.

Sampling

COCO

Download the checkpoint (code: 5ipt) and place it in the directory pretrained/checkpoints. Then run the command:

python scripts/sample_coco.py --base configs/coco.yaml --save_path SAVE_DIR
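
If sampling is driven from another Python script, the command above can be assembled as an argument list and passed to subprocess. The sketch below uses only the script path and flags shown in this README; the helper names build_sample_command and run_sampling are hypothetical, and SAVE_DIR remains a placeholder to be supplied by the caller.

```python
import subprocess
import sys


def build_sample_command(save_path,
                         script="scripts/sample_coco.py",
                         config="configs/coco.yaml"):
    """Assemble the sampling command from the README as an argument list."""
    # sys.executable keeps the call inside the currently active environment.
    return [sys.executable, script, "--base", config, "--save_path", str(save_path)]


def run_sampling(save_path, **kwargs):
    """Run the sampling script; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_sample_command(save_path, **kwargs), check=True)


if __name__ == "__main__":
    # Only print the command here; call run_sampling(...) to execute it.
    print(" ".join(build_sample_command("SAVE_DIR")))
```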

VG

Download checkpoint1 (code: 1gzu) or checkpoint2 (code: t1qv) and place it in the directory pretrained/checkpoints. Then run the command:

python scripts/sample_vg.py --base configs/VG_CONFIG_FILE --save_path SAVE_DIR
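
VG_CONFIG_FILE is a placeholder, and this README does not say which config belongs to which checkpoint. One way to see the available candidates is to list the YAML files in configs. The sketch below assumes the configs live directly in that directory; the helper name list_config_files is hypothetical.

```python
from pathlib import Path


def list_config_files(config_dir="configs"):
    """Return the sorted names of YAML files directly inside `config_dir`."""
    config_dir = Path(config_dir)
    if not config_dir.is_dir():
        return []
    return sorted(
        p.name
        for p in config_dir.iterdir()
        if p.is_file() and p.suffix in {".yaml", ".yml"}
    )


if __name__ == "__main__":
    for name in list_config_files():
        print(name)
```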

Training models

COCO

python main.py --base configs/coco.yaml -t True --gpus 0,1,2,3,4,5,6,7,

VG

python main.py --base configs/vg.yaml -t True --gpus 0,1,2,3,4,5,6,7,
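
Both training commands end the --gpus list with a comma. This appears deliberate: in the PyTorch Lightning versions used by Taming-Transformers, a comma-separated string is read as a list of device indices, so "0," selects GPU 0 whereas "1" would request one GPU. That reading is an assumption about the pinned Lightning version, not something this README states. A minimal sketch for building the argument for a different set of GPUs, with hypothetical helper names:

```python
def format_gpus(indices):
    """Format GPU indices as a comma-terminated string, e.g. [0, 1] -> '0,1,'."""
    indices = list(indices)
    if not indices:
        raise ValueError("at least one GPU index is required")
    # The trailing comma mirrors the commands shown in the README.
    return ",".join(str(i) for i in indices) + ","


def build_train_command(config, gpu_indices):
    """Assemble the README training command for the given config and GPUs."""
    return ["python", "main.py", "--base", config, "-t", "True",
            "--gpus", format_gpus(gpu_indices)]


if __name__ == "__main__":
    # Example: train the COCO config on two GPUs instead of eight.
    print(" ".join(build_train_command("configs/coco.yaml", [0, 1])))
```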

Results

Comparison of different models.

Acknowledgement

Huge thanks to Taming-Transformers!

@misc{esser2020taming,
      title={Taming Transformers for High-Resolution Image Synthesis}, 
      author={Patrick Esser and Robin Rombach and Björn Ommer},
      year={2020},
      eprint={2012.09841},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

BibTeX

@inproceedings{yang2022modeling,
  title={Modeling image composition for complex scene generation},
  author={Yang, Zuopeng and Liu, Daqing and Wang, Chaoyue and Yang, Jie and Tao, Dacheng},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={7764--7773},
  year={2022}
}
