[AAAI 2024] FFT-Based Dynamic Token Mixer for Vision

Created by

This code is the official implementation of DFFormer and CDFFormer.

Caution

I recently found an error in the figures. I will correct the arXiv version in the near future; the published version, however, cannot be revised. In Figure 5 (Throughput vs. resolution) and Figure 6 (Peak memory vs. resolution), the correct color assignment is: pink is DFFormer, blue is CDFFormer, and purple is GFFormer.

Usage

Requirements

torch==1.12.1
torchvision==0.13.1
timm==0.5.4
Pillow
etc. (see requirements.txt)
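
A quick way to confirm that an environment matches the pins above is to compare the installed versions against them. The sketch below is an illustration, not part of the repository: `check_pins` and `installed_version` are hypothetical helper names, and only the three pinned packages listed above are checked.

```python
from importlib import metadata

# Version pins copied from the Requirements list above.
PINS = {"torch": "1.12.1", "torchvision": "0.13.1", "timm": "0.5.4"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return {package: (expected, found)} for every pin that is not satisfied."""
    mismatches = {}
    for package, expected in pins.items():
        found = lookup(package)
        # Drop a local build suffix such as "+cu113" before comparing.
        base = found.split("+")[0] if found else None
        if base != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for package, (expected, found) in check_pins(PINS).items():
        print(f"{package}: expected {expected}, found {found}")
```

An empty result from `check_pins(PINS)` means all three pins are satisfied; the remaining packages in requirements.txt are not covered by this check.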

Data preparation

The ImageNet dataset should be downloaded and extracted into the following directory structure.

imagenet/
├──train/
│  ├── n01440764
│  │   ├── n01440764_10026.JPEG
│  │   ├── n01440764_10027.JPEG
│  │   ├── ......
│  ├── ......
├──val/
│  ├── n01440764
│  │   ├── ILSVRC2012_val_00000293.JPEG
│  │   ├── ILSVRC2012_val_00002138.JPEG
│  │   ├── ......
│  ├── ......
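
The layout above can be checked before launching a long training run. The following is a minimal sketch, assuming only the structure shown (a `train/` and a `val/` split, each holding one sub-folder per class with `.JPEG` files inside); `summarize_imagenet` is a hypothetical helper, not something the repository provides.

```python
from pathlib import Path


def summarize_imagenet(root):
    """Count class folders and .JPEG images in train/ and val/ under root.

    Raises FileNotFoundError if a split directory is missing.
    """
    summary = {}
    for split in ("train", "val"):
        split_dir = Path(root) / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split directory: {split_dir}")
        # Each class is expected to be a sub-folder such as n01440764.
        class_dirs = [d for d in split_dir.iterdir() if d.is_dir()]
        n_images = sum(
            1
            for d in class_dirs
            for f in d.iterdir()
            if f.suffix.upper() == ".JPEG"
        )
        summary[split] = {"classes": len(class_dirs), "images": n_images}
    return summary
```

Calling `summarize_imagenet("/path/to/imagenet")` returns the class and image counts per split; for the full ImageNet-1k dataset, each split should report 1000 classes.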

Classification Training

Single-node

./distributed_train.sh 8 /path/to/imagenet --model dfformer_s18 -b 128 -j 8 --opt adamw --epochs 300 --sched cosine --native-amp --img-size 224 --drop-path 0.2 --lr 1e-3 --weight-decay 0.05 --aa rand-m9-mstd0.5-inc1 --smoothing 0.1 --mixup 0.8 --cutmix 1.0 --reprob 0.25 --warmup-lr 1e-6 --warmup-epochs 20 --experiment DFFormer --task-name dfformer_s18
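
To make the schedule flags concrete, the sketch below computes a per-epoch learning rate from `--lr 1e-3`, `--warmup-lr 1e-6`, `--warmup-epochs 20`, `--epochs 300` and `--sched cosine`. It is a simplified illustration of linear warmup followed by cosine decay, not timm's scheduler: the minimum learning rate (`min_lr`, set to 0 here) and the choice to run the cosine over the post-warmup epochs only are assumptions, so exact values may differ from what the training logs show. It also derives the global batch size, assuming that `-b 128` is the per-GPU batch size and that the first argument to `distributed_train.sh` is the number of GPUs.

```python
import math

# Values taken from the single-node command above.
BASE_LR = 1e-3
WARMUP_LR = 1e-6
WARMUP_EPOCHS = 20
EPOCHS = 300
NUM_GPUS = 8
BATCH_PER_GPU = 128  # assumption: -b is the per-GPU batch size


def lr_at_epoch(epoch, min_lr=0.0):
    """Linear warmup from WARMUP_LR to BASE_LR, then cosine decay to min_lr."""
    if epoch < WARMUP_EPOCHS:
        step = (BASE_LR - WARMUP_LR) / WARMUP_EPOCHS
        return WARMUP_LR + epoch * step
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (epoch - WARMUP_EPOCHS) / (EPOCHS - WARMUP_EPOCHS)
    return min_lr + 0.5 * (BASE_LR - min_lr) * (1.0 + math.cos(math.pi * progress))


# 8 * 128 = 1024 under the per-GPU assumption stated above.
global_batch = NUM_GPUS * BATCH_PER_GPU
```

Under these assumptions the learning rate starts at 1e-6, reaches its peak of 1e-3 at epoch 20, and decays toward `min_lr` by epoch 300.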

Multi-node

For multi-node training, use mpirun and adapt the launch command to your environment.
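
When processes are launched with Open MPI's `mpirun`, each one receives its rank information through `OMPI_COMM_WORLD_*` environment variables, whereas PyTorch's `env://` initialization reads `RANK` and `WORLD_SIZE`, and training scripts commonly read `LOCAL_RANK`. The sketch below shows one possible mapping between the two. Whether the training script in this repository needs such a step is an assumption; `torch_env_from_openmpi` is a hypothetical helper, and `MASTER_ADDR` and `MASTER_PORT` would still have to be set for your cluster.

```python
import os


def torch_env_from_openmpi(environ):
    """Translate Open MPI rank variables into the names PyTorch scripts usually read.

    Returns an empty dict when the process was not started by Open MPI.
    """
    mapping = {
        "OMPI_COMM_WORLD_RANK": "RANK",
        "OMPI_COMM_WORLD_SIZE": "WORLD_SIZE",
        "OMPI_COMM_WORLD_LOCAL_RANK": "LOCAL_RANK",
    }
    if not all(name in environ for name in mapping):
        return {}
    return {torch_name: environ[ompi_name] for ompi_name, torch_name in mapping.items()}


if __name__ == "__main__":
    # Export the translated values before the process group is initialized.
    os.environ.update(torch_env_from_openmpi(os.environ))
```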

Segmentation Training

Pre-trained weights are required and must be set up in advance.

bash segmentation/tools/dist_train.sh \
    segmentation/configs/fpn/dfformer_s18_fpn.py 8 \
    --work-dir work_dir/dfformer_s18_fpn --seed 42 --deterministic

Object Detection Training

Pre-trained weights are required and must be set up in advance.

FORK_LAST3=1 bash detection/tools/dist_train.sh \
    detection/configs/retinanet/dfformer_m36_retinanet.py 8 \
    --work-dir work_dir/dfformer_m36_retinanet --seed 42 --deterministic
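
Because both downstream tasks depend on pre-trained weights, it can help to fail early when the checkpoint file is missing. The sketch below is one way to do that under stated assumptions: `build_dist_train` is a hypothetical helper, the checkpoint path is supplied by the caller, and the function only verifies that the file exists. It does not point the config at the weights, which remains a separate step.

```python
import os
from pathlib import Path


def build_dist_train(task, config, work_dir, checkpoint, gpus=8, seed=42):
    """Return (argv, env) for a dist_train.sh launch, after checking the weights exist.

    `checkpoint` is only checked for existence here; pointing the config at it
    is a separate step that this sketch does not perform.
    """
    if task not in ("segmentation", "detection"):
        raise ValueError(f"unknown task: {task}")
    if not Path(checkpoint).is_file():
        raise FileNotFoundError(f"pre-trained weights not found: {checkpoint}")
    # Same arguments as the commands shown in the two training sections.
    argv = [
        "bash", f"{task}/tools/dist_train.sh", config, str(gpus),
        "--work-dir", work_dir, "--seed", str(seed), "--deterministic",
    ]
    env = dict(os.environ)
    if task == "detection":
        env["FORK_LAST3"] = "1"  # set in the detection command above
    return argv, env
```

The returned pair can be passed to `subprocess.run(argv, env=env, check=True)` from the repository root.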

Acknowledgment

Our implementation is based on MetaFormer Baselines for Vision, pytorch-image-models, mmsegmentation, and mmdetection.

