This repo is the official implementation of "DeMT" as well as its follow-ups. It currently includes code and models for multi-task learning of dense prediction tasks (semantic segmentation, depth estimation, surface normal estimation, boundary detection, human part segmentation, and saliency detection).
News

- 02/10/2023: We will release the code of DeMT at the end of February.
- 02/10/2023: Merged code.
- 02/10/2023: Released a series of models. Please look into the data scaling paper for more details.
- 02/07/2023: The Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI 2023) will be held in Washington, DC, USA, from February 7 to 14, 2023.
- 02/01/2023: DeMT was accepted by AAAI 2023.
DeMT (the name stands for Deformable Mixer Transformer for Multi-Task Learning of Dense Prediction) is initially described in the arXiv paper. It is based on a simple and effective encoder-decoder architecture, i.e., a deformable mixer encoder and a task-aware transformer decoder. First, the deformable mixer encoder contains two types of operators: a channel-aware mixing operator, leveraged to allow communication among different channels (i.e., efficient channel location mixing), and a spatial-aware deformable operator with deformable convolution, applied to efficiently sample more informative spatial locations (i.e., deformed features). Second, the task-aware transformer decoder consists of a task interaction block and a task query block. The former is applied to capture task-interaction features via self-attention. The latter leverages the deformed features and the task-interacted features to generate the corresponding task-specific features through a query-based transformer for the corresponding task predictions.
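To make the two building blocks above concrete, here is a minimal, self-contained PyTorch sketch of a deformable mixer block and a task-aware decoder. It is an illustration of the idea only, not the repository's implementation: the module names (`DeformableMixer`, `TaskAwareDecoder`), the use of `torchvision.ops.DeformConv2d`, the application of self-attention directly to spatial tokens, and all hyperparameters are our own simplifications; see ./src for the actual code.

```python
# Illustrative sketch only (not the repo's API): a deformable mixer block and a
# task-aware transformer decoder, reduced to their core operations.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


class DeformableMixer(nn.Module):
    """Deformable mixer encoder block: channel-aware mixing + spatial-aware deformable conv."""

    def __init__(self, dim, kernel_size=3):
        super().__init__()
        # Channel-aware mixing: a 1x1 convolution mixes information across channels.
        self.channel_mix = nn.Conv2d(dim, dim, kernel_size=1)
        # Spatial-aware deformable operator: offsets are predicted from the features,
        # then a deformable convolution samples the deformed (more informative) locations.
        self.offset = nn.Conv2d(dim, 2 * kernel_size * kernel_size, kernel_size,
                                padding=kernel_size // 2)
        self.deform = DeformConv2d(dim, dim, kernel_size, padding=kernel_size // 2)
        self.norm = nn.BatchNorm2d(dim)
        self.act = nn.GELU()

    def forward(self, x):
        x = self.act(self.channel_mix(x))
        offsets = self.offset(x)
        return self.act(self.norm(self.deform(x, offsets)))  # deformed features


class TaskAwareDecoder(nn.Module):
    """Task-aware transformer decoder: task interaction (self-attention) + task query (cross-attention)."""

    def __init__(self, dim, num_tasks, num_heads=4):
        super().__init__()
        self.task_queries = nn.Parameter(torch.randn(num_tasks, dim))
        self.interact = nn.MultiheadAttention(dim, num_heads)    # task interaction block
        self.query_attn = nn.MultiheadAttention(dim, num_heads)  # task query block

    def forward(self, deformed_feats):
        b, c, h, w = deformed_feats.shape
        tokens = deformed_feats.flatten(2).permute(2, 0, 1)          # (H*W, B, C), sequence-first
        interacted, _ = self.interact(tokens, tokens, tokens)        # self-attention over tokens
        queries = self.task_queries.unsqueeze(1).expand(-1, b, -1)   # (num_tasks, B, C)
        task_feats, _ = self.query_attn(queries, interacted, interacted)  # query-based cross-attention
        return task_feats.permute(1, 0, 2)                           # (B, num_tasks, C): one feature per task


# Toy usage: a 64-channel feature map decoded into two task-specific features.
feats = torch.randn(2, 64, 32, 32)
encoder = DeformableMixer(64)
decoder = TaskAwareDecoder(64, num_tasks=2)
print(decoder(encoder(feats)).shape)  # torch.Size([2, 2, 64])
```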
DeMT achieves strong performance on PASCAL-Context (75.33 mIoU for semantic segmentation and 63.11 mIoU for human part segmentation on test) and on NYUD-v2 semantic segmentation (54.34 mIoU on test), surpassing previous models by a large margin.
DeMT on NYUD-v2 dataset
model | backbone | #params | FLOPs | SemSeg (mIoU) ↑ | Depth (RMSE) ↓ | Normal (mErr) ↓ | Boundary (odsF) ↑ | model checkpoint | log |
---|---|---|---|---|---|---|---|---|---|
DeMT | HRNet-18 | 4.76M | 22.07G | 39.18 | 0.5922 | 20.21 | 76.4 | Google Drive | log |
DeMT | Swin-T | 32.07M | 100.70G | 46.36 | 0.5871 | 20.60 | 76.9 | Google Drive | log |
DeMT(xd=2) | Swin-T | 36.6M | - | 47.45 | 0.5563 | 19.90 | 77.0 | Google Drive | log |
DeMT | Swin-S | 53.03M | 121.05G | 51.50 | 0.5474 | 20.02 | 78.1 | Google Drive | log |
DeMT | Swin-B | 90.9M | 153.65G | 54.34 | 0.5209 | 19.21 | 78.5 | Google Drive | log |
DeMT | Swin-L | 201.64M | - | 56.94 | 0.5007 | 19.14 | 78.8 | Google Drive | log |
DeMT on PASCAL-Context dataset
model | backbone | SemSeg (mIoU) ↑ | PartSeg (mIoU) ↑ | Sal (maxF) ↑ | Normal (mErr) ↓ | Boundary (odsF) ↑ |
---|---|---|---|---|---|---|
DeMT | HRNet-18 | 59.23 | 57.93 | 83.93 | 14.02 | 69.80 |
DeMT | Swin-T | 69.71 | 57.18 | 82.63 | 14.56 | 71.20 |
DeMT | Swin-S | 72.01 | 58.96 | 83.20 | 14.57 | 72.10 |
DeMT | Swin-B | 75.33 | 63.11 | 83.42 | 14.54 | 73.20 |
Citation

@inproceedings{xyy2023DeMT,
  title={DeMT: Deformable Mixer Transformer for Multi-Task Learning of Dense Prediction},
  author={Xu, Yangyang and Yang, Yibo and Zhang, Lefei},
  booktitle={Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI)},
  year={2023}
}
Install
conda install pytorch==1.7.0 torchvision==0.8.1 cudatoolkit=10.1 -c pytorch
conda install pytorch-lightning==1.1.8 -c conda-forge
conda install opencv==4.4.0 -c conda-forge
conda install scikit-image==0.17.2
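After installing, a quick sanity check of the environment can help catch version mismatches. This is a hypothetical check script, not part of the repo:

```python
# Hypothetical sanity check of the pinned environment (not part of the repo).
import cv2
import pytorch_lightning as pl
import skimage
import torch
import torchvision

print("torch:", torch.__version__, "| torchvision:", torchvision.__version__)
print("pytorch-lightning:", pl.__version__, "| opencv:", cv2.__version__, "| scikit-image:", skimage.__version__)
print("CUDA available:", torch.cuda.is_available())
```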
Data Preparation
wget https://data.vision.ee.ethz.ch/brdavid/atrc/NYUDv2.tar.gz
wget https://data.vision.ee.ethz.ch/brdavid/atrc/PASCALContext.tar.gz
tar xfvz ./NYUDv2.tar.gz
tar xfvz ./PASCALContext.tar.gz
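The training command below expects --datamodule.data_dir to point at the directory holding the extracted data. A small hypothetical check, assuming the archives unpack into NYUDv2/ and PASCALContext/ folders (matching the tarball names):

```python
# Hypothetical check that the extracted datasets sit under the directory you will
# pass as $DATA_DIR / --datamodule.data_dir. The folder names NYUDv2 and
# PASCALContext are assumed from the tarball names.
from pathlib import Path

data_dir = Path(".")
for name in ("NYUDv2", "PASCALContext"):
    status = "found" if (data_dir / name).is_dir() else "missing"
    print(f"{name}: {status} under {data_dir.resolve()}")
```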
Train
To train the DeMT model, with $DATA_DIR pointing to the directory containing the extracted datasets:
python ./src/main.py --cfg ./config/t-nyud/swin/siwn_t_DeMT.yaml --datamodule.data_dir $DATA_DIR --trainer.gpus 8
Evaluation
- When training is finished, the boundary predictions are saved in the following directory: ./logger/NYUD_xxx/version_x/edge_preds/.
- The evaluation of boundary detection uses the MATLAB-based SEISM repository to obtain the optimal-dataset-scale F-measure (odsF) scores.
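As a convenience before running SEISM, a hypothetical snippet to gather the saved boundary predictions (the exact run directory and file format depend on your experiment; PNG is assumed here):

```python
# Hypothetical helper: count the saved boundary predictions before handing them to SEISM.
# Replace "NYUD_xxx" / "version_x" with your actual run directory.
from pathlib import Path

pred_dir = Path("./logger/NYUD_xxx/version_x/edge_preds")
preds = sorted(pred_dir.glob("*.png"))  # assumes predictions are written as PNG images
print(f"{len(preds)} boundary prediction files ready for SEISM (odsF) evaluation")
```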