MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos

Minghan LI, Shuai LI, Wangmeng XIANG, Lei ZHANG

[arXiv]

Updates

March 31, 2023: Trained models are released.
March 28, 2023: Code and paper are now available!

Installation

See installation instructions.

Getting Started

We provide a script, train_net.py, that trains all the configs provided in MDQE.

Before training: to train a model with train_net.py on VIS, first set up the corresponding datasets following Preparing Datasets for MDQE.

Then download the pretrained weights from the Model Zoo into 'pretrained/coco/*.pth' and run:

python train_net.py --num-gpus 8 \
  --config-file configs/R50_ovis_360.yaml
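
Because the training command above expects the pretrained weights under 'pretrained/coco/', a quick pre-flight check can catch a missing download before a multi-GPU job starts. The following is only a sketch: the helper name count_weights and the WEIGHTS_DIR variable are hypothetical and not part of the MDQE repository.

```shell
# Hypothetical helper (not part of MDQE): count the .pth files in a directory.
count_weights() {
  dir="$1"
  n=0
  for f in "$dir"/*.pth; do
    # If nothing matches, the glob stays literal, so test that the file exists.
    if [ -e "$f" ]; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}

# Assumed default location, taken from the path mentioned in this README.
WEIGHTS_DIR="${WEIGHTS_DIR:-pretrained/coco}"

if [ "$(count_weights "$WEIGHTS_DIR")" -eq 0 ]; then
  echo "No .pth files in $WEIGHTS_DIR; download them from the Model Zoo first." >&2
else
  echo "Pretrained weights found in $WEIGHTS_DIR."
fi
```

Running this from the repository root before launching train_net.py reports whether any weight files are present; it does not verify that they are the correct ones for the chosen config.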

To evaluate a model's performance, use:

python train_net.py \
  --config-file configs/R50_ovis_360.yaml \
  --eval-only \
  MODEL.WEIGHTS /path/to/checkpoint_file

Model Zoo

Pretrained weights on COCO

Name	R50	Swin-L
MDQE	model, config	model, config

OVIS

Name	Backbone	Frames	AP	Download
MDQE	R50	f4+360p	30.7	model, config
MDQE	R50	f4+640p	32.3	model, config
MDQE	Swin-L	f2+480p	41.0	model, config
MDQE	Swin-L	f2+640p	42.6	model, config

YouTubeVIS-2021

Name	Backbone	Frames	AP	Download
MDQE	R50	f4+360p	46.6	model, config
MDQE	Swin-L	f3+360p	55.5	model, config

YouTubeVIS-2019

Name	Backbone	Frames	AP	Download
MDQE	R50	f4+360p	47.8	model, config
MDQE	Swin-L	f3+360p	59.9	model, config

License

The majority of MDQE is licensed under the Apache-2.0 License. However, portions of the project are available under separate license terms: Detectron2 (Apache-2.0 License), IFC (Apache-2.0 License), VITA (Apache-2.0 License), and Deformable-DETR (Apache-2.0 License).

Citing MDQE

If you use MDQE in your research or wish to refer to the baseline results published in the Model Zoo, please use the following BibTeX entry.

@misc{li2023mdqe,
    title={MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos},
    author={Minghan Li and Shuai Li and Wangmeng Xiang and Lei Zhang},
    year={2023},
    eprint={2303.14395},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

Acknowledgement

Our code is largely based on Detectron2, IFC, Deformable DETR and VITA. We are truly grateful for their excellent work.

