We provide instructions to reproduce the class-agnostic object detection results of MDef-DETR with and without the language branch. Please refer to Tables 1, 2, 4, and 5 of our paper for more details.
Download the datasets (annotations & images) and arrange them as follows:
code_root/
└─ data
    ├─ voc2007
    │   ├─ Annotations
    │   └─ JPEGImages
    ├─ coco
    │   ├─ instances_val2017.json
    │   └─ val2017
    ├─ kitti
    │   ├─ Annotations
    │   └─ JPEGImages
    ├─ kitchen
    │   ├─ Annotations
    │   └─ JPEGImages
    ├─ clipart
    │   ├─ Annotations
    │   └─ JPEGImages
    ├─ comic
    │   ├─ Annotations
    │   └─ JPEGImages
    ├─ watercolor
    │   ├─ Annotations
    │   └─ JPEGImages
    └─ dota
        ├─ Annotations
        └─ JPEGImages
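Before running the evaluation, it can help to confirm that each dataset folder contains the entries listed above. The following is a minimal sketch, not part of the repository: the helper name `find_missing` is hypothetical, and it only inspects dataset folders that already exist under the data directory, so it will not report a dataset folder that is absent altogether.

```python
from pathlib import Path

# Entries expected inside each dataset folder, taken from the layout above.
COCO_ENTRIES = ["instances_val2017.json", "val2017"]
DEFAULT_ENTRIES = ["Annotations", "JPEGImages"]


def find_missing(data_root):
    """Return 'dataset/entry' strings for expected entries that are absent.

    Hypothetical helper: checks every sub-folder found under data_root.
    The 'coco' folder is checked against COCO_ENTRIES, all others against
    DEFAULT_ENTRIES.
    """
    root = Path(data_root)
    missing = []
    for dataset_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        expected = COCO_ENTRIES if dataset_dir.name == "coco" else DEFAULT_ENTRIES
        for entry in expected:
            if not (dataset_dir / entry).exists():
                missing.append(f"{dataset_dir.name}/{entry}")
    return missing
```

Calling `find_missing("data")` from `code_root/` returns an empty list when every present dataset folder is complete.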
Once the above directory structure is created:
- Download the pretrained weights from this link.
- Set the environment variable:
      export PYTHONPATH="./:$PYTHONPATH"
- Run the corresponding script to generate predictions and calculate metrics:
  - MDef-DETR
        bash scripts/get_mvit_multi_query_metrics.sh <dataset root dir path> <model checkpoints path>
  - MDef-DETR w/o Language Branch (trained by maintaining the structure introduced by captions)
        bash scripts/get_mvit_minus_language_metrics.sh <dataset root dir path> <model checkpoints path>
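If you prefer to launch the evaluation from Python, the steps above could be wrapped roughly as follows. This is only a sketch under assumptions: the variant keys `with_language` and `without_language` and both function names are hypothetical, the script paths are the ones listed above, and the code must be run from the repository root so that the relative paths resolve.

```python
import os
import subprocess

# Script paths as given in the instructions; the dictionary keys are
# hypothetical labels chosen for this sketch.
SCRIPTS = {
    "with_language": "scripts/get_mvit_multi_query_metrics.sh",
    "without_language": "scripts/get_mvit_minus_language_metrics.sh",
}


def build_command(variant, dataset_root, checkpoints_path):
    """Build the bash command line for one model variant."""
    return ["bash", SCRIPTS[variant], dataset_root, checkpoints_path]


def run_evaluation(variant, dataset_root, checkpoints_path):
    """Run one evaluation script with './' prepended to PYTHONPATH."""
    env = dict(os.environ)
    # Mirrors: export PYTHONPATH="./:$PYTHONPATH"
    env["PYTHONPATH"] = "./" + os.pathsep + env.get("PYTHONPATH", "")
    # check=True raises CalledProcessError if the script exits non-zero.
    subprocess.run(
        build_command(variant, dataset_root, checkpoints_path),
        env=env,
        check=True,
    )
```

For example, `run_evaluation("with_language", "data", "<model checkpoints path>")` would invoke the MDef-DETR script on the `data` directory.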
The calculated evaluation metrics will be stored in a *.csv file in the same directory.
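To inspect the results programmatically, the generated files can be read with Python's standard `csv` module. The sketch below makes two assumptions that the instructions do not state: that the first row of each file is a header, and that you pass the directory where the files were written. The helper name `load_metrics` is hypothetical.

```python
import csv
from pathlib import Path


def load_metrics(results_dir="."):
    """Return {file name: list of row dicts} for every *.csv in results_dir.

    Assumes the first row of each file holds the column names; values are
    returned as strings, exactly as stored in the file.
    """
    tables = {}
    for csv_path in sorted(Path(results_dir).glob("*.csv")):
        with csv_path.open(newline="") as handle:
            tables[csv_path.name] = list(csv.DictReader(handle))
    return tables
```

Iterating over `load_metrics(".").items()` then gives each file name together with its rows.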