Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector (CVPR'2020)
Conventional methods for object detection typically require a substantial amount of training data and preparing such high-quality training data is very labor-intensive. In this paper, we propose a novel few-shot object detection network that aims at detecting objects of unseen categories with only a few annotated examples. Central to our method are our Attention-RPN, Multi-Relation Detector and Contrastive Training strategy, which exploit the similarity between the few shot support set and query set to detect novel objects while suppressing false detection in the background. To train our network, we contribute a new dataset that contains 1000 categories of various objects with high-quality annotations. To the best of our knowledge, this is one of the first datasets specifically designed for few-shot object detection. Once our few-shot network is trained, it can detect objects of unseen categories without further training or finetuning. Our method is general and has a wide range of potential applications. We produce a new state-of-the-art performance on different datasets in the few-shot setting. The dataset link is https://github.com/fanq15/Few-Shot-Object-Detection-Dataset.
@inproceedings{fan2020fsod,
title={Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector},
author={Fan, Qi and Zhuo, Wei and Tang, Chi-Keung and Tai, Yu-Wing},
booktitle={CVPR},
year={2020}
}
Note: ALL the reported results use the data split released by the TFA official repo. Currently, each setting is evaluated with only one fixed few shot dataset. Please refer to Data Preparation for more details about the dataset and data preparation.
Following the original implementation, training consists of 2 steps:

- Step 1: Base training
  - Use all the images and annotations of the base classes to train a base model.
- Step 2: Few shot fine-tuning
  - Use the base model from step 1 as initialization and further fine-tune the model with the few shot datasets.
```shell
# step 1: base training for VOC split1
bash ./tools/detection/dist_train.sh \
    configs/detection/attention_rpn/voc/split1/attention-rpn_r50_c4_voc-split1_base-training.py 8

# step 2: few shot fine-tuning
bash ./tools/detection/dist_train.sh \
    configs/detection/attention_rpn/voc/split1/attention-rpn_r50_c4_voc-split1_1shot-fine-tuning.py 8
```
Note:
- The default output path of the base model in step 1 is `work_dirs/{BASE TRAINING CONFIG}/latest.pth`. If the model is saved to a different path, please update the `load_from` argument in the step 2 few shot fine-tuning configs instead of using `resume_from`.
- To use a pre-trained checkpoint, please set `load_from` to the downloaded checkpoint path.
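To illustrate the difference between the two fields, a step 2 fine-tuning config would carry lines like the following sketch (the checkpoint path shown is illustrative, not a file shipped with the repo):

```python
# Sketch of the checkpoint-loading fields in a step 2 few shot fine-tuning
# config. The path below is illustrative; point it at your own base model.

# load_from initializes the weights from the step 1 base model, while the
# optimizer state and iteration count start fresh for fine-tuning.
load_from = 'work_dirs/attention-rpn_r50_c4_voc-split1_base-training/latest.pth'

# resume_from would instead continue an interrupted run, restoring the
# optimizer state and schedule, which is not what fine-tuning needs.
resume_from = None
```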
Note:
- The paper does not conduct experiments on the VOC dataset; therefore, we use the VOC setting of TFA to evaluate the method.
- Some implementation details should be noted:
  - The training batch size is 8x2 for all VOC experiments and 4x2 for all COCO experiments (following the official repo).
  - Only the RoI head is trained during few shot fine-tuning for the VOC experiments.
  - The number of iterations and the training strategy for the VOC experiments may not be optimal.
- The performance of base training and the few shot setting can be unstable, even when using the same random seed. To reproduce the reported few shot results, it is highly recommended to use the released model for few shot fine-tuning.
- Difficult samples are not used in base training or in the few shot setting.
Arch | Split | Base AP50 | ckpt | log |
---|---|---|---|---|
r50 c4 | 1 | 71.9 | ckpt | log |
r50 c4 | 2 | 73.5 | ckpt | log |
r50 c4 | 3 | 73.4 | ckpt | log |
Arch | Split | Shot | Novel AP50 | ckpt | log |
---|---|---|---|---|---|
r50 c4 | 1 | 1 | 35.0 | ckpt | log |
r50 c4 | 1 | 2 | 36.0 | ckpt | log |
r50 c4 | 1 | 3 | 39.1 | ckpt | log |
r50 c4 | 1 | 5 | 51.7 | ckpt | log |
r50 c4 | 1 | 10 | 55.7 | ckpt | log |
r50 c4 | 2 | 1 | 20.8 | ckpt | log |
r50 c4 | 2 | 2 | 23.4 | ckpt | log |
r50 c4 | 2 | 3 | 35.9 | ckpt | log |
r50 c4 | 2 | 5 | 37.0 | ckpt | log |
r50 c4 | 2 | 10 | 43.3 | ckpt | log |
r50 c4 | 3 | 1 | 31.9 | ckpt | log |
r50 c4 | 3 | 2 | 30.8 | ckpt | log |
r50 c4 | 3 | 3 | 38.2 | ckpt | log |
r50 c4 | 3 | 5 | 48.9 | ckpt | log |
r50 c4 | 3 | 10 | 51.6 | ckpt | log |
Note:
- Following the original implementation, the training batch size is 4x2 for all COCO experiments.
- The official implementation uses a different COCO data split from TFA, and we report the results of both settings. To reproduce the results under the official data split (coco 17), please refer to Data Preparation for more details about data preparation.
- The performance of base training and the few shot setting can be unstable, even when using the same random seed. To reproduce the reported few shot results, it is highly recommended to use the released model for few shot fine-tuning.
Arch | data source | Base mAP | ckpt | log |
---|---|---|---|---|
r50 c4 | TFA | 23.6 | ckpt | log |
r50 c4 | official repo | 24.0 | ckpt | log |
Arch | data source | Shot | Novel mAP | ckpt | log |
---|---|---|---|---|---|
r50 c4 | TFA | 10 | 9.2 | ckpt | log |
r50 c4 | TFA | 30 | 14.8 | ckpt | log |
r50 c4 | official repo | 10 | 11.6 | ckpt | log |