This repository holds a small framework to evaluate unsupervised domain adaptation methods in combination with a CenterNet object detection network. But it also allows it to train baselines without UDA methods. There also support for rotated bounding boxes as you can see in the Fig. 1 below.
Fig 1.: Detection example of axis aligned and rotated bounding boxes. The first column is the detection result and the last column the ground truth.
- ADVENT
- Direct Entropy minimization
- Minimizing entropy with adversarial learning
- Domain Adaptation for Semantic Segmentation with Maximum Squares Loss
- Max Squares minimization
- Image-wise Class-balanced Weighting Factor
- Multi-level Self-produced guidance
- FDA
Install all required modules:
pip install -r requirements.txt
If you plan to run the DLA network or MobileNet v2 with Deformable Convolutional Networks V2 you have to compile the DCNv2 lib.
cd libs/DCNv2 && ./make.sh
This implementation uses hydra as configuration framework. Mainly you can check all available configuration options with:
python train.py --help
To run training you can execute the following command which loads the default experiment (configs/defaults.yaml)
python train.py
A custom experiment will load the default parameters, which you can easily overwrite by placing another yaml configuration file into the folder configs/experiment/
. E.g. the experiment entropy_minimization
can be executed with:
python train.py experiment=entropy_minimization
This file overwrites the attribute experiment
and the uda
property.
experiment: default # experiment name and also folder name (outputs/default) where logs a.s.o. are saved
# path to pretrained weights
# optimizer states are not restored
pretrained: /mnt/data/Projects/centernet-uda/weights/coco_dla_2x.pth
# path to last trained model
# will restore optimizer states
# if pretraned path is given, resume path is used
resume:
# defines the training model
model:
# what kind of backend will be used
# valid values are all available models in backends folder
# currenlty only dla (Deep Layer Aggregation) is implemented
backend:
name: dla
params: # valid params are listed in e.g. backends.dla.build
num_layers: 34
num_classes: 6
# defines the loss function for centernet (should be as it is)
# weights can be overwritten if necessary
loss:
name: centernet.DetectionLoss
params:
hm_weight: 1.0
off_weight: 1.0
wh_weight: 0.1
angle_weight: 1.0 # weight for angle term in loss (only for rotated boxes)
periodic: False # if true RAPiD periodic loss will be used for angle
# used unsupervised domain adaptation method for the experiment
# in this case none UDA method is used, only the centernet is trained
uda:
datasets:
# paraemters for training dataset
training:
# all available readers are listed in datasets/
name: datasets.coco
# valid parameters are all paramters in __init__
params:
image_folder: /mnt/data/datasets/theodore_plus/images/
annotation_file: /mnt/data/datasets/theodore_plus/coco/annotations/instances.json
target_domain_glob:
- /mnt/data/datasets/DST/2020-02-14-14h30m47s/*.jpg
- /mnt/data/datasets/CEPDOF/**/*.jpg
# data augmentation is implemented via imgaug
# valid augmentors and parameters are listed here:
# https://imgaug.readthedocs.io/en/latest/source/overview_of_augmenters.html
augmentation:
- GammaContrast:
gamma: [0.2, 1.5]
- Affine:
translate_percent: [-0.1, 0.1]
scale: [0.8, 1.3]
rotate: [-45, 45]
- AdditiveGaussianNoise:
scale: [0, 10]
- Fliplr:
p: 0.5
- Flipud:
p: 0.5
# paraemters for validation dataset
validation:
name: datasets.coco
params:
image_folder: /mnt/data/datasets/FES/JPEGImages/
annotation_file: /mnt/data/datasets/FES/coco/annotations/instances_training.json
# paraemters for test dataset
test:
name: datasets.coco
params:
image_folder: /mnt/data/datasets/CEPDOF/coco/images/
annotation_file: /mnt/data/datasets/CEPDOF/coco/annotations/instances_test.json
# parameters to normalize an image, additional to pixel / 255 normalization
normalize:
mean: [0.40789654, 0.44719302, 0.47026115]
std: [0.28863828, 0.27408164, 0.27809835]
# all available optimizer from https://pytorch.org/docs/stable/optim.html
# are valid values, e.g. using sgd instead of adam, name should be changed
# to SGD https://pytorch.org/docs/stable/optim.html#torch.optim.SGD
# valid params are all listed params for an optimizer
# e.g. nesterov: True will be valid for SGD but not for Adam
optimizer:
name: Adam # optimizer
params:
lr: 0.0001 # learning rate
scheduler: # None or one of torch.optim.lr_scheduler
name: MultiStepLR
params:
milestones: [20, 40]
gamma: 0.1
# which evaluation framework should be used
# currently only mscoco is implemented but can be extended to pascal voc
evaluation:
coco:
per_class: True # if true for each class a mAP, Precision and Recall will be logged
tensorboard:
num_visualizations: 50 # number of images with detections to show in tensorobard
score_threshold: 0.3 # threshold which should be reached to be a valid bounding box
max_detections: 150 # maximum number of detections per image
epochs: 100 # number of epochs to train
batch_size: 16 # batch size
num_workers: 4 # number of parallel workers are used for the data loader
seed: 42 # random seed
gpu: 0 # gpu id to use for training or list of gpus for multi gpu training
eval_at_n_epoch: 1 # every N epoch the validation will be executed (epoch % N == 0)
# how to identify a "best" model, what metric describes it
save_best_metric:
name: validation/total_loss # can be training/total_loss, validation/total_loss or MSCOCO_Precision/mAP
mode: min # what means best for the metric, is smaller (min) better or bigger (max)
- Deep Layer Aggregation with 34 layers
- MobileNetV2 with 𝛼 = 1.0
- ResNet with 18, 34, 50, 101, 152 layers
- EfficientNet Variant b[0 - 8]
To train CenterNet with your own dataset you have to convert it first into the MS COCO format. Each file_name value is specified without a path.
Image example
...
"images": [
{
"id": 1,
"width": 1680,
"height": 1680,
"file_name": "Record_00600.jpg",
"license": 0,
"flickr_url": "",
"coco_url": "",
"date_captured": 0
},
...
Annotation example As in the example below, there exists a bbox and a rbbox key.
- bbox - axis aligned bounding box like typical for MS COCO (
[x,y,width,height]
) - rbbox - bounding box in format
[cx, cy, width, height, angle]
(OpenCV minAreaRect)
{
"id": 17,
"image_id": 1,
"category_id": 1,
"segmentation": [
[
440.66,
...,
575.68
]
],
"area": 16656,
"bbox": [
403.8,
538.84,
163.96,
147.37
],
"iscrowd": 0,
"attributes": {
"occluded": false
},
"rbbox": [
473.5887145996094,
617.488037109375,
160.44503784179688,
141.49945068359375,
-19.881549835205078
]
}
WIP
Objects as Points, Xingyi Zhou, et al., arXiv technical report (arXiv 1904.07850)
RAPiD: Rotation-Aware People Detection in Overhead Fisheye Images, Zhihao Duan, Ozan Tezcan, Hayato Nakamura, Prakash Ishwar, Janusz Konrad, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. 2020
Advent: Adversarial entropy minimization for domain adaptation in semantic segmentation. Vu, Tuan-Hung, et al., Proceedings of the IEEE conference on computer vision and pattern recognition. 2019.
Domain adaptation for semantic segmentation with maximum squares loss. Chen, Minghao, et. al., Proceedings of the IEEE International Conference on Computer Vision. 2019