
gap in map #19

Closed · x-x110 opened this issue May 23, 2021 · 10 comments
@x-x110 commented May 23, 2021

My experimental setup is 3x Titan GPUs. Following Detectron2's scaling rules, I set the learning rate to 0.045. Without modifying any other parameters, the resulting mAP is about 35.6. Why?

@x-x110 commented May 23, 2021

The batch size is 16 per GPU.

@chensnathan (Owner) commented

Hi, could you post your training log?

@x-x110 commented May 23, 2021

CUDNN_BENCHMARK: false
DATALOADER:
  ASPECT_RATIO_GROUPING: true
  FILTER_EMPTY_ANNOTATIONS: true
  NUM_WORKERS: 8
  REPEAT_THRESHOLD: 0.0
  SAMPLER_TRAIN: TrainingSampler
DATASETS:
  PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
  PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
  PROPOSAL_FILES_TEST: []
  PROPOSAL_FILES_TRAIN: []
  TEST:
  - coco_2017_val
  TRAIN:
  - coco_2017_train
GLOBAL:
  HACK: 1.0
INPUT:
  CROP:
    ENABLED: false
    SIZE:
    - 0.9
    - 0.9
    TYPE: relative_range
  DISTORTION:
    ENABLED: false
    EXPOSURE: 1.5
    HUE: 0.1
    SATURATION: 1.5
  FORMAT: BGR
  JITTER_CROP:
    ENABLED: false
    JITTER_RATIO: 0.3
  MASK_FORMAT: polygon
  MAX_SIZE_TEST: 1333
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MIN_SIZE_TRAIN:
  - 800
  MIN_SIZE_TRAIN_SAMPLING: choice
  MOSAIC:
    ENABLED: false
    MIN_OFFSET: 0.2
    MOSAIC_HEIGHT: 640
    MOSAIC_WIDTH: 640
    NUM_IMAGES: 4
    POOL_CAPACITY: 1000
  RANDOM_FLIP: horizontal
  RESIZE:
    ENABLED: false
    SCALE_JITTER:
    - 0.8
    - 1.2
    SHAPE:
    - 640
    - 640
    TEST_SHAPE:
    - 608
    - 608
  SHIFT:
    SHIFT_PIXELS: 32
MODEL:
  ANCHOR_GENERATOR:
    ANGLES:
    - -90
    - 0
    - 90
    ASPECT_RATIOS:
    - 1.0
    NAME: DefaultAnchorGenerator
    OFFSET: 0.0
    SIZES:
    - 32
    - 64
    - 128
    - 256
    - 512
  BACKBONE:
    FREEZE_AT: 2
    NAME: build_resnet_backbone
  DARKNET:
    DEPTH: 53
    NORM: BN
    OUT_FEATURES:
    - res5
    RES5_DILATION: 1
    WITH_CSP: true
  DEVICE: cuda
  FPN:
    FUSE_TYPE: sum
    IN_FEATURES: []
    NORM: ''
    OUT_CHANNELS: 256
  KEYPOINT_ON: false
  LOAD_PROPOSALS: false
  MASK_ON: false
  META_ARCHITECTURE: YOLOF
  PANOPTIC_FPN:
    COMBINE:
      ENABLED: true
      INSTANCES_CONFIDENCE_THRESH: 0.5
      OVERLAP_THRESH: 0.5
      STUFF_AREA_LIMIT: 4096
    INSTANCE_LOSS_WEIGHT: 1.0
  PIXEL_MEAN:
  - 103.53
  - 116.28
  - 123.675
  PIXEL_STD:
  - 1.0
  - 1.0
  - 1.0
  PROPOSAL_GENERATOR:
    MIN_SIZE: 0
    NAME: RPN
  RESNETS:
    DEFORM_MODULATED: false
    DEFORM_NUM_GROUPS: 1
    DEFORM_ON_PER_STAGE:
    - false
    - false
    - false
    - false
    DEPTH: 50
    NORM: FrozenBN
    NUM_GROUPS: 1
    OUT_FEATURES:
    - res5
    RES2_OUT_CHANNELS: 256
    RES5_DILATION: 1
    STEM_OUT_CHANNELS: 64
    STRIDE_IN_1X1: true
    WIDTH_PER_GROUP: 64
  RETINANET:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_WEIGHTS: &id001
    - 1.0
    - 1.0
    - 1.0
    - 1.0
    FOCAL_LOSS_ALPHA: 0.25
    FOCAL_LOSS_GAMMA: 2.0
    IN_FEATURES:
    - p3
    - p4
    - p5
    - p6
    - p7
    IOU_LABELS:
    - 0
    - -1
    - 1
    IOU_THRESHOLDS:
    - 0.4
    - 0.5
    NMS_THRESH_TEST: 0.5
    NORM: ''
    NUM_CLASSES: 80
    NUM_CONVS: 4
    PRIOR_PROB: 0.01
    SCORE_THRESH_TEST: 0.05
    SMOOTH_L1_LOSS_BETA: 0.1
    TOPK_CANDIDATES_TEST: 1000
  ROI_BOX_CASCADE_HEAD:
    BBOX_REG_WEIGHTS:
    - - 10.0
      - 10.0
      - 5.0
      - 5.0
    - - 20.0
      - 20.0
      - 10.0
      - 10.0
    - - 30.0
      - 30.0
      - 15.0
      - 15.0
    IOUS:
    - 0.5
    - 0.6
    - 0.7
  ROI_BOX_HEAD:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS:
    - 10.0
    - 10.0
    - 5.0
    - 5.0
    CLS_AGNOSTIC_BBOX_REG: false
    CONV_DIM: 256
    FC_DIM: 1024
    NAME: ''
    NORM: ''
    NUM_CONV: 0
    NUM_FC: 0
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
    SMOOTH_L1_BETA: 0.0
    TRAIN_ON_PRED_BOXES: false
  ROI_HEADS:
    BATCH_SIZE_PER_IMAGE: 512
    IN_FEATURES:
    - res4
    IOU_LABELS:
    - 0
    - 1
    IOU_THRESHOLDS:
    - 0.5
    NAME: Res5ROIHeads
    NMS_THRESH_TEST: 0.5
    NUM_CLASSES: 80
    POSITIVE_FRACTION: 0.25
    PROPOSAL_APPEND_GT: true
    SCORE_THRESH_TEST: 0.05
  ROI_KEYPOINT_HEAD:
    CONV_DIMS:
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    LOSS_WEIGHT: 1.0
    MIN_KEYPOINTS_PER_IMAGE: 1
    NAME: KRCNNConvDeconvUpsampleHead
    NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: true
    NUM_KEYPOINTS: 17
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  ROI_MASK_HEAD:
    CLS_AGNOSTIC_MASK: false
    CONV_DIM: 256
    NAME: MaskRCNNConvUpsampleHead
    NORM: ''
    NUM_CONV: 0
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  RPN:
    BATCH_SIZE_PER_IMAGE: 256
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS: *id001
    BOUNDARY_THRESH: -1
    CONV_DIMS:
    - -1
    HEAD_NAME: StandardRPNHead
    IN_FEATURES:
    - res4
    IOU_LABELS:
    - 0
    - -1
    - 1
    IOU_THRESHOLDS:
    - 0.3
    - 0.7
    LOSS_WEIGHT: 1.0
    NMS_THRESH: 0.7
    POSITIVE_FRACTION: 0.5
    POST_NMS_TOPK_TEST: 1000
    POST_NMS_TOPK_TRAIN: 2000
    PRE_NMS_TOPK_TEST: 6000
    PRE_NMS_TOPK_TRAIN: 12000
    SMOOTH_L1_BETA: 0.0
  SEM_SEG_HEAD:
    COMMON_STRIDE: 4
    CONVS_DIM: 128
    IGNORE_VALUE: 255
    IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    LOSS_WEIGHT: 1.0
    NAME: SemSegFPNHead
    NORM: GN
    NUM_CLASSES: 54
  WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl
  YOLOF:
    BOX_TRANSFORM:
      ADD_CTR_CLAMP: true
      BBOX_REG_WEIGHTS:
      - 1.0
      - 1.0
      - 1.0
      - 1.0
      CTR_CLAMP: 32
    DECODER:
      ACTIVATION: ReLU
      CLS_NUM_CONVS: 2
      IN_CHANNELS: 512
      NORM: BN
      NUM_ANCHORS: 5
      NUM_CLASSES: 80
      PRIOR_PROB: 0.01
      REG_NUM_CONVS: 4
    DETECTIONS_PER_IMAGE: 100
    ENCODER:
      ACTIVATION: ReLU
      BACKBONE_LEVEL: res5
      BLOCK_DILATIONS:
      - 2
      - 4
      - 6
      - 8
      BLOCK_MID_CHANNELS: 128
      IN_CHANNELS: 2048
      NORM: BN
      NUM_CHANNELS: 512
      NUM_RESIDUAL_BLOCKS: 4
    LOSSES:
      BBOX_REG_LOSS_TYPE: giou
      FOCAL_LOSS_ALPHA: 0.25
      FOCAL_LOSS_GAMMA: 2.0
    MATCHER:
      TOPK: 4
    NEG_IGNORE_THRESHOLD: 0.7
    NMS_THRESH_TEST: 0.6
    POS_IGNORE_THRESHOLD: 0.15
    SCORE_THRESH_TEST: 0.05
    TOPK_CANDIDATES_TEST: 1000
OUTPUT_DIR: output/yolof/R_50_C5_1x
SEED: -1
SOLVER:
  AMP:
    ENABLED: false
  BACKBONE_MULTIPLIER: 0.334
  BASE_LR: 0.045
  BIAS_LR_FACTOR: 1.0
  CHECKPOINT_PERIOD: 2500
  CLIP_GRADIENTS:
    CLIP_TYPE: value
    CLIP_VALUE: 1.0
    ENABLED: false
    NORM_TYPE: 2.0
  GAMMA: 0.1
  IMS_PER_BATCH: 48
  LR_SCHEDULER_NAME: WarmupMultiStepLR
  MAX_ITER: 22500
  MOMENTUM: 0.9
  NESTEROV: false
  REFERENCE_WORLD_SIZE: 0
  STEPS:
  - 15000
  - 20000
  WARMUP_FACTOR: 0.00066667
  WARMUP_ITERS: 1500
  WARMUP_METHOD: linear
  WEIGHT_DECAY: 0.0001
  WEIGHT_DECAY_BIAS: 0.0001
  WEIGHT_DECAY_NORM: 0.0
TEST:
  AUG:
    ENABLED: false
    FLIP: true
    MAX_SIZE: 4000
    MIN_SIZES:
    - 400
    - 500
    - 600
    - 700
    - 800
    - 900
    - 1000
    - 1100
    - 1200
  DETECTIONS_PER_IMAGE: 100
  EVAL_PERIOD: 0
  EXPECTED_RESULTS: []
  KEYPOINT_OKS_SIGMAS: []
  PRECISE_BN:
    ENABLED: false
    NUM_ITER: 200
VERSION: 2
VIS_PERIOD: 0

@chensnathan (Owner) commented

There are several key points about how to modify the settings (a Python sketch of the rule follows the list):

  1. Batch size and learning rate. The default settings assume 8 GPUs with a total batch size of 64 (8 images per GPU). You have 3 GPUs and set the batch size to 48 (16 images per GPU). Thus, according to the linear scaling rule, your learning rate should be 0.12 * 48 / 64 = 0.09.
  2. Training iterations and learning rate steps. We train for a maximum of 22500 iterations at batch size 64; for batch size 48, you should scale the maximum iterations from 22500 to 22500 * 64 / 48 = 30000. The learning rate steps should be recalculated by the same rule: [15000 * 64 / 48, 20000 * 64 / 48].
  3. Warmup iterations and warmup factor. For batch size 64, we warm up the training for 1500 iterations. Thus, for batch size 48, you can change this from 1500 to 1500 * 64 / 48 = 2000 iterations. The warmup factor is then obtained as 1. / 2000.
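
To make the arithmetic concrete, here is a minimal sketch of that scaling rule in plain Python. It is not code from this repo; the reference constants are the batch-64 schedule from the config above, and scale_schedule is a hypothetical helper name.

REF_BATCH = 64              # total batch size the default schedule assumes
REF_LR = 0.12               # default BASE_LR for batch size 64
REF_MAX_ITER = 22500
REF_STEPS = (15000, 20000)
REF_WARMUP_ITERS = 1500

def scale_schedule(total_batch_size):
    """Linear scaling rule: LR scales with batch size; iteration counts scale inversely."""
    ratio = total_batch_size / REF_BATCH
    warmup_iters = round(REF_WARMUP_ITERS / ratio)
    return {
        "SOLVER.BASE_LR": REF_LR * ratio,
        "SOLVER.MAX_ITER": round(REF_MAX_ITER / ratio),
        "SOLVER.STEPS": tuple(round(s / ratio) for s in REF_STEPS),
        "SOLVER.WARMUP_ITERS": warmup_iters,
        "SOLVER.WARMUP_FACTOR": 1.0 / warmup_iters,
    }

# For 3 GPUs x 16 images = batch size 48:
# BASE_LR 0.09, MAX_ITER 30000, STEPS (20000, 26667),
# WARMUP_ITERS 2000, WARMUP_FACTOR 0.0005
print(scale_schedule(48))

Note that the second step, 20000 * 64 / 48, does not come out to a whole number; rounding it to 26667 (or any nearby value) is fine in practice.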

@x-x110 commented May 23, 2021

I will modify these parameters and report back with the result. Thank you for your reply.

@x-x110 commented May 24, 2021

After modifying these parameters, I obtained 37.39 mAP at 30000 iterations.

@chensnathan (Owner) commented

This result is reasonable.

@x-x110 commented May 24, 2021

Thanks for your reply.

@shenhaibb commented

(quoting @chensnathan's scaling advice above)

Hi, I have only 1 GPU (8 GB), so I set the batch size to 8.
learning rate: 0.12 * 8 / 64 = 0.015
maximum iterations: 22500 * 64 / 8 = 180000
learning rate steps: [15000 * 64 / 8, 20000 * 64 / 8] = [120000, 160000]
warmup iterations: 1500 * 64 / 8 = 12000
warmup factor: 1. / 12000 ≈ 0.000083
Is this the right way to calculate it?
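
For reference, the same numbers checked in plain Python (nothing repo-specific; the warmup factor follows the 1 / warmup_iters convention from @chensnathan's comment above):

ratio = 8 / 64
print(0.12 * ratio)                     # 0.015      learning rate
print(22500 / ratio)                    # 180000.0   max iterations
print([15000 / ratio, 20000 / ratio])   # [120000.0, 160000.0]  LR steps
print(1500 / ratio)                     # 12000.0    warmup iterations
print(1.0 / 12000)                      # ~8.33e-05  warmup factor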

@SelimSavas commented May 10, 2023

(quoting @chensnathan's scaling advice above)

Is this calculation valid for every dataset? Do we do the same calculation for datasets of different sizes? @chensnathan
