
gap in map #19

Closed · x-x110 opened this issue May 23, 2021 · 10 comments
@x-x110 commented May 23, 2021

My experimental setup is 3x Titan GPUs. Following Detectron2's scaling rules, I set the learning rate to 0.045. Without modifying any other parameters, the resulting mAP is about 35.6. Why?

@x-x110 commented May 23, 2021

The batch size is 16 per GPU.

@chensnathan (Owner) commented

Hi, could you post your training log?

@x-x110 commented May 23, 2021

CUDNN_BENCHMARK: false
DATALOADER:
  ASPECT_RATIO_GROUPING: true
  FILTER_EMPTY_ANNOTATIONS: true
  NUM_WORKERS: 8
  REPEAT_THRESHOLD: 0.0
  SAMPLER_TRAIN: TrainingSampler
DATASETS:
  PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
  PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
  PROPOSAL_FILES_TEST: []
  PROPOSAL_FILES_TRAIN: []
  TEST:
  - coco_2017_val
  TRAIN:
  - coco_2017_train
GLOBAL:
  HACK: 1.0
INPUT:
  CROP:
    ENABLED: false
    SIZE:
    - 0.9
    - 0.9
    TYPE: relative_range
  DISTORTION:
    ENABLED: false
    EXPOSURE: 1.5
    HUE: 0.1
    SATURATION: 1.5
  FORMAT: BGR
  JITTER_CROP:
    ENABLED: false
    JITTER_RATIO: 0.3
  MASK_FORMAT: polygon
  MAX_SIZE_TEST: 1333
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MIN_SIZE_TRAIN:
  - 800
  MIN_SIZE_TRAIN_SAMPLING: choice
  MOSAIC:
    ENABLED: false
    MIN_OFFSET: 0.2
    MOSAIC_HEIGHT: 640
    MOSAIC_WIDTH: 640
    NUM_IMAGES: 4
    POOL_CAPACITY: 1000
  RANDOM_FLIP: horizontal
  RESIZE:
    ENABLED: false
    SCALE_JITTER:
    - 0.8
    - 1.2
    SHAPE:
    - 640
    - 640
    TEST_SHAPE:
    - 608
    - 608
  SHIFT:
    SHIFT_PIXELS: 32
MODEL:
  ANCHOR_GENERATOR:
    ANGLES:
    - -90
    - 0
    - 90
    ASPECT_RATIOS:
    - 1.0
    NAME: DefaultAnchorGenerator
    OFFSET: 0.0
    SIZES:
    - 32
    - 64
    - 128
    - 256
    - 512
  BACKBONE:
    FREEZE_AT: 2
    NAME: build_resnet_backbone
  DARKNET:
    DEPTH: 53
    NORM: BN
    OUT_FEATURES:
    - res5
    RES5_DILATION: 1
    WITH_CSP: true
  DEVICE: cuda
  FPN:
    FUSE_TYPE: sum
    IN_FEATURES: []
    NORM: ''
    OUT_CHANNELS: 256
  KEYPOINT_ON: false
  LOAD_PROPOSALS: false
  MASK_ON: false
  META_ARCHITECTURE: YOLOF
  PANOPTIC_FPN:
    COMBINE:
      ENABLED: true
      INSTANCES_CONFIDENCE_THRESH: 0.5
      OVERLAP_THRESH: 0.5
      STUFF_AREA_LIMIT: 4096
    INSTANCE_LOSS_WEIGHT: 1.0
  PIXEL_MEAN:
  - 103.53
  - 116.28
  - 123.675
  PIXEL_STD:
  - 1.0
  - 1.0
  - 1.0
  PROPOSAL_GENERATOR:
    MIN_SIZE: 0
    NAME: RPN
  RESNETS:
    DEFORM_MODULATED: false
    DEFORM_NUM_GROUPS: 1
    DEFORM_ON_PER_STAGE:
    - false
    - false
    - false
    - false
    DEPTH: 50
    NORM: FrozenBN
    NUM_GROUPS: 1
    OUT_FEATURES:
    - res5
    RES2_OUT_CHANNELS: 256
    RES5_DILATION: 1
    STEM_OUT_CHANNELS: 64
    STRIDE_IN_1X1: true
    WIDTH_PER_GROUP: 64
  RETINANET:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_WEIGHTS: &id001
    - 1.0
    - 1.0
    - 1.0
    - 1.0
    FOCAL_LOSS_ALPHA: 0.25
    FOCAL_LOSS_GAMMA: 2.0
    IN_FEATURES:
    - p3
    - p4
    - p5
    - p6
    - p7
    IOU_LABELS:
    - 0
    - -1
    - 1
    IOU_THRESHOLDS:
    - 0.4
    - 0.5
    NMS_THRESH_TEST: 0.5
    NORM: ''
    NUM_CLASSES: 80
    NUM_CONVS: 4
    PRIOR_PROB: 0.01
    SCORE_THRESH_TEST: 0.05
    SMOOTH_L1_LOSS_BETA: 0.1
    TOPK_CANDIDATES_TEST: 1000
  ROI_BOX_CASCADE_HEAD:
    BBOX_REG_WEIGHTS:
    - - 10.0
      - 10.0
      - 5.0
      - 5.0
    - - 20.0
      - 20.0
      - 10.0
      - 10.0
    - - 30.0
      - 30.0
      - 15.0
      - 15.0
    IOUS:
    - 0.5
    - 0.6
    - 0.7
  ROI_BOX_HEAD:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS:
    - 10.0
    - 10.0
    - 5.0
    - 5.0
    CLS_AGNOSTIC_BBOX_REG: false
    CONV_DIM: 256
    FC_DIM: 1024
    NAME: ''
    NORM: ''
    NUM_CONV: 0
    NUM_FC: 0
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
    SMOOTH_L1_BETA: 0.0
    TRAIN_ON_PRED_BOXES: false
  ROI_HEADS:
    BATCH_SIZE_PER_IMAGE: 512
    IN_FEATURES:
    - res4
    IOU_LABELS:
    - 0
    - 1
    IOU_THRESHOLDS:
    - 0.5
    NAME: Res5ROIHeads
    NMS_THRESH_TEST: 0.5
    NUM_CLASSES: 80
    POSITIVE_FRACTION: 0.25
    PROPOSAL_APPEND_GT: true
    SCORE_THRESH_TEST: 0.05
  ROI_KEYPOINT_HEAD:
    CONV_DIMS:
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    LOSS_WEIGHT: 1.0
    MIN_KEYPOINTS_PER_IMAGE: 1
    NAME: KRCNNConvDeconvUpsampleHead
    NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: true
    NUM_KEYPOINTS: 17
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  ROI_MASK_HEAD:
    CLS_AGNOSTIC_MASK: false
    CONV_DIM: 256
    NAME: MaskRCNNConvUpsampleHead
    NORM: ''
    NUM_CONV: 0
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  RPN:
    BATCH_SIZE_PER_IMAGE: 256
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS: *id001
    BOUNDARY_THRESH: -1
    CONV_DIMS:
    - -1
    HEAD_NAME: StandardRPNHead
    IN_FEATURES:
    - res4
    IOU_LABELS:
    - 0
    - -1
    - 1
    IOU_THRESHOLDS:
    - 0.3
    - 0.7
    LOSS_WEIGHT: 1.0
    NMS_THRESH: 0.7
    POSITIVE_FRACTION: 0.5
    POST_NMS_TOPK_TEST: 1000
    POST_NMS_TOPK_TRAIN: 2000
    PRE_NMS_TOPK_TEST: 6000
    PRE_NMS_TOPK_TRAIN: 12000
    SMOOTH_L1_BETA: 0.0
  SEM_SEG_HEAD:
    COMMON_STRIDE: 4
    CONVS_DIM: 128
    IGNORE_VALUE: 255
    IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    LOSS_WEIGHT: 1.0
    NAME: SemSegFPNHead
    NORM: GN
    NUM_CLASSES: 54
  WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl
  YOLOF:
    BOX_TRANSFORM:
      ADD_CTR_CLAMP: true
      BBOX_REG_WEIGHTS:
      - 1.0
      - 1.0
      - 1.0
      - 1.0
      CTR_CLAMP: 32
    DECODER:
      ACTIVATION: ReLU
      CLS_NUM_CONVS: 2
      IN_CHANNELS: 512
      NORM: BN
      NUM_ANCHORS: 5
      NUM_CLASSES: 80
      PRIOR_PROB: 0.01
      REG_NUM_CONVS: 4
    DETECTIONS_PER_IMAGE: 100
    ENCODER:
      ACTIVATION: ReLU
      BACKBONE_LEVEL: res5
      BLOCK_DILATIONS:
      - 2
      - 4
      - 6
      - 8
      BLOCK_MID_CHANNELS: 128
      IN_CHANNELS: 2048
      NORM: BN
      NUM_CHANNELS: 512
      NUM_RESIDUAL_BLOCKS: 4
    LOSSES:
      BBOX_REG_LOSS_TYPE: giou
      FOCAL_LOSS_ALPHA: 0.25
      FOCAL_LOSS_GAMMA: 2.0
    MATCHER:
      TOPK: 4
    NEG_IGNORE_THRESHOLD: 0.7
    NMS_THRESH_TEST: 0.6
    POS_IGNORE_THRESHOLD: 0.15
    SCORE_THRESH_TEST: 0.05
    TOPK_CANDIDATES_TEST: 1000
OUTPUT_DIR: output/yolof/R_50_C5_1x
SEED: -1
SOLVER:
  AMP:
    ENABLED: false
  BACKBONE_MULTIPLIER: 0.334
  BASE_LR: 0.045
  BIAS_LR_FACTOR: 1.0
  CHECKPOINT_PERIOD: 2500
  CLIP_GRADIENTS:
    CLIP_TYPE: value
    CLIP_VALUE: 1.0
    ENABLED: false
    NORM_TYPE: 2.0
  GAMMA: 0.1
  IMS_PER_BATCH: 48
  LR_SCHEDULER_NAME: WarmupMultiStepLR
  MAX_ITER: 22500
  MOMENTUM: 0.9
  NESTEROV: false
  REFERENCE_WORLD_SIZE: 0
  STEPS:
  - 15000
  - 20000
  WARMUP_FACTOR: 0.00066667
  WARMUP_ITERS: 1500
  WARMUP_METHOD: linear
  WEIGHT_DECAY: 0.0001
  WEIGHT_DECAY_BIAS: 0.0001
  WEIGHT_DECAY_NORM: 0.0
TEST:
  AUG:
    ENABLED: false
    FLIP: true
    MAX_SIZE: 4000
    MIN_SIZES:
    - 400
    - 500
    - 600
    - 700
    - 800
    - 900
    - 1000
    - 1100
    - 1200
  DETECTIONS_PER_IMAGE: 100
  EVAL_PERIOD: 0
  EXPECTED_RESULTS: []
  KEYPOINT_OKS_SIGMAS: []
  PRECISE_BN:
    ENABLED: false
    NUM_ITER: 200
VERSION: 2
VIS_PERIOD: 0

@chensnathan (Owner) commented

There are several key points about how to modify the settings (a Python sketch of the rule follows the list):

  1. Batch size and learning rate. The default settings assume 8 GPUs with a total batch size of 64 (8 images per GPU). You have 3 GPUs and set the batch size to 48 (16 images per GPU). Thus, according to the linear scaling rule, your learning rate should be 0.12 * 48 / 64 = 0.09.
  2. Training iterations and learning rate steps. We train for a maximum of 22500 iterations at batch size 64; for batch size 48, you should scale the maximum iterations from 22500 to 22500 * 64 / 48 = 30000. The learning rate steps should be recalculated by the same rule: [15000 * 64 / 48, 20000 * 64 / 48].
  3. Warmup iterations and warmup factor. For batch size 64, we warm up the training for 1500 iterations. Thus, for batch size 48, you can change this from 1500 to 1500 * 64 / 48 = 2000 iterations. The warmup factor is then obtained as 1. / 2000.
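
To make the arithmetic concrete, here is a minimal sketch of that scaling rule in plain Python. It is not code from this repo; the reference constants are the batch-64 schedule from the config above, and scale_schedule is a hypothetical helper name.

REF_BATCH = 64              # total batch size the default schedule assumes
REF_LR = 0.12               # default BASE_LR for batch size 64
REF_MAX_ITER = 22500
REF_STEPS = (15000, 20000)
REF_WARMUP_ITERS = 1500

def scale_schedule(total_batch_size):
    """Linear scaling rule: LR scales with batch size; iteration counts scale inversely."""
    ratio = total_batch_size / REF_BATCH
    warmup_iters = round(REF_WARMUP_ITERS / ratio)
    return {
        "SOLVER.BASE_LR": REF_LR * ratio,
        "SOLVER.MAX_ITER": round(REF_MAX_ITER / ratio),
        "SOLVER.STEPS": tuple(round(s / ratio) for s in REF_STEPS),
        "SOLVER.WARMUP_ITERS": warmup_iters,
        "SOLVER.WARMUP_FACTOR": 1.0 / warmup_iters,
    }

# For 3 GPUs x 16 images = batch size 48:
# BASE_LR 0.09, MAX_ITER 30000, STEPS (20000, 26667),
# WARMUP_ITERS 2000, WARMUP_FACTOR 0.0005
print(scale_schedule(48))

Note that the second step, 20000 * 64 / 48, does not come out to a whole number; rounding it to 26667 (or any nearby value) is fine in practice.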

@x-x110 commented May 23, 2021

I will modify these parameters and report back with the result. Thank you for your reply.

@x-x110 commented May 24, 2021

After modifying these parameters, I obtained 37.39 mAP at 30000 iterations.

@chensnathan (Owner) commented

This result is reasonable.

@x-x110 commented May 24, 2021

Thanks for your reply.

@shenhaibb commented

(quoting @chensnathan's scaling advice above)

Hi, I have only 1 GPU (8 GB), so I set the batch size to 8.
learning rate: 0.12 * 8 / 64 = 0.015
maximum iterations: 22500 * 64 / 8 = 180000
learning rate steps: [15000 * 64 / 8, 20000 * 64 / 8] = [120000, 160000]
warmup iterations: 1500 * 64 / 8 = 12000
warmup factor: 1. / 12000 ≈ 0.000083
Is this the right way to calculate it?
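
For reference, the same numbers checked in plain Python (nothing repo-specific; the warmup factor follows the 1 / warmup_iters convention from @chensnathan's comment above):

ratio = 8 / 64
print(0.12 * ratio)                     # 0.015      learning rate
print(22500 / ratio)                    # 180000.0   max iterations
print([15000 / ratio, 20000 / ratio])   # [120000.0, 160000.0]  LR steps
print(1500 / ratio)                     # 12000.0    warmup iterations
print(1.0 / 12000)                      # ~8.33e-05  warmup factor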

@SelimSavas commented May 10, 2023

(quoting @chensnathan's scaling advice above)

Is this calculation valid for every dataset? Do we do the same calculation for datasets of different sizes? @chensnathan
