[03/22 07:27:27] detectron2 INFO: Rank of current process: 0. World size: 1
[03/22 07:27:28] detectron2 INFO: Environment info:
----------------------  ----------------------------------------------------------------
sys.platform            linux
Python                  3.9.13 (main, May 23 2022, 22:01:06) [GCC 9.4.0]
numpy                   1.22.4
detectron2              0.6 @/notebooks/detrex/detectron2/detectron2
Compiler                GCC 9.4
CUDA compiler           CUDA 11.2
detectron2 arch flags   8.6
DETECTRON2_ENV_MODULE
PyTorch                 1.12.0+cu116 @/usr/local/lib/python3.9/dist-packages/torch
PyTorch debug build     False
GPU available           Yes
GPU 0                   NVIDIA RTX A6000 (arch=8.6)
Driver version          510.73.05
CUDA_HOME               /usr/local/cuda
Pillow                  9.2.0
torchvision             0.13.0+cu116 @/usr/local/lib/python3.9/dist-packages/torchvision
torchvision arch flags  3.5, 5.0, 6.0, 7.0, 7.5, 8.0, 8.6
fvcore                  0.1.5.post20221221
iopath                  0.1.9
cv2                     4.6.0
----------------------  ----------------------------------------------------------------
PyTorch built with:
  - GCC 9.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.6
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.3.2 (built against CUDA 11.5)
  - Magma 2.6.1
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.6, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.12.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,
[03/22 07:27:28] detectron2 INFO: Command line arguments: Namespace(config_file='projects/dino/configs/dino_r50_corpus.py', resume=False, eval_only=False, num_gpus=1, num_machines=1, machine_rank=0, dist_url='tcp://127.0.0.1:49152', opts=[])
[03/22 07:27:28] detectron2 INFO: Contents of args.config_file=projects/dino/configs/dino_r50_corpus.py:
from detrex.config import get_config
from .models.dino_r50 import model

# get default config
dataloader = get_config("common/data/custom.py").dataloader
#from detectron2.data.datasets import register_coco_instances
#register_coco_instances("corpus", {}, "json_annotation.json", "path/to/image/dir")
optimizer = get_config("common/optim.py").AdamW
lr_multiplier = get_config("common/coco_schedule.py").lr_multiplier_12ep
train = get_config("common/train.py").train

# modify training config
train.init_checkpoint = "detectron2://ImageNetPretrained/torchvision/R-50.pkl"
train.output_dir = "./output/dino_r50_4scale_12ep"

# max training iterations
train.max_iter = 90000
# run evaluation every 2000 iters
train.eval_period = 2000
# log training information every 50 iters
train.log_period = 50
# save checkpoint every 2000 iters
train.checkpointer.period = 2000

# gradient clipping for training
train.clip_grad.enabled = True
train.clip_grad.params.max_norm = 0.1
train.clip_grad.params.norm_type = 2

# set training devices
train.device = "cuda"
model.device = train.device

# please notice that this is the total batch size;
# suppose you're using 4 gpus for training, then the batch size for
# each gpu is 16/4 = 4
dataloader.train.total_batch_size = 14

# modify optimizer config
optimizer.lr = 1e-4 * dataloader.train.total_batch_size / 16  # 1e-4 * 14 / 16 = 8.75e-5 with the batch size above
optimizer.betas = (0.9, 0.999)
optimizer.weight_decay = 1e-4
optimizer.params.lr_factor_func = lambda module_name: 0.1 if "backbone" in module_name else 1

# modify dataloader config
dataloader.train.num_workers = 8

# dump the testing results into output_dir for visualization
dataloader.evaluator.output_dir = train.output_dir

[03/22 07:27:28] d2.config.lazy WARNING: The config contains objects that cannot serialize to a valid yaml. ./output/dino_r50_4scale_12ep/config.yaml is human-readable but cannot be loaded.
[03/22 07:27:28] d2.config.lazy WARNING: Config is saved using cloudpickle at ./output/dino_r50_4scale_12ep/config.yaml.pkl.
[03/22 07:27:28] detectron2 INFO: Full config saved to ./output/dino_r50_4scale_12ep/config.yaml
[03/22 07:27:28] d2.utils.env INFO: Using a generated random seed 28302002
[03/22 07:27:31] detectron2 INFO: Model: DINO( (backbone): ResNet( (stem): BasicStem( (conv1): Conv2d( 3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05) ) ) (res2): Sequential( (0): BottleneckBlock( (shortcut): Conv2d( 64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05) ) (conv1): Conv2d( 64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05) ) (conv2): Conv2d( 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05) ) (conv3): Conv2d( 64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05) ) ) (1): BottleneckBlock( (conv1): Conv2d( 256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05) ) (conv2): Conv2d( 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05) ) (conv3): Conv2d( 64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05) ) ) (2): BottleneckBlock( (conv1): Conv2d( 256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05) ) (conv2): Conv2d( 64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05) ) (conv3): Conv2d( 64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05) ) ) ) (res3): Sequential( (0):
BottleneckBlock( (shortcut): Conv2d( 256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05) ) (conv1): Conv2d( 256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05) ) (conv2): Conv2d( 128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05) ) (conv3): Conv2d( 128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05) ) ) (1): BottleneckBlock( (conv1): Conv2d( 512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05) ) (conv2): Conv2d( 128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05) ) (conv3): Conv2d( 128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05) ) ) (2): BottleneckBlock( (conv1): Conv2d( 512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05) ) (conv2): Conv2d( 128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05) ) (conv3): Conv2d( 128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05) ) ) (3): BottleneckBlock( (conv1): Conv2d( 512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05) ) (conv2): Conv2d( 128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05) ) (conv3): Conv2d( 128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05) ) ) ) (res4): Sequential( (0): BottleneckBlock( (shortcut): Conv2d( 512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05) ) (conv1): Conv2d( 512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05) ) (conv2): Conv2d( 256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05) ) (conv3): Conv2d( 256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05) ) ) (1): BottleneckBlock( (conv1): Conv2d( 1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05) ) (conv2): Conv2d( 256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05) ) (conv3): Conv2d( 256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05) ) ) (2): BottleneckBlock( (conv1): Conv2d( 1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05) ) (conv2): Conv2d( 256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05) ) (conv3): Conv2d( 256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05) ) ) (3): BottleneckBlock( (conv1): Conv2d( 1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05) ) (conv2): Conv2d( 256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False (norm): 
FrozenBatchNorm2d(num_features=256, eps=1e-05) ) (conv3): Conv2d( 256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05) ) ) (4): BottleneckBlock( (conv1): Conv2d( 1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05) ) (conv2): Conv2d( 256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05) ) (conv3): Conv2d( 256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05) ) ) (5): BottleneckBlock( (conv1): Conv2d( 1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05) ) (conv2): Conv2d( 256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05) ) (conv3): Conv2d( 256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05) ) ) ) (res5): Sequential( (0): BottleneckBlock( (shortcut): Conv2d( 1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05) ) (conv1): Conv2d( 1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05) ) (conv2): Conv2d( 512, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05) ) (conv3): Conv2d( 512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05) ) ) (1): BottleneckBlock( (conv1): Conv2d( 2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05) ) (conv2): Conv2d( 512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05) ) (conv3): Conv2d( 512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05) ) ) (2): BottleneckBlock( (conv1): Conv2d( 2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05) ) (conv2): Conv2d( 512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05) ) (conv3): Conv2d( 512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05) ) ) ) ) (position_embedding): PositionEmbeddingSine() (neck): ChannelMapper( (convs): ModuleList( (0): ConvNormAct( (conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1)) (norm): GroupNorm(32, 256, eps=1e-05, affine=True) ) (1): ConvNormAct( (conv): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1)) (norm): GroupNorm(32, 256, eps=1e-05, affine=True) ) (2): ConvNormAct( (conv): Conv2d(2048, 256, kernel_size=(1, 1), stride=(1, 1)) (norm): GroupNorm(32, 256, eps=1e-05, affine=True) ) ) (extra_convs): ModuleList( (0): ConvNormAct( (conv): Conv2d(2048, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (norm): GroupNorm(32, 256, eps=1e-05, affine=True) ) ) ) (transformer): DINOTransformer( (encoder): DINOTransformerEncoder( (layers): ModuleList( (0): BaseTransformerLayer( (attentions): ModuleList( (0): MultiScaleDeformableAttention( (dropout): Dropout(p=0.0, inplace=False) (sampling_offsets): Linear(in_features=256, out_features=256, bias=True) (attention_weights): Linear(in_features=256, out_features=128, bias=True) 
(value_proj): Linear(in_features=256, out_features=256, bias=True) (output_proj): Linear(in_features=256, out_features=256, bias=True) ) ) (ffns): ModuleList( (0): FFN( (activation): ReLU(inplace=True) (layers): Sequential( (0): Sequential( (0): Linear(in_features=256, out_features=2048, bias=True) (1): ReLU(inplace=True) (2): Dropout(p=0.0, inplace=False) ) (1): Linear(in_features=2048, out_features=256, bias=True) (2): Dropout(p=0.0, inplace=False) ) ) ) (norms): ModuleList( (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) ) (1): BaseTransformerLayer( (attentions): ModuleList( (0): MultiScaleDeformableAttention( (dropout): Dropout(p=0.0, inplace=False) (sampling_offsets): Linear(in_features=256, out_features=256, bias=True) (attention_weights): Linear(in_features=256, out_features=128, bias=True) (value_proj): Linear(in_features=256, out_features=256, bias=True) (output_proj): Linear(in_features=256, out_features=256, bias=True) ) ) (ffns): ModuleList( (0): FFN( (activation): ReLU(inplace=True) (layers): Sequential( (0): Sequential( (0): Linear(in_features=256, out_features=2048, bias=True) (1): ReLU(inplace=True) (2): Dropout(p=0.0, inplace=False) ) (1): Linear(in_features=2048, out_features=256, bias=True) (2): Dropout(p=0.0, inplace=False) ) ) ) (norms): ModuleList( (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) ) (2): BaseTransformerLayer( (attentions): ModuleList( (0): MultiScaleDeformableAttention( (dropout): Dropout(p=0.0, inplace=False) (sampling_offsets): Linear(in_features=256, out_features=256, bias=True) (attention_weights): Linear(in_features=256, out_features=128, bias=True) (value_proj): Linear(in_features=256, out_features=256, bias=True) (output_proj): Linear(in_features=256, out_features=256, bias=True) ) ) (ffns): ModuleList( (0): FFN( (activation): ReLU(inplace=True) (layers): Sequential( (0): Sequential( (0): Linear(in_features=256, out_features=2048, bias=True) (1): ReLU(inplace=True) (2): Dropout(p=0.0, inplace=False) ) (1): Linear(in_features=2048, out_features=256, bias=True) (2): Dropout(p=0.0, inplace=False) ) ) ) (norms): ModuleList( (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) ) (3): BaseTransformerLayer( (attentions): ModuleList( (0): MultiScaleDeformableAttention( (dropout): Dropout(p=0.0, inplace=False) (sampling_offsets): Linear(in_features=256, out_features=256, bias=True) (attention_weights): Linear(in_features=256, out_features=128, bias=True) (value_proj): Linear(in_features=256, out_features=256, bias=True) (output_proj): Linear(in_features=256, out_features=256, bias=True) ) ) (ffns): ModuleList( (0): FFN( (activation): ReLU(inplace=True) (layers): Sequential( (0): Sequential( (0): Linear(in_features=256, out_features=2048, bias=True) (1): ReLU(inplace=True) (2): Dropout(p=0.0, inplace=False) ) (1): Linear(in_features=2048, out_features=256, bias=True) (2): Dropout(p=0.0, inplace=False) ) ) ) (norms): ModuleList( (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) ) (4): BaseTransformerLayer( (attentions): ModuleList( (0): MultiScaleDeformableAttention( (dropout): Dropout(p=0.0, inplace=False) (sampling_offsets): Linear(in_features=256, out_features=256, bias=True) (attention_weights): Linear(in_features=256, out_features=128, bias=True) (value_proj): 
Linear(in_features=256, out_features=256, bias=True) (output_proj): Linear(in_features=256, out_features=256, bias=True) ) ) (ffns): ModuleList( (0): FFN( (activation): ReLU(inplace=True) (layers): Sequential( (0): Sequential( (0): Linear(in_features=256, out_features=2048, bias=True) (1): ReLU(inplace=True) (2): Dropout(p=0.0, inplace=False) ) (1): Linear(in_features=2048, out_features=256, bias=True) (2): Dropout(p=0.0, inplace=False) ) ) ) (norms): ModuleList( (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) ) (5): BaseTransformerLayer( (attentions): ModuleList( (0): MultiScaleDeformableAttention( (dropout): Dropout(p=0.0, inplace=False) (sampling_offsets): Linear(in_features=256, out_features=256, bias=True) (attention_weights): Linear(in_features=256, out_features=128, bias=True) (value_proj): Linear(in_features=256, out_features=256, bias=True) (output_proj): Linear(in_features=256, out_features=256, bias=True) ) ) (ffns): ModuleList( (0): FFN( (activation): ReLU(inplace=True) (layers): Sequential( (0): Sequential( (0): Linear(in_features=256, out_features=2048, bias=True) (1): ReLU(inplace=True) (2): Dropout(p=0.0, inplace=False) ) (1): Linear(in_features=2048, out_features=256, bias=True) (2): Dropout(p=0.0, inplace=False) ) ) ) (norms): ModuleList( (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) ) ) ) (decoder): DINOTransformerDecoder( (layers): ModuleList( (0): BaseTransformerLayer( (attentions): ModuleList( (0): MultiheadAttention( (attn): MultiheadAttention( (out_proj): NonDynamicallyQuantizableLinear(in_features=256, out_features=256, bias=True) ) (proj_drop): Dropout(p=0.0, inplace=False) ) (1): MultiScaleDeformableAttention( (dropout): Dropout(p=0.0, inplace=False) (sampling_offsets): Linear(in_features=256, out_features=256, bias=True) (attention_weights): Linear(in_features=256, out_features=128, bias=True) (value_proj): Linear(in_features=256, out_features=256, bias=True) (output_proj): Linear(in_features=256, out_features=256, bias=True) ) ) (ffns): ModuleList( (0): FFN( (activation): ReLU(inplace=True) (layers): Sequential( (0): Sequential( (0): Linear(in_features=256, out_features=2048, bias=True) (1): ReLU(inplace=True) (2): Dropout(p=0.0, inplace=False) ) (1): Linear(in_features=2048, out_features=256, bias=True) (2): Dropout(p=0.0, inplace=False) ) ) ) (norms): ModuleList( (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (2): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) ) (1): BaseTransformerLayer( (attentions): ModuleList( (0): MultiheadAttention( (attn): MultiheadAttention( (out_proj): NonDynamicallyQuantizableLinear(in_features=256, out_features=256, bias=True) ) (proj_drop): Dropout(p=0.0, inplace=False) ) (1): MultiScaleDeformableAttention( (dropout): Dropout(p=0.0, inplace=False) (sampling_offsets): Linear(in_features=256, out_features=256, bias=True) (attention_weights): Linear(in_features=256, out_features=128, bias=True) (value_proj): Linear(in_features=256, out_features=256, bias=True) (output_proj): Linear(in_features=256, out_features=256, bias=True) ) ) (ffns): ModuleList( (0): FFN( (activation): ReLU(inplace=True) (layers): Sequential( (0): Sequential( (0): Linear(in_features=256, out_features=2048, bias=True) (1): ReLU(inplace=True) (2): Dropout(p=0.0, inplace=False) ) (1): Linear(in_features=2048, out_features=256, bias=True) (2): 
Dropout(p=0.0, inplace=False) ) ) ) (norms): ModuleList( (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (2): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) ) (2): BaseTransformerLayer( (attentions): ModuleList( (0): MultiheadAttention( (attn): MultiheadAttention( (out_proj): NonDynamicallyQuantizableLinear(in_features=256, out_features=256, bias=True) ) (proj_drop): Dropout(p=0.0, inplace=False) ) (1): MultiScaleDeformableAttention( (dropout): Dropout(p=0.0, inplace=False) (sampling_offsets): Linear(in_features=256, out_features=256, bias=True) (attention_weights): Linear(in_features=256, out_features=128, bias=True) (value_proj): Linear(in_features=256, out_features=256, bias=True) (output_proj): Linear(in_features=256, out_features=256, bias=True) ) ) (ffns): ModuleList( (0): FFN( (activation): ReLU(inplace=True) (layers): Sequential( (0): Sequential( (0): Linear(in_features=256, out_features=2048, bias=True) (1): ReLU(inplace=True) (2): Dropout(p=0.0, inplace=False) ) (1): Linear(in_features=2048, out_features=256, bias=True) (2): Dropout(p=0.0, inplace=False) ) ) ) (norms): ModuleList( (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (2): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) ) (3): BaseTransformerLayer( (attentions): ModuleList( (0): MultiheadAttention( (attn): MultiheadAttention( (out_proj): NonDynamicallyQuantizableLinear(in_features=256, out_features=256, bias=True) ) (proj_drop): Dropout(p=0.0, inplace=False) ) (1): MultiScaleDeformableAttention( (dropout): Dropout(p=0.0, inplace=False) (sampling_offsets): Linear(in_features=256, out_features=256, bias=True) (attention_weights): Linear(in_features=256, out_features=128, bias=True) (value_proj): Linear(in_features=256, out_features=256, bias=True) (output_proj): Linear(in_features=256, out_features=256, bias=True) ) ) (ffns): ModuleList( (0): FFN( (activation): ReLU(inplace=True) (layers): Sequential( (0): Sequential( (0): Linear(in_features=256, out_features=2048, bias=True) (1): ReLU(inplace=True) (2): Dropout(p=0.0, inplace=False) ) (1): Linear(in_features=2048, out_features=256, bias=True) (2): Dropout(p=0.0, inplace=False) ) ) ) (norms): ModuleList( (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (2): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) ) (4): BaseTransformerLayer( (attentions): ModuleList( (0): MultiheadAttention( (attn): MultiheadAttention( (out_proj): NonDynamicallyQuantizableLinear(in_features=256, out_features=256, bias=True) ) (proj_drop): Dropout(p=0.0, inplace=False) ) (1): MultiScaleDeformableAttention( (dropout): Dropout(p=0.0, inplace=False) (sampling_offsets): Linear(in_features=256, out_features=256, bias=True) (attention_weights): Linear(in_features=256, out_features=128, bias=True) (value_proj): Linear(in_features=256, out_features=256, bias=True) (output_proj): Linear(in_features=256, out_features=256, bias=True) ) ) (ffns): ModuleList( (0): FFN( (activation): ReLU(inplace=True) (layers): Sequential( (0): Sequential( (0): Linear(in_features=256, out_features=2048, bias=True) (1): ReLU(inplace=True) (2): Dropout(p=0.0, inplace=False) ) (1): Linear(in_features=2048, out_features=256, bias=True) (2): Dropout(p=0.0, inplace=False) ) ) ) (norms): ModuleList( (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (1): LayerNorm((256,), eps=1e-05, 
elementwise_affine=True) (2): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) ) (5): BaseTransformerLayer( (attentions): ModuleList( (0): MultiheadAttention( (attn): MultiheadAttention( (out_proj): NonDynamicallyQuantizableLinear(in_features=256, out_features=256, bias=True) ) (proj_drop): Dropout(p=0.0, inplace=False) ) (1): MultiScaleDeformableAttention( (dropout): Dropout(p=0.0, inplace=False) (sampling_offsets): Linear(in_features=256, out_features=256, bias=True) (attention_weights): Linear(in_features=256, out_features=128, bias=True) (value_proj): Linear(in_features=256, out_features=256, bias=True) (output_proj): Linear(in_features=256, out_features=256, bias=True) ) ) (ffns): ModuleList( (0): FFN( (activation): ReLU(inplace=True) (layers): Sequential( (0): Sequential( (0): Linear(in_features=256, out_features=2048, bias=True) (1): ReLU(inplace=True) (2): Dropout(p=0.0, inplace=False) ) (1): Linear(in_features=2048, out_features=256, bias=True) (2): Dropout(p=0.0, inplace=False) ) ) ) (norms): ModuleList( (0): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (2): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) ) ) (ref_point_head): MLP( (layers): ModuleList( (0): Linear(in_features=512, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) ) ) (norm): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (class_embed): ModuleList( (0): Linear(in_features=256, out_features=1, bias=True) (1): Linear(in_features=256, out_features=1, bias=True) (2): Linear(in_features=256, out_features=1, bias=True) (3): Linear(in_features=256, out_features=1, bias=True) (4): Linear(in_features=256, out_features=1, bias=True) (5): Linear(in_features=256, out_features=1, bias=True) (6): Linear(in_features=256, out_features=1, bias=True) ) (bbox_embed): ModuleList( (0): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) (1): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) (2): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) (3): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) (4): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) (5): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) (6): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) ) ) (tgt_embed): Embedding(900, 256) (enc_output): Linear(in_features=256, out_features=256, bias=True) (enc_output_norm): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) (class_embed): 
ModuleList( (0): Linear(in_features=256, out_features=1, bias=True) (1): Linear(in_features=256, out_features=1, bias=True) (2): Linear(in_features=256, out_features=1, bias=True) (3): Linear(in_features=256, out_features=1, bias=True) (4): Linear(in_features=256, out_features=1, bias=True) (5): Linear(in_features=256, out_features=1, bias=True) (6): Linear(in_features=256, out_features=1, bias=True) ) (bbox_embed): ModuleList( (0): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) (1): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) (2): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) (3): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) (4): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) (5): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) (6): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) ) (criterion): Criterion DINOCriterion matcher: Matcher HungarianMatcher cost_class: 2.0 cost_bbox: 5.0 cost_giou: 2.0 cost_class_type: focal_loss_cost focal cost alpha: 0.25 focal cost gamma: 2.0 losses: ['class', 'boxes'] loss_class_type: focal_loss weight_dict: {'loss_class': 1, 'loss_bbox': 5.0, 'loss_giou': 2.0, 'loss_class_dn': 1, 'loss_bbox_dn': 5.0, 'loss_giou_dn': 2.0, 'loss_class_enc': 1, 'loss_bbox_enc': 5.0, 'loss_giou_enc': 2.0, 'loss_class_dn_enc': 1, 'loss_bbox_dn_enc': 5.0, 'loss_giou_dn_enc': 2.0, 'loss_class_0': 1, 'loss_bbox_0': 5.0, 'loss_giou_0': 2.0, 'loss_class_dn_0': 1, 'loss_bbox_dn_0': 5.0, 'loss_giou_dn_0': 2.0, 'loss_class_1': 1, 'loss_bbox_1': 5.0, 'loss_giou_1': 2.0, 'loss_class_dn_1': 1, 'loss_bbox_dn_1': 5.0, 'loss_giou_dn_1': 2.0, 'loss_class_2': 1, 'loss_bbox_2': 5.0, 'loss_giou_2': 2.0, 'loss_class_dn_2': 1, 'loss_bbox_dn_2': 5.0, 'loss_giou_dn_2': 2.0, 'loss_class_3': 1, 'loss_bbox_3': 5.0, 'loss_giou_3': 2.0, 'loss_class_dn_3': 1, 'loss_bbox_dn_3': 5.0, 'loss_giou_dn_3': 2.0, 'loss_class_4': 1, 'loss_bbox_4': 5.0, 'loss_giou_4': 2.0, 'loss_class_dn_4': 1, 'loss_bbox_dn_4': 5.0, 'loss_giou_dn_4': 2.0} num_classes: 1 eos_coef: None focal loss alpha: 0.25 focal loss gamma: 2.0 (label_enc): Embedding(1, 256) ) [03/22 07:28:40] detectron2 INFO: Rank of current process: 0. 
World size: 1
[03/22 07:28:41] d2.utils.env INFO: Using a generated random seed 41123822
(1): Linear(in_features=256, out_features=1, bias=True) (2): Linear(in_features=256, out_features=1, bias=True) (3): Linear(in_features=256, out_features=1, bias=True) (4): Linear(in_features=256, out_features=1, bias=True) (5): Linear(in_features=256, out_features=1, bias=True) (6): Linear(in_features=256, out_features=1, bias=True) ) (bbox_embed): ModuleList( (0): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) (1): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) (2): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) (3): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) (4): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) (5): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) (6): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=256, bias=True) (1): Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) ) ) (criterion): Criterion DINOCriterion matcher: Matcher HungarianMatcher cost_class: 2.0 cost_bbox: 5.0 cost_giou: 2.0 cost_class_type: focal_loss_cost focal cost alpha: 0.25 focal cost gamma: 2.0 losses: ['class', 'boxes'] loss_class_type: focal_loss weight_dict: {'loss_class': 1, 'loss_bbox': 5.0, 'loss_giou': 2.0, 'loss_class_dn': 1, 'loss_bbox_dn': 5.0, 'loss_giou_dn': 2.0, 'loss_class_enc': 1, 'loss_bbox_enc': 5.0, 'loss_giou_enc': 2.0, 'loss_class_dn_enc': 1, 'loss_bbox_dn_enc': 5.0, 'loss_giou_dn_enc': 2.0, 'loss_class_0': 1, 'loss_bbox_0': 5.0, 'loss_giou_0': 2.0, 'loss_class_dn_0': 1, 'loss_bbox_dn_0': 5.0, 'loss_giou_dn_0': 2.0, 'loss_class_1': 1, 'loss_bbox_1': 5.0, 'loss_giou_1': 2.0, 'loss_class_dn_1': 1, 'loss_bbox_dn_1': 5.0, 'loss_giou_dn_1': 2.0, 'loss_class_2': 1, 'loss_bbox_2': 5.0, 'loss_giou_2': 2.0, 'loss_class_dn_2': 1, 'loss_bbox_dn_2': 5.0, 'loss_giou_dn_2': 2.0, 'loss_class_3': 1, 'loss_bbox_3': 5.0, 'loss_giou_3': 2.0, 'loss_class_dn_3': 1, 'loss_bbox_dn_3': 5.0, 'loss_giou_dn_3': 2.0, 'loss_class_4': 1, 'loss_bbox_4': 5.0, 'loss_giou_4': 2.0, 'loss_class_dn_4': 1, 'loss_bbox_dn_4': 5.0, 'loss_giou_dn_4': 2.0} num_classes: 1 eos_coef: None focal loss alpha: 0.25 focal loss gamma: 2.0 (label_enc): Embedding(1, 256) ) [03/22 07:28:43] d2.data.datasets.coco WARNING: Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you. [03/22 07:28:43] d2.data.datasets.coco INFO: Loaded 6245 images in COCO format from datasets/coab/annotations/train.json [03/22 07:28:44] d2.data.build INFO: Removed 0 images with no usable annotations. 6245 images left. 
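The criterion summary above shows a DETR-style setup: a HungarianMatcher with cost_class 2.0, cost_bbox 5.0 and cost_giou 2.0 (focal-style classification cost, alpha 0.25, gamma 2.0), and a weight_dict covering the final decoder layer plus the auxiliary decoder layers (suffixes _0 to _4), the encoder proposals (_enc) and the denoising branch (_dn). Below is a minimal sketch of how DETR-family criteria typically fold such per-term losses into the single scalar the trainer optimizes and logs as total_loss; loss_dict is a hypothetical stand-in for the dictionary of per-term losses the criterion returns:

def weighted_total_loss(loss_dict, weight_dict):
    # Scale every returned loss term by its configured weight and sum them;
    # terms with no entry in weight_dict are left out of the total.
    return sum(weight_dict[k] * v for k, v in loss_dict.items() if k in weight_dict)

With the logged weights, the final-layer terms contribute 1 * loss_class + 5.0 * loss_bbox + 2.0 * loss_giou, and each auxiliary, encoder and denoising term enters with the same 1 / 5.0 / 2.0 pattern.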
[03/22 07:28:44] d2.data.build INFO: Distribution of instances among all 1 categories: | category | #instances | |:----------:|:-------------| | object | 131789 | | | | [03/22 07:28:44] d2.data.common INFO: Serializing 6245 elements to byte tensors and concatenating them all ... [03/22 07:28:44] d2.data.common INFO: Serialized dataset takes 40.50 MiB [03/22 07:28:50] fvcore.common.checkpoint INFO: [Checkpointer] Loading from detectron2://ImageNetPretrained/torchvision/R-50.pkl ... [03/22 07:28:51] fvcore.common.checkpoint INFO: Reading a file from 'torchvision' [03/22 07:28:51] d2.checkpoint.c2_model_loading INFO: Following weights matched with submodule backbone: | Names in Model | Names in Checkpoint | Shapes | |:------------------|:----------------------------------------------------------------------------------|:------------------------------------------------| | res2.0.conv1.* | res2.0.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (64,) (64,) (64,) (64,) (64,64,1,1) | | res2.0.conv2.* | res2.0.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (64,) (64,) (64,) (64,) (64,64,3,3) | | res2.0.conv3.* | res2.0.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,64,1,1) | | res2.0.shortcut.* | res2.0.shortcut.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,64,1,1) | | res2.1.conv1.* | res2.1.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (64,) (64,) (64,) (64,) (64,256,1,1) | | res2.1.conv2.* | res2.1.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (64,) (64,) (64,) (64,) (64,64,3,3) | | res2.1.conv3.* | res2.1.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,64,1,1) | | res2.2.conv1.* | res2.2.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (64,) (64,) (64,) (64,) (64,256,1,1) | | res2.2.conv2.* | res2.2.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (64,) (64,) (64,) (64,) (64,64,3,3) | | res2.2.conv3.* | res2.2.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,64,1,1) | | res3.0.conv1.* | res3.0.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (128,) (128,) (128,) (128,) (128,256,1,1) | | res3.0.conv2.* | res3.0.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (128,) (128,) (128,) (128,) (128,128,3,3) | | res3.0.conv3.* | res3.0.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,128,1,1) | | res3.0.shortcut.* | res3.0.shortcut.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,256,1,1) | | res3.1.conv1.* | res3.1.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (128,) (128,) (128,) (128,) (128,512,1,1) | | res3.1.conv2.* | res3.1.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (128,) (128,) (128,) (128,) (128,128,3,3) | | res3.1.conv3.* | res3.1.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,128,1,1) | | res3.2.conv1.* | res3.2.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (128,) (128,) (128,) (128,) (128,512,1,1) | | res3.2.conv2.* | res3.2.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (128,) 
(128,) (128,) (128,) (128,128,3,3) | | res3.2.conv3.* | res3.2.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,128,1,1) | | res3.3.conv1.* | res3.3.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (128,) (128,) (128,) (128,) (128,512,1,1) | | res3.3.conv2.* | res3.3.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (128,) (128,) (128,) (128,) (128,128,3,3) | | res3.3.conv3.* | res3.3.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,128,1,1) | | res4.0.conv1.* | res4.0.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,512,1,1) | | res4.0.conv2.* | res4.0.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,256,3,3) | | res4.0.conv3.* | res4.0.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (1024,) (1024,) (1024,) (1024,) (1024,256,1,1) | | res4.0.shortcut.* | res4.0.shortcut.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (1024,) (1024,) (1024,) (1024,) (1024,512,1,1) | | res4.1.conv1.* | res4.1.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,1024,1,1) | | res4.1.conv2.* | res4.1.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,256,3,3) | | res4.1.conv3.* | res4.1.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (1024,) (1024,) (1024,) (1024,) (1024,256,1,1) | | res4.2.conv1.* | res4.2.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,1024,1,1) | | res4.2.conv2.* | res4.2.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,256,3,3) | | res4.2.conv3.* | res4.2.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (1024,) (1024,) (1024,) (1024,) (1024,256,1,1) | | res4.3.conv1.* | res4.3.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,1024,1,1) | | res4.3.conv2.* | res4.3.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,256,3,3) | | res4.3.conv3.* | res4.3.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (1024,) (1024,) (1024,) (1024,) (1024,256,1,1) | | res4.4.conv1.* | res4.4.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,1024,1,1) | | res4.4.conv2.* | res4.4.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,256,3,3) | | res4.4.conv3.* | res4.4.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (1024,) (1024,) (1024,) (1024,) (1024,256,1,1) | | res4.5.conv1.* | res4.5.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,1024,1,1) | | res4.5.conv2.* | res4.5.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (256,) (256,) (256,) (256,) (256,256,3,3) | | res4.5.conv3.* | res4.5.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (1024,) (1024,) (1024,) (1024,) (1024,256,1,1) | | res5.0.conv1.* | res5.0.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,1024,1,1) | | 
res5.0.conv2.* | res5.0.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,512,3,3) | | res5.0.conv3.* | res5.0.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (2048,) (2048,) (2048,) (2048,) (2048,512,1,1) | | res5.0.shortcut.* | res5.0.shortcut.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (2048,) (2048,) (2048,) (2048,) (2048,1024,1,1) | | res5.1.conv1.* | res5.1.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,2048,1,1) | | res5.1.conv2.* | res5.1.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,512,3,3) | | res5.1.conv3.* | res5.1.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (2048,) (2048,) (2048,) (2048,) (2048,512,1,1) | | res5.2.conv1.* | res5.2.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,2048,1,1) | | res5.2.conv2.* | res5.2.conv2.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (512,) (512,) (512,) (512,) (512,512,3,3) | | res5.2.conv3.* | res5.2.conv3.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (2048,) (2048,) (2048,) (2048,) (2048,512,1,1) | | stem.conv1.* | stem.conv1.{norm.bias,norm.running_mean,norm.running_var,norm.weight,weight} | (64,) (64,) (64,) (64,) (64,3,7,7) | [03/22 07:28:52] fvcore.common.checkpoint WARNING: Some model parameters or buffers are not found in the checkpoint: bbox_embed.0.layers.0.{bias, weight} bbox_embed.0.layers.1.{bias, weight} bbox_embed.0.layers.2.{bias, weight} bbox_embed.1.layers.0.{bias, weight} bbox_embed.1.layers.1.{bias, weight} bbox_embed.1.layers.2.{bias, weight} bbox_embed.2.layers.0.{bias, weight} bbox_embed.2.layers.1.{bias, weight} bbox_embed.2.layers.2.{bias, weight} bbox_embed.3.layers.0.{bias, weight} bbox_embed.3.layers.1.{bias, weight} bbox_embed.3.layers.2.{bias, weight} bbox_embed.4.layers.0.{bias, weight} bbox_embed.4.layers.1.{bias, weight} bbox_embed.4.layers.2.{bias, weight} bbox_embed.5.layers.0.{bias, weight} bbox_embed.5.layers.1.{bias, weight} bbox_embed.5.layers.2.{bias, weight} bbox_embed.6.layers.0.{bias, weight} bbox_embed.6.layers.1.{bias, weight} bbox_embed.6.layers.2.{bias, weight} class_embed.0.{bias, weight} class_embed.1.{bias, weight} class_embed.2.{bias, weight} class_embed.3.{bias, weight} class_embed.4.{bias, weight} class_embed.5.{bias, weight} class_embed.6.{bias, weight} label_enc.weight neck.convs.0.conv.{bias, weight} neck.convs.0.norm.{bias, weight} neck.convs.1.conv.{bias, weight} neck.convs.1.norm.{bias, weight} neck.convs.2.conv.{bias, weight} neck.convs.2.norm.{bias, weight} neck.extra_convs.0.conv.{bias, weight} neck.extra_convs.0.norm.{bias, weight} transformer.decoder.bbox_embed.0.layers.0.{bias, weight} transformer.decoder.bbox_embed.0.layers.1.{bias, weight} transformer.decoder.bbox_embed.0.layers.2.{bias, weight} transformer.decoder.bbox_embed.1.layers.0.{bias, weight} transformer.decoder.bbox_embed.1.layers.1.{bias, weight} transformer.decoder.bbox_embed.1.layers.2.{bias, weight} transformer.decoder.bbox_embed.2.layers.0.{bias, weight} transformer.decoder.bbox_embed.2.layers.1.{bias, weight} transformer.decoder.bbox_embed.2.layers.2.{bias, weight} transformer.decoder.bbox_embed.3.layers.0.{bias, weight} transformer.decoder.bbox_embed.3.layers.1.{bias, weight} transformer.decoder.bbox_embed.3.layers.2.{bias, weight} 
transformer.decoder.bbox_embed.4.layers.0.{bias, weight} transformer.decoder.bbox_embed.4.layers.1.{bias, weight} transformer.decoder.bbox_embed.4.layers.2.{bias, weight} transformer.decoder.bbox_embed.5.layers.0.{bias, weight} transformer.decoder.bbox_embed.5.layers.1.{bias, weight} transformer.decoder.bbox_embed.5.layers.2.{bias, weight} transformer.decoder.bbox_embed.6.layers.0.{bias, weight} transformer.decoder.bbox_embed.6.layers.1.{bias, weight} transformer.decoder.bbox_embed.6.layers.2.{bias, weight} transformer.decoder.class_embed.0.{bias, weight} transformer.decoder.class_embed.1.{bias, weight} transformer.decoder.class_embed.2.{bias, weight} transformer.decoder.class_embed.3.{bias, weight} transformer.decoder.class_embed.4.{bias, weight} transformer.decoder.class_embed.5.{bias, weight} transformer.decoder.class_embed.6.{bias, weight} transformer.decoder.layers.0.attentions.0.attn.out_proj.{bias, weight} transformer.decoder.layers.0.attentions.0.attn.{in_proj_bias, in_proj_weight} transformer.decoder.layers.0.attentions.1.attention_weights.{bias, weight} transformer.decoder.layers.0.attentions.1.output_proj.{bias, weight} transformer.decoder.layers.0.attentions.1.sampling_offsets.{bias, weight} transformer.decoder.layers.0.attentions.1.value_proj.{bias, weight} transformer.decoder.layers.0.ffns.0.layers.0.0.{bias, weight} transformer.decoder.layers.0.ffns.0.layers.1.{bias, weight} transformer.decoder.layers.0.norms.0.{bias, weight} transformer.decoder.layers.0.norms.1.{bias, weight} transformer.decoder.layers.0.norms.2.{bias, weight} transformer.decoder.layers.1.attentions.0.attn.out_proj.{bias, weight} transformer.decoder.layers.1.attentions.0.attn.{in_proj_bias, in_proj_weight} transformer.decoder.layers.1.attentions.1.attention_weights.{bias, weight} transformer.decoder.layers.1.attentions.1.output_proj.{bias, weight} transformer.decoder.layers.1.attentions.1.sampling_offsets.{bias, weight} transformer.decoder.layers.1.attentions.1.value_proj.{bias, weight} transformer.decoder.layers.1.ffns.0.layers.0.0.{bias, weight} transformer.decoder.layers.1.ffns.0.layers.1.{bias, weight} transformer.decoder.layers.1.norms.0.{bias, weight} transformer.decoder.layers.1.norms.1.{bias, weight} transformer.decoder.layers.1.norms.2.{bias, weight} transformer.decoder.layers.2.attentions.0.attn.out_proj.{bias, weight} transformer.decoder.layers.2.attentions.0.attn.{in_proj_bias, in_proj_weight} transformer.decoder.layers.2.attentions.1.attention_weights.{bias, weight} transformer.decoder.layers.2.attentions.1.output_proj.{bias, weight} transformer.decoder.layers.2.attentions.1.sampling_offsets.{bias, weight} transformer.decoder.layers.2.attentions.1.value_proj.{bias, weight} transformer.decoder.layers.2.ffns.0.layers.0.0.{bias, weight} transformer.decoder.layers.2.ffns.0.layers.1.{bias, weight} transformer.decoder.layers.2.norms.0.{bias, weight} transformer.decoder.layers.2.norms.1.{bias, weight} transformer.decoder.layers.2.norms.2.{bias, weight} transformer.decoder.layers.3.attentions.0.attn.out_proj.{bias, weight} transformer.decoder.layers.3.attentions.0.attn.{in_proj_bias, in_proj_weight} transformer.decoder.layers.3.attentions.1.attention_weights.{bias, weight} transformer.decoder.layers.3.attentions.1.output_proj.{bias, weight} transformer.decoder.layers.3.attentions.1.sampling_offsets.{bias, weight} transformer.decoder.layers.3.attentions.1.value_proj.{bias, weight} transformer.decoder.layers.3.ffns.0.layers.0.0.{bias, weight} transformer.decoder.layers.3.ffns.0.layers.1.{bias, weight} 
transformer.decoder.layers.3.norms.0.{bias, weight} transformer.decoder.layers.3.norms.1.{bias, weight} transformer.decoder.layers.3.norms.2.{bias, weight} transformer.decoder.layers.4.attentions.0.attn.out_proj.{bias, weight} transformer.decoder.layers.4.attentions.0.attn.{in_proj_bias, in_proj_weight} transformer.decoder.layers.4.attentions.1.attention_weights.{bias, weight} transformer.decoder.layers.4.attentions.1.output_proj.{bias, weight} transformer.decoder.layers.4.attentions.1.sampling_offsets.{bias, weight} transformer.decoder.layers.4.attentions.1.value_proj.{bias, weight} transformer.decoder.layers.4.ffns.0.layers.0.0.{bias, weight} transformer.decoder.layers.4.ffns.0.layers.1.{bias, weight} transformer.decoder.layers.4.norms.0.{bias, weight} transformer.decoder.layers.4.norms.1.{bias, weight} transformer.decoder.layers.4.norms.2.{bias, weight} transformer.decoder.layers.5.attentions.0.attn.out_proj.{bias, weight} transformer.decoder.layers.5.attentions.0.attn.{in_proj_bias, in_proj_weight} transformer.decoder.layers.5.attentions.1.attention_weights.{bias, weight} transformer.decoder.layers.5.attentions.1.output_proj.{bias, weight} transformer.decoder.layers.5.attentions.1.sampling_offsets.{bias, weight} transformer.decoder.layers.5.attentions.1.value_proj.{bias, weight} transformer.decoder.layers.5.ffns.0.layers.0.0.{bias, weight} transformer.decoder.layers.5.ffns.0.layers.1.{bias, weight} transformer.decoder.layers.5.norms.0.{bias, weight} transformer.decoder.layers.5.norms.1.{bias, weight} transformer.decoder.layers.5.norms.2.{bias, weight} transformer.decoder.norm.{bias, weight} transformer.decoder.ref_point_head.layers.0.{bias, weight} transformer.decoder.ref_point_head.layers.1.{bias, weight} transformer.enc_output.{bias, weight} transformer.enc_output_norm.{bias, weight} transformer.encoder.layers.0.attentions.0.attention_weights.{bias, weight} transformer.encoder.layers.0.attentions.0.output_proj.{bias, weight} transformer.encoder.layers.0.attentions.0.sampling_offsets.{bias, weight} transformer.encoder.layers.0.attentions.0.value_proj.{bias, weight} transformer.encoder.layers.0.ffns.0.layers.0.0.{bias, weight} transformer.encoder.layers.0.ffns.0.layers.1.{bias, weight} transformer.encoder.layers.0.norms.0.{bias, weight} transformer.encoder.layers.0.norms.1.{bias, weight} transformer.encoder.layers.1.attentions.0.attention_weights.{bias, weight} transformer.encoder.layers.1.attentions.0.output_proj.{bias, weight} transformer.encoder.layers.1.attentions.0.sampling_offsets.{bias, weight} transformer.encoder.layers.1.attentions.0.value_proj.{bias, weight} transformer.encoder.layers.1.ffns.0.layers.0.0.{bias, weight} transformer.encoder.layers.1.ffns.0.layers.1.{bias, weight} transformer.encoder.layers.1.norms.0.{bias, weight} transformer.encoder.layers.1.norms.1.{bias, weight} transformer.encoder.layers.2.attentions.0.attention_weights.{bias, weight} transformer.encoder.layers.2.attentions.0.output_proj.{bias, weight} transformer.encoder.layers.2.attentions.0.sampling_offsets.{bias, weight} transformer.encoder.layers.2.attentions.0.value_proj.{bias, weight} transformer.encoder.layers.2.ffns.0.layers.0.0.{bias, weight} transformer.encoder.layers.2.ffns.0.layers.1.{bias, weight} transformer.encoder.layers.2.norms.0.{bias, weight} transformer.encoder.layers.2.norms.1.{bias, weight} transformer.encoder.layers.3.attentions.0.attention_weights.{bias, weight} transformer.encoder.layers.3.attentions.0.output_proj.{bias, weight} 
transformer.encoder.layers.3.attentions.0.sampling_offsets.{bias, weight} transformer.encoder.layers.3.attentions.0.value_proj.{bias, weight} transformer.encoder.layers.3.ffns.0.layers.0.0.{bias, weight} transformer.encoder.layers.3.ffns.0.layers.1.{bias, weight} transformer.encoder.layers.3.norms.0.{bias, weight} transformer.encoder.layers.3.norms.1.{bias, weight} transformer.encoder.layers.4.attentions.0.attention_weights.{bias, weight} transformer.encoder.layers.4.attentions.0.output_proj.{bias, weight} transformer.encoder.layers.4.attentions.0.sampling_offsets.{bias, weight} transformer.encoder.layers.4.attentions.0.value_proj.{bias, weight} transformer.encoder.layers.4.ffns.0.layers.0.0.{bias, weight} transformer.encoder.layers.4.ffns.0.layers.1.{bias, weight} transformer.encoder.layers.4.norms.0.{bias, weight} transformer.encoder.layers.4.norms.1.{bias, weight} transformer.encoder.layers.5.attentions.0.attention_weights.{bias, weight} transformer.encoder.layers.5.attentions.0.output_proj.{bias, weight} transformer.encoder.layers.5.attentions.0.sampling_offsets.{bias, weight} transformer.encoder.layers.5.attentions.0.value_proj.{bias, weight} transformer.encoder.layers.5.ffns.0.layers.0.0.{bias, weight} transformer.encoder.layers.5.ffns.0.layers.1.{bias, weight} transformer.encoder.layers.5.norms.0.{bias, weight} transformer.encoder.layers.5.norms.1.{bias, weight} transformer.level_embeds transformer.tgt_embed.weight [03/22 07:28:52] fvcore.common.checkpoint WARNING: The checkpoint state_dict contains keys that are not used by the model: stem.fc.{bias, weight} [03/22 07:28:52] d2.engine.train_loop INFO: Starting training from iteration 0 [03/22 07:30:26] d2.utils.events INFO: eta: 1 day, 18:04:12 iter: 49 total_loss: 26.75 loss_class: 0.4503 loss_bbox: 0.4487 loss_giou: 1.186 loss_class_0: 0.4068 loss_bbox_0: 0.4764 loss_giou_0: 1.212 loss_class_1: 0.4209 loss_bbox_1: 0.4652 loss_giou_1: 1.203 loss_class_2: 0.4457 loss_bbox_2: 0.4601 loss_giou_2: 1.196 loss_class_3: 0.4412 loss_bbox_3: 0.4597 loss_giou_3: 1.195 loss_class_4: 0.4467 loss_bbox_4: 0.4548 loss_giou_4: 1.19 loss_class_enc: 0.4594 loss_bbox_enc: 0.4419 loss_giou_enc: 1.182 loss_class_dn: 0.1978 loss_bbox_dn: 0.47 loss_giou_dn: 1.395 loss_class_dn_0: 0.1444 loss_bbox_dn_0: 0.4745 loss_giou_dn_0: 1.388 loss_class_dn_1: 0.144 loss_bbox_dn_1: 0.4733 loss_giou_dn_1: 1.389 loss_class_dn_2: 0.1483 loss_bbox_dn_2: 0.4721 loss_giou_dn_2: 1.391 loss_class_dn_3: 0.1453 loss_bbox_dn_3: 0.4709 loss_giou_dn_3: 1.392 loss_class_dn_4: 0.1537 loss_bbox_dn_4: 0.4705 loss_giou_dn_4: 1.393 time: 1.6923 data_time: 0.0952 lr: 8.75e-05 max_mem: 35282M [03/22 07:31:51] d2.utils.events INFO: eta: 1 day, 18:18:43 iter: 99 total_loss: 26.09 loss_class: 0.4 loss_bbox: 0.4397 loss_giou: 1.128 loss_class_0: 0.3626 loss_bbox_0: 0.4606 loss_giou_0: 1.166 loss_class_1: 0.3666 loss_bbox_1: 0.4527 loss_giou_1: 1.157 loss_class_2: 0.3737 loss_bbox_2: 0.4479 loss_giou_2: 1.153 loss_class_3: 0.3744 loss_bbox_3: 0.4461 loss_giou_3: 1.149 loss_class_4: 0.3836 loss_bbox_4: 0.4419 loss_giou_4: 1.138 loss_class_enc: 0.4218 loss_bbox_enc: 0.4254 loss_giou_enc: 1.101 loss_class_dn: 0.1451 loss_bbox_dn: 0.4572 loss_giou_dn: 1.403 loss_class_dn_0: 0.1417 loss_bbox_dn_0: 0.4626 loss_giou_dn_0: 1.396 loss_class_dn_1: 0.1418 loss_bbox_dn_1: 0.4597 loss_giou_dn_1: 1.399 loss_class_dn_2: 0.1422 loss_bbox_dn_2: 0.4584 loss_giou_dn_2: 1.401 loss_class_dn_3: 0.1438 loss_bbox_dn_3: 0.4579 loss_giou_dn_3: 1.402 loss_class_dn_4: 0.1425 loss_bbox_dn_4: 0.4574 loss_giou_dn_4: 1.402 
time: 1.6989 data_time: 0.0961 lr: 8.75e-05 max_mem: 35758M [03/22 07:33:17] d2.utils.events INFO: eta: 1 day, 18:01:24 iter: 149 total_loss: 24.12 loss_class: 0.3068 loss_bbox: 0.3711 loss_giou: 1.027 loss_class_0: 0.3028 loss_bbox_0: 0.3876 loss_giou_0: 1.074 loss_class_1: 0.3007 loss_bbox_1: 0.3885 loss_giou_1: 1.072 loss_class_2: 0.2982 loss_bbox_2: 0.3815 loss_giou_2: 1.057 loss_class_3: 0.298 loss_bbox_3: 0.3792 loss_giou_3: 1.049 loss_class_4: 0.2969 loss_bbox_4: 0.377 loss_giou_4: 1.042 loss_class_enc: 0.3223 loss_bbox_enc: 0.3658 loss_giou_enc: 1.048 loss_class_dn: 0.1322 loss_bbox_dn: 0.4521 loss_giou_dn: 1.376 loss_class_dn_0: 0.1299 loss_bbox_dn_0: 0.4592 loss_giou_dn_0: 1.386 loss_class_dn_1: 0.1293 loss_bbox_dn_1: 0.4554 loss_giou_dn_1: 1.38 loss_class_dn_2: 0.128 loss_bbox_dn_2: 0.453 loss_giou_dn_2: 1.382 loss_class_dn_3: 0.1285 loss_bbox_dn_3: 0.4514 loss_giou_dn_3: 1.38 loss_class_dn_4: 0.1317 loss_bbox_dn_4: 0.4516 loss_giou_dn_4: 1.377 time: 1.7029 data_time: 0.0931 lr: 8.75e-05 max_mem: 35758M [03/22 07:34:42] d2.utils.events INFO: eta: 1 day, 18:01:32 iter: 199 total_loss: 21.94 loss_class: 0.2605 loss_bbox: 0.3484 loss_giou: 0.9471 loss_class_0: 0.2429 loss_bbox_0: 0.3715 loss_giou_0: 0.9898 loss_class_1: 0.2378 loss_bbox_1: 0.367 loss_giou_1: 0.9847 loss_class_2: 0.2398 loss_bbox_2: 0.3593 loss_giou_2: 0.9778 loss_class_3: 0.2433 loss_bbox_3: 0.3544 loss_giou_3: 0.9608 loss_class_4: 0.2504 loss_bbox_4: 0.3512 loss_giou_4: 0.956 loss_class_enc: 0.2819 loss_bbox_enc: 0.3626 loss_giou_enc: 1.007 loss_class_dn: 0.1193 loss_bbox_dn: 0.411 loss_giou_dn: 1.286 loss_class_dn_0: 0.1322 loss_bbox_dn_0: 0.4298 loss_giou_dn_0: 1.301 loss_class_dn_1: 0.1223 loss_bbox_dn_1: 0.4186 loss_giou_dn_1: 1.287 loss_class_dn_2: 0.1208 loss_bbox_dn_2: 0.413 loss_giou_dn_2: 1.285 loss_class_dn_3: 0.117 loss_bbox_dn_3: 0.4099 loss_giou_dn_3: 1.291 loss_class_dn_4: 0.1192 loss_bbox_dn_4: 0.4116 loss_giou_dn_4: 1.287 time: 1.7057 data_time: 0.0923 lr: 8.75e-05 max_mem: 35758M [03/22 07:36:07] d2.utils.events INFO: eta: 1 day, 17:58:35 iter: 249 total_loss: 20.19 loss_class: 0.2662 loss_bbox: 0.2489 loss_giou: 0.7891 loss_class_0: 0.2492 loss_bbox_0: 0.3056 loss_giou_0: 0.8742 loss_class_1: 0.2407 loss_bbox_1: 0.2779 loss_giou_1: 0.8392 loss_class_2: 0.2484 loss_bbox_2: 0.2564 loss_giou_2: 0.8235 loss_class_3: 0.2454 loss_bbox_3: 0.2571 loss_giou_3: 0.8166 loss_class_4: 0.2626 loss_bbox_4: 0.2519 loss_giou_4: 0.7982 loss_class_enc: 0.2812 loss_bbox_enc: 0.3222 loss_giou_enc: 0.8944 loss_class_dn: 0.1148 loss_bbox_dn: 0.4041 loss_giou_dn: 1.169 loss_class_dn_0: 0.1281 loss_bbox_dn_0: 0.4425 loss_giou_dn_0: 1.279 loss_class_dn_1: 0.121 loss_bbox_dn_1: 0.4171 loss_giou_dn_1: 1.231 loss_class_dn_2: 0.1197 loss_bbox_dn_2: 0.4099 loss_giou_dn_2: 1.224 loss_class_dn_3: 0.1163 loss_bbox_dn_3: 0.4051 loss_giou_dn_3: 1.197 loss_class_dn_4: 0.1167 loss_bbox_dn_4: 0.404 loss_giou_dn_4: 1.175 time: 1.7023 data_time: 0.0893 lr: 8.75e-05 max_mem: 35758M [03/22 07:37:33] d2.utils.events INFO: eta: 1 day, 17:56:07 iter: 299 total_loss: 18.45 loss_class: 0.2668 loss_bbox: 0.2224 loss_giou: 0.6935 loss_class_0: 0.2407 loss_bbox_0: 0.2685 loss_giou_0: 0.7956 loss_class_1: 0.242 loss_bbox_1: 0.2409 loss_giou_1: 0.7479 loss_class_2: 0.2436 loss_bbox_2: 0.2343 loss_giou_2: 0.7208 loss_class_3: 0.2449 loss_bbox_3: 0.2281 loss_giou_3: 0.7081 loss_class_4: 0.2555 loss_bbox_4: 0.2248 loss_giou_4: 0.7055 loss_class_enc: 0.2726 loss_bbox_enc: 0.3073 loss_giou_enc: 0.8746 loss_class_dn: 0.1097 loss_bbox_dn: 0.3894 
loss_giou_dn: 1.007 loss_class_dn_0: 0.1267 loss_bbox_dn_0: 0.4148 loss_giou_dn_0: 1.235 loss_class_dn_1: 0.1172 loss_bbox_dn_1: 0.3933 loss_giou_dn_1: 1.115 loss_class_dn_2: 0.1126 loss_bbox_dn_2: 0.3913 loss_giou_dn_2: 1.05 loss_class_dn_3: 0.1102 loss_bbox_dn_3: 0.3896 loss_giou_dn_3: 1.021 loss_class_dn_4: 0.1101 loss_bbox_dn_4: 0.3897 loss_giou_dn_4: 1.006 time: 1.7045 data_time: 0.1029 lr: 8.75e-05 max_mem: 37159M [03/22 07:38:57] d2.utils.events INFO: eta: 1 day, 17:49:19 iter: 349 total_loss: 17.06 loss_class: 0.2471 loss_bbox: 0.193 loss_giou: 0.6671 loss_class_0: 0.2299 loss_bbox_0: 0.2276 loss_giou_0: 0.7185 loss_class_1: 0.2319 loss_bbox_1: 0.2109 loss_giou_1: 0.688 loss_class_2: 0.2343 loss_bbox_2: 0.2024 loss_giou_2: 0.6751 loss_class_3: 0.2362 loss_bbox_3: 0.2026 loss_giou_3: 0.677 loss_class_4: 0.2408 loss_bbox_4: 0.197 loss_giou_4: 0.6731 loss_class_enc: 0.2433 loss_bbox_enc: 0.2633 loss_giou_enc: 0.8022 loss_class_dn: 0.1041 loss_bbox_dn: 0.3402 loss_giou_dn: 0.9189 loss_class_dn_0: 0.1223 loss_bbox_dn_0: 0.3926 loss_giou_dn_0: 1.187 loss_class_dn_1: 0.1106 loss_bbox_dn_1: 0.3596 loss_giou_dn_1: 1.044 loss_class_dn_2: 0.1079 loss_bbox_dn_2: 0.3476 loss_giou_dn_2: 0.9677 loss_class_dn_3: 0.1058 loss_bbox_dn_3: 0.3396 loss_giou_dn_3: 0.9435 loss_class_dn_4: 0.1029 loss_bbox_dn_4: 0.3395 loss_giou_dn_4: 0.9249 time: 1.7026 data_time: 0.0932 lr: 8.75e-05 max_mem: 37159M [03/22 07:40:24] d2.utils.events INFO: eta: 1 day, 17:47:55 iter: 399 total_loss: 16.29 loss_class: 0.2355 loss_bbox: 0.177 loss_giou: 0.6234 loss_class_0: 0.2216 loss_bbox_0: 0.2031 loss_giou_0: 0.6819 loss_class_1: 0.2228 loss_bbox_1: 0.1857 loss_giou_1: 0.6364 loss_class_2: 0.2233 loss_bbox_2: 0.1886 loss_giou_2: 0.6347 loss_class_3: 0.2271 loss_bbox_3: 0.1868 loss_giou_3: 0.6358 loss_class_4: 0.2296 loss_bbox_4: 0.1828 loss_giou_4: 0.6357 loss_class_enc: 0.239 loss_bbox_enc: 0.2636 loss_giou_enc: 0.8254 loss_class_dn: 0.1 loss_bbox_dn: 0.3232 loss_giou_dn: 0.9 loss_class_dn_0: 0.1175 loss_bbox_dn_0: 0.3729 loss_giou_dn_0: 1.164 loss_class_dn_1: 0.1084 loss_bbox_dn_1: 0.3442 loss_giou_dn_1: 1.026 loss_class_dn_2: 0.1055 loss_bbox_dn_2: 0.3269 loss_giou_dn_2: 0.9683 loss_class_dn_3: 0.1038 loss_bbox_dn_3: 0.3219 loss_giou_dn_3: 0.9279 loss_class_dn_4: 0.101 loss_bbox_dn_4: 0.3219 loss_giou_dn_4: 0.9109 time: 1.7073 data_time: 0.1019 lr: 8.75e-05 max_mem: 37159M [03/22 07:41:48] d2.utils.events INFO: eta: 1 day, 17:44:40 iter: 449 total_loss: 16.31 loss_class: 0.2263 loss_bbox: 0.1936 loss_giou: 0.651 loss_class_0: 0.2211 loss_bbox_0: 0.2145 loss_giou_0: 0.6992 loss_class_1: 0.2107 loss_bbox_1: 0.2061 loss_giou_1: 0.6563 loss_class_2: 0.2155 loss_bbox_2: 0.199 loss_giou_2: 0.6534 loss_class_3: 0.2218 loss_bbox_3: 0.195 loss_giou_3: 0.6562 loss_class_4: 0.2203 loss_bbox_4: 0.1952 loss_giou_4: 0.6547 loss_class_enc: 0.2515 loss_bbox_enc: 0.2588 loss_giou_enc: 0.7976 loss_class_dn: 0.09798 loss_bbox_dn: 0.332 loss_giou_dn: 0.9192 loss_class_dn_0: 0.1167 loss_bbox_dn_0: 0.395 loss_giou_dn_0: 1.151 loss_class_dn_1: 0.1053 loss_bbox_dn_1: 0.3494 loss_giou_dn_1: 1.004 loss_class_dn_2: 0.1027 loss_bbox_dn_2: 0.3358 loss_giou_dn_2: 0.9589 loss_class_dn_3: 0.1001 loss_bbox_dn_3: 0.3316 loss_giou_dn_3: 0.9349 loss_class_dn_4: 0.09929 loss_bbox_dn_4: 0.3313 loss_giou_dn_4: 0.9221 time: 1.7044 data_time: 0.0924 lr: 8.75e-05 max_mem: 37159M [03/22 07:43:13] d2.utils.events INFO: eta: 1 day, 17:43:51 iter: 499 total_loss: 15.27 loss_class: 0.2088 loss_bbox: 0.1789 loss_giou: 0.6283 loss_class_0: 0.2153 loss_bbox_0: 0.1952 
loss_giou_0: 0.6631 loss_class_1: 0.2032 loss_bbox_1: 0.1875 loss_giou_1: 0.6414 loss_class_2: 0.2045 loss_bbox_2: 0.185 loss_giou_2: 0.631 loss_class_3: 0.2056 loss_bbox_3: 0.1813 loss_giou_3: 0.6371 loss_class_4: 0.2054 loss_bbox_4: 0.1798 loss_giou_4: 0.6328 loss_class_enc: 0.2172 loss_bbox_enc: 0.2334 loss_giou_enc: 0.7511 loss_class_dn: 0.09424 loss_bbox_dn: 0.285 loss_giou_dn: 0.8428 loss_class_dn_0: 0.1153 loss_bbox_dn_0: 0.3642 loss_giou_dn_0: 1.088 loss_class_dn_1: 0.1037 loss_bbox_dn_1: 0.31 loss_giou_dn_1: 0.9406 loss_class_dn_2: 0.09935 loss_bbox_dn_2: 0.2922 loss_giou_dn_2: 0.8666 loss_class_dn_3: 0.09732 loss_bbox_dn_3: 0.287 loss_giou_dn_3: 0.855 loss_class_dn_4: 0.09511 loss_bbox_dn_4: 0.2853 loss_giou_dn_4: 0.8405 time: 1.7025 data_time: 0.0915 lr: 8.75e-05 max_mem: 37159M [03/22 07:44:36] d2.utils.events INFO: eta: 1 day, 17:40:21 iter: 549 total_loss: 15.09 loss_class: 0.2083 loss_bbox: 0.1743 loss_giou: 0.5986 loss_class_0: 0.2084 loss_bbox_0: 0.1963 loss_giou_0: 0.6581 loss_class_1: 0.1935 loss_bbox_1: 0.1881 loss_giou_1: 0.6306 loss_class_2: 0.1997 loss_bbox_2: 0.1833 loss_giou_2: 0.6145 loss_class_3: 0.2078 loss_bbox_3: 0.1782 loss_giou_3: 0.6081 loss_class_4: 0.2042 loss_bbox_4: 0.1761 loss_giou_4: 0.6053 loss_class_enc: 0.2243 loss_bbox_enc: 0.2293 loss_giou_enc: 0.7258 loss_class_dn: 0.0904 loss_bbox_dn: 0.2692 loss_giou_dn: 0.8349 loss_class_dn_0: 0.1094 loss_bbox_dn_0: 0.3313 loss_giou_dn_0: 1.063 loss_class_dn_1: 0.09727 loss_bbox_dn_1: 0.2846 loss_giou_dn_1: 0.9137 loss_class_dn_2: 0.09544 loss_bbox_dn_2: 0.2699 loss_giou_dn_2: 0.8543 loss_class_dn_3: 0.0941 loss_bbox_dn_3: 0.27 loss_giou_dn_3: 0.8416 loss_class_dn_4: 0.08944 loss_bbox_dn_4: 0.2694 loss_giou_dn_4: 0.8381 time: 1.6997 data_time: 0.0858 lr: 8.75e-05 max_mem: 37159M [03/22 07:46:00] d2.utils.events INFO: eta: 1 day, 17:40:36 iter: 599 total_loss: 14.41 loss_class: 0.1939 loss_bbox: 0.1744 loss_giou: 0.5715 loss_class_0: 0.2018 loss_bbox_0: 0.1975 loss_giou_0: 0.6351 loss_class_1: 0.1884 loss_bbox_1: 0.1908 loss_giou_1: 0.5959 loss_class_2: 0.1881 loss_bbox_2: 0.1818 loss_giou_2: 0.578 loss_class_3: 0.1889 loss_bbox_3: 0.1789 loss_giou_3: 0.5771 loss_class_4: 0.1903 loss_bbox_4: 0.1747 loss_giou_4: 0.5745 loss_class_enc: 0.2054 loss_bbox_enc: 0.2374 loss_giou_enc: 0.7283 loss_class_dn: 0.08706 loss_bbox_dn: 0.2579 loss_giou_dn: 0.7784 loss_class_dn_0: 0.1073 loss_bbox_dn_0: 0.3485 loss_giou_dn_0: 1.038 loss_class_dn_1: 0.09717 loss_bbox_dn_1: 0.2931 loss_giou_dn_1: 0.8613 loss_class_dn_2: 0.09119 loss_bbox_dn_2: 0.2663 loss_giou_dn_2: 0.803 loss_class_dn_3: 0.09011 loss_bbox_dn_3: 0.2604 loss_giou_dn_3: 0.7915 loss_class_dn_4: 0.08663 loss_bbox_dn_4: 0.2589 loss_giou_dn_4: 0.7787 time: 1.6981 data_time: 0.0873 lr: 8.75e-05 max_mem: 37159M [03/22 07:47:26] d2.utils.events INFO: eta: 1 day, 17:39:39 iter: 649 total_loss: 14.06 loss_class: 0.1791 loss_bbox: 0.165 loss_giou: 0.5926 loss_class_0: 0.1958 loss_bbox_0: 0.1931 loss_giou_0: 0.6107 loss_class_1: 0.1816 loss_bbox_1: 0.1753 loss_giou_1: 0.5968 loss_class_2: 0.1768 loss_bbox_2: 0.1712 loss_giou_2: 0.5984 loss_class_3: 0.1732 loss_bbox_3: 0.1699 loss_giou_3: 0.5914 loss_class_4: 0.174 loss_bbox_4: 0.166 loss_giou_4: 0.591 loss_class_enc: 0.1976 loss_bbox_enc: 0.2233 loss_giou_enc: 0.6857 loss_class_dn: 0.08716 loss_bbox_dn: 0.2407 loss_giou_dn: 0.7813 loss_class_dn_0: 0.1062 loss_bbox_dn_0: 0.3173 loss_giou_dn_0: 1.01 loss_class_dn_1: 0.09636 loss_bbox_dn_1: 0.2627 loss_giou_dn_1: 0.8658 loss_class_dn_2: 0.09024 loss_bbox_dn_2: 0.2493 
loss_giou_dn_2: 0.8134 loss_class_dn_3: 0.08893 loss_bbox_dn_3: 0.243 loss_giou_dn_3: 0.7939 loss_class_dn_4: 0.08621 loss_bbox_dn_4: 0.2411 loss_giou_dn_4: 0.7839 time: 1.6988 data_time: 0.0980 lr: 8.75e-05 max_mem: 37159M [03/22 07:48:51] d2.utils.events INFO: eta: 1 day, 17:39:03 iter: 699 total_loss: 13.65 loss_class: 0.1844 loss_bbox: 0.1619 loss_giou: 0.5666 loss_class_0: 0.2049 loss_bbox_0: 0.1807 loss_giou_0: 0.6043 loss_class_1: 0.1836 loss_bbox_1: 0.1746 loss_giou_1: 0.5891 loss_class_2: 0.1802 loss_bbox_2: 0.1684 loss_giou_2: 0.5799 loss_class_3: 0.186 loss_bbox_3: 0.1649 loss_giou_3: 0.5722 loss_class_4: 0.1849 loss_bbox_4: 0.1651 loss_giou_4: 0.5691 loss_class_enc: 0.2118 loss_bbox_enc: 0.205 loss_giou_enc: 0.6834 loss_class_dn: 0.08508 loss_bbox_dn: 0.2334 loss_giou_dn: 0.7289 loss_class_dn_0: 0.1062 loss_bbox_dn_0: 0.3238 loss_giou_dn_0: 1.005 loss_class_dn_1: 0.09393 loss_bbox_dn_1: 0.2563 loss_giou_dn_1: 0.8328 loss_class_dn_2: 0.08955 loss_bbox_dn_2: 0.2363 loss_giou_dn_2: 0.7638 loss_class_dn_3: 0.08667 loss_bbox_dn_3: 0.2336 loss_giou_dn_3: 0.7387 loss_class_dn_4: 0.08452 loss_bbox_dn_4: 0.2335 loss_giou_dn_4: 0.7293 time: 1.6996 data_time: 0.0999 lr: 8.75e-05 max_mem: 37159M [03/22 07:50:14] d2.utils.events INFO: eta: 1 day, 17:35:28 iter: 749 total_loss: 13.74 loss_class: 0.1715 loss_bbox: 0.1581 loss_giou: 0.5686 loss_class_0: 0.1882 loss_bbox_0: 0.1815 loss_giou_0: 0.599 loss_class_1: 0.1721 loss_bbox_1: 0.1734 loss_giou_1: 0.5833 loss_class_2: 0.1704 loss_bbox_2: 0.167 loss_giou_2: 0.5773 loss_class_3: 0.1682 loss_bbox_3: 0.1647 loss_giou_3: 0.5727 loss_class_4: 0.1707 loss_bbox_4: 0.1601 loss_giou_4: 0.5726 loss_class_enc: 0.1871 loss_bbox_enc: 0.209 loss_giou_enc: 0.6552 loss_class_dn: 0.0864 loss_bbox_dn: 0.2146 loss_giou_dn: 0.7268 loss_class_dn_0: 0.1048 loss_bbox_dn_0: 0.3089 loss_giou_dn_0: 0.9937 loss_class_dn_1: 0.09183 loss_bbox_dn_1: 0.2446 loss_giou_dn_1: 0.8195 loss_class_dn_2: 0.08799 loss_bbox_dn_2: 0.2246 loss_giou_dn_2: 0.7588 loss_class_dn_3: 0.0857 loss_bbox_dn_3: 0.2189 loss_giou_dn_3: 0.7428 loss_class_dn_4: 0.08561 loss_bbox_dn_4: 0.2159 loss_giou_dn_4: 0.7323 time: 1.6965 data_time: 0.0838 lr: 8.75e-05 max_mem: 37159M [03/22 07:51:37] d2.utils.events INFO: eta: 1 day, 17:33:21 iter: 799 total_loss: 13.37 loss_class: 0.1823 loss_bbox: 0.1706 loss_giou: 0.5212 loss_class_0: 0.1929 loss_bbox_0: 0.1881 loss_giou_0: 0.5768 loss_class_1: 0.1742 loss_bbox_1: 0.1822 loss_giou_1: 0.5651 loss_class_2: 0.1759 loss_bbox_2: 0.1785 loss_giou_2: 0.5439 loss_class_3: 0.1785 loss_bbox_3: 0.1738 loss_giou_3: 0.5295 loss_class_4: 0.1829 loss_bbox_4: 0.1703 loss_giou_4: 0.5226 loss_class_enc: 0.1867 loss_bbox_enc: 0.2209 loss_giou_enc: 0.6704 loss_class_dn: 0.08164 loss_bbox_dn: 0.2567 loss_giou_dn: 0.7026 loss_class_dn_0: 0.1006 loss_bbox_dn_0: 0.3323 loss_giou_dn_0: 0.9754 loss_class_dn_1: 0.09063 loss_bbox_dn_1: 0.287 loss_giou_dn_1: 0.7996 loss_class_dn_2: 0.08392 loss_bbox_dn_2: 0.2648 loss_giou_dn_2: 0.7375 loss_class_dn_3: 0.08276 loss_bbox_dn_3: 0.261 loss_giou_dn_3: 0.7123 loss_class_dn_4: 0.0821 loss_bbox_dn_4: 0.2561 loss_giou_dn_4: 0.7031 time: 1.6943 data_time: 0.1045 lr: 8.75e-05 max_mem: 37159M [03/22 07:53:02] d2.utils.events INFO: eta: 1 day, 17:32:05 iter: 849 total_loss: 14.17 loss_class: 0.1836 loss_bbox: 0.1744 loss_giou: 0.5947 loss_class_0: 0.189 loss_bbox_0: 0.2004 loss_giou_0: 0.631 loss_class_1: 0.1786 loss_bbox_1: 0.1891 loss_giou_1: 0.6068 loss_class_2: 0.1783 loss_bbox_2: 0.1843 loss_giou_2: 0.6057 loss_class_3: 0.1755 loss_bbox_3: 
0.1762 loss_giou_3: 0.6009 loss_class_4: 0.1804 loss_bbox_4: 0.1744 loss_giou_4: 0.6 loss_class_enc: 0.1889 loss_bbox_enc: 0.2254 loss_giou_enc: 0.7029 loss_class_dn: 0.08326 loss_bbox_dn: 0.2515 loss_giou_dn: 0.7566 loss_class_dn_0: 0.1003 loss_bbox_dn_0: 0.3221 loss_giou_dn_0: 0.9782 loss_class_dn_1: 0.08816 loss_bbox_dn_1: 0.2725 loss_giou_dn_1: 0.8384 loss_class_dn_2: 0.08444 loss_bbox_dn_2: 0.2568 loss_giou_dn_2: 0.7841 loss_class_dn_3: 0.08365 loss_bbox_dn_3: 0.2506 loss_giou_dn_3: 0.7723 loss_class_dn_4: 0.08294 loss_bbox_dn_4: 0.2518 loss_giou_dn_4: 0.7627 time: 1.6951 data_time: 0.1414 lr: 8.75e-05 max_mem: 37159M [03/22 07:54:28] d2.utils.events INFO: eta: 1 day, 17:30:53 iter: 899 total_loss: 13.26 loss_class: 0.195 loss_bbox: 0.1639 loss_giou: 0.5699 loss_class_0: 0.1879 loss_bbox_0: 0.192 loss_giou_0: 0.6118 loss_class_1: 0.1786 loss_bbox_1: 0.1755 loss_giou_1: 0.5916 loss_class_2: 0.1828 loss_bbox_2: 0.169 loss_giou_2: 0.5647 loss_class_3: 0.1853 loss_bbox_3: 0.1657 loss_giou_3: 0.5747 loss_class_4: 0.193 loss_bbox_4: 0.1624 loss_giou_4: 0.5667 loss_class_enc: 0.1792 loss_bbox_enc: 0.2326 loss_giou_enc: 0.7056 loss_class_dn: 0.08688 loss_bbox_dn: 0.2263 loss_giou_dn: 0.7036 loss_class_dn_0: 0.09826 loss_bbox_dn_0: 0.3083 loss_giou_dn_0: 0.9291 loss_class_dn_1: 0.08986 loss_bbox_dn_1: 0.2533 loss_giou_dn_1: 0.7898 loss_class_dn_2: 0.08652 loss_bbox_dn_2: 0.2359 loss_giou_dn_2: 0.7249 loss_class_dn_3: 0.08605 loss_bbox_dn_3: 0.2316 loss_giou_dn_3: 0.712 loss_class_dn_4: 0.08619 loss_bbox_dn_4: 0.2273 loss_giou_dn_4: 0.702 time: 1.6964 data_time: 0.0998 lr: 8.75e-05 max_mem: 37159M [03/22 07:55:52] d2.utils.events INFO: eta: 1 day, 17:28:50 iter: 949 total_loss: 12.56 loss_class: 0.1637 loss_bbox: 0.1553 loss_giou: 0.5348 loss_class_0: 0.1753 loss_bbox_0: 0.1723 loss_giou_0: 0.5823 loss_class_1: 0.165 loss_bbox_1: 0.1598 loss_giou_1: 0.5635 loss_class_2: 0.1593 loss_bbox_2: 0.1563 loss_giou_2: 0.548 loss_class_3: 0.1586 loss_bbox_3: 0.1567 loss_giou_3: 0.5415 loss_class_4: 0.1609 loss_bbox_4: 0.1565 loss_giou_4: 0.5376 loss_class_enc: 0.1735 loss_bbox_enc: 0.1945 loss_giou_enc: 0.647 loss_class_dn: 0.08132 loss_bbox_dn: 0.2171 loss_giou_dn: 0.6849 loss_class_dn_0: 0.09881 loss_bbox_dn_0: 0.294 loss_giou_dn_0: 0.9085 loss_class_dn_1: 0.08741 loss_bbox_dn_1: 0.2385 loss_giou_dn_1: 0.7364 loss_class_dn_2: 0.08319 loss_bbox_dn_2: 0.2225 loss_giou_dn_2: 0.6934 loss_class_dn_3: 0.07955 loss_bbox_dn_3: 0.2198 loss_giou_dn_3: 0.688 loss_class_dn_4: 0.08259 loss_bbox_dn_4: 0.2181 loss_giou_dn_4: 0.6818 time: 1.6954 data_time: 0.1067 lr: 8.75e-05 max_mem: 37159M [03/22 07:57:17] d2.utils.events INFO: eta: 1 day, 17:26:39 iter: 999 total_loss: 12.09 loss_class: 0.1542 loss_bbox: 0.144 loss_giou: 0.51 loss_class_0: 0.1666 loss_bbox_0: 0.1581 loss_giou_0: 0.5555 loss_class_1: 0.1522 loss_bbox_1: 0.151 loss_giou_1: 0.5447 loss_class_2: 0.1514 loss_bbox_2: 0.1483 loss_giou_2: 0.5263 loss_class_3: 0.1487 loss_bbox_3: 0.1449 loss_giou_3: 0.5121 loss_class_4: 0.1488 loss_bbox_4: 0.1442 loss_giou_4: 0.5071 loss_class_enc: 0.1688 loss_bbox_enc: 0.1874 loss_giou_enc: 0.6249 loss_class_dn: 0.07747 loss_bbox_dn: 0.1898 loss_giou_dn: 0.6263 loss_class_dn_0: 0.09446 loss_bbox_dn_0: 0.2813 loss_giou_dn_0: 0.9063 loss_class_dn_1: 0.08266 loss_bbox_dn_1: 0.2169 loss_giou_dn_1: 0.7151 loss_class_dn_2: 0.08029 loss_bbox_dn_2: 0.1973 loss_giou_dn_2: 0.6559 loss_class_dn_3: 0.07757 loss_bbox_dn_3: 0.1937 loss_giou_dn_3: 0.6353 loss_class_dn_4: 0.07731 loss_bbox_dn_4: 0.1906 loss_giou_dn_4: 0.6269 time: 1.6957 
data_time: 0.0802 lr: 8.75e-05 max_mem: 37159M [03/22 07:58:43] d2.utils.events INFO: eta: 1 day, 17:23:56 iter: 1049 total_loss: 12.18 loss_class: 0.1692 loss_bbox: 0.1381 loss_giou: 0.5041 loss_class_0: 0.1696 loss_bbox_0: 0.1539 loss_giou_0: 0.5423 loss_class_1: 0.1613 loss_bbox_1: 0.1517 loss_giou_1: 0.5336 loss_class_2: 0.1603 loss_bbox_2: 0.1417 loss_giou_2: 0.5228 loss_class_3: 0.1625 loss_bbox_3: 0.1414 loss_giou_3: 0.5095 loss_class_4: 0.1647 loss_bbox_4: 0.1382 loss_giou_4: 0.5061 loss_class_enc: 0.1675 loss_bbox_enc: 0.1872 loss_giou_enc: 0.6255 loss_class_dn: 0.0781 loss_bbox_dn: 0.201 loss_giou_dn: 0.664 loss_class_dn_0: 0.0947 loss_bbox_dn_0: 0.2766 loss_giou_dn_0: 0.8954 loss_class_dn_1: 0.08068 loss_bbox_dn_1: 0.2231 loss_giou_dn_1: 0.732 loss_class_dn_2: 0.07935 loss_bbox_dn_2: 0.2064 loss_giou_dn_2: 0.6852 loss_class_dn_3: 0.07754 loss_bbox_dn_3: 0.2028 loss_giou_dn_3: 0.6738 loss_class_dn_4: 0.07707 loss_bbox_dn_4: 0.2019 loss_giou_dn_4: 0.6656 time: 1.6968 data_time: 0.1012 lr: 8.75e-05 max_mem: 37159M [03/22 08:00:07] d2.utils.events INFO: eta: 1 day, 17:20:23 iter: 1099 total_loss: 12.99 loss_class: 0.1694 loss_bbox: 0.1552 loss_giou: 0.542 loss_class_0: 0.1768 loss_bbox_0: 0.1726 loss_giou_0: 0.5838 loss_class_1: 0.1611 loss_bbox_1: 0.1621 loss_giou_1: 0.5683 loss_class_2: 0.1657 loss_bbox_2: 0.1566 loss_giou_2: 0.5479 loss_class_3: 0.1648 loss_bbox_3: 0.1559 loss_giou_3: 0.5476 loss_class_4: 0.1682 loss_bbox_4: 0.156 loss_giou_4: 0.5439 loss_class_enc: 0.1755 loss_bbox_enc: 0.2063 loss_giou_enc: 0.6826 loss_class_dn: 0.08244 loss_bbox_dn: 0.2201 loss_giou_dn: 0.6866 loss_class_dn_0: 0.09977 loss_bbox_dn_0: 0.281 loss_giou_dn_0: 0.9103 loss_class_dn_1: 0.08814 loss_bbox_dn_1: 0.2342 loss_giou_dn_1: 0.7511 loss_class_dn_2: 0.08325 loss_bbox_dn_2: 0.2246 loss_giou_dn_2: 0.7055 loss_class_dn_3: 0.0821 loss_bbox_dn_3: 0.2211 loss_giou_dn_3: 0.695 loss_class_dn_4: 0.08193 loss_bbox_dn_4: 0.2206 loss_giou_dn_4: 0.6866 time: 1.6961 data_time: 0.0784 lr: 8.75e-05 max_mem: 37159M [03/22 08:01:32] d2.utils.events INFO: eta: 1 day, 17:17:43 iter: 1149 total_loss: 11.79 loss_class: 0.1478 loss_bbox: 0.1404 loss_giou: 0.5275 loss_class_0: 0.1653 loss_bbox_0: 0.1617 loss_giou_0: 0.5805 loss_class_1: 0.1471 loss_bbox_1: 0.1497 loss_giou_1: 0.5496 loss_class_2: 0.1467 loss_bbox_2: 0.143 loss_giou_2: 0.5316 loss_class_3: 0.1439 loss_bbox_3: 0.1447 loss_giou_3: 0.5328 loss_class_4: 0.1472 loss_bbox_4: 0.1402 loss_giou_4: 0.5256 loss_class_enc: 0.1619 loss_bbox_enc: 0.1974 loss_giou_enc: 0.6362 loss_class_dn: 0.07781 loss_bbox_dn: 0.2026 loss_giou_dn: 0.6383 loss_class_dn_0: 0.09928 loss_bbox_dn_0: 0.2749 loss_giou_dn_0: 0.8635 loss_class_dn_1: 0.0833 loss_bbox_dn_1: 0.2231 loss_giou_dn_1: 0.6933 loss_class_dn_2: 0.08035 loss_bbox_dn_2: 0.2106 loss_giou_dn_2: 0.6552 loss_class_dn_3: 0.07842 loss_bbox_dn_3: 0.2059 loss_giou_dn_3: 0.6429 loss_class_dn_4: 0.0784 loss_bbox_dn_4: 0.2032 loss_giou_dn_4: 0.6396 time: 1.6956 data_time: 0.0796 lr: 8.75e-05 max_mem: 37159M [03/22 08:02:57] d2.utils.events INFO: eta: 1 day, 17:15:23 iter: 1199 total_loss: 11.94 loss_class: 0.1579 loss_bbox: 0.1386 loss_giou: 0.5359 loss_class_0: 0.1683 loss_bbox_0: 0.1563 loss_giou_0: 0.5764 loss_class_1: 0.1529 loss_bbox_1: 0.1488 loss_giou_1: 0.545 loss_class_2: 0.1546 loss_bbox_2: 0.144 loss_giou_2: 0.5376 loss_class_3: 0.1545 loss_bbox_3: 0.1412 loss_giou_3: 0.5409 loss_class_4: 0.1542 loss_bbox_4: 0.1401 loss_giou_4: 0.5351 loss_class_enc: 0.1639 loss_bbox_enc: 0.1873 loss_giou_enc: 0.637 loss_class_dn: 
0.07472 loss_bbox_dn: 0.2049 loss_giou_dn: 0.6674 loss_class_dn_0: 0.09451 loss_bbox_dn_0: 0.2663 loss_giou_dn_0: 0.8928 loss_class_dn_1: 0.08027 loss_bbox_dn_1: 0.2211 loss_giou_dn_1: 0.7274 loss_class_dn_2: 0.07827 loss_bbox_dn_2: 0.2084 loss_giou_dn_2: 0.6863 loss_class_dn_3: 0.07435 loss_bbox_dn_3: 0.2054 loss_giou_dn_3: 0.6766 loss_class_dn_4: 0.07392 loss_bbox_dn_4: 0.2047 loss_giou_dn_4: 0.6691 time: 1.6965 data_time: 0.0895 lr: 8.75e-05 max_mem: 37159M [03/22 08:04:22] d2.utils.events INFO: eta: 1 day, 17:13:06 iter: 1249 total_loss: 12.49 loss_class: 0.1547 loss_bbox: 0.1495 loss_giou: 0.5385 loss_class_0: 0.1651 loss_bbox_0: 0.161 loss_giou_0: 0.5887 loss_class_1: 0.1583 loss_bbox_1: 0.1592 loss_giou_1: 0.5693 loss_class_2: 0.1589 loss_bbox_2: 0.1557 loss_giou_2: 0.5504 loss_class_3: 0.1546 loss_bbox_3: 0.1547 loss_giou_3: 0.5459 loss_class_4: 0.1558 loss_bbox_4: 0.1525 loss_giou_4: 0.5407 loss_class_enc: 0.159 loss_bbox_enc: 0.1999 loss_giou_enc: 0.6605 loss_class_dn: 0.07681 loss_bbox_dn: 0.2059 loss_giou_dn: 0.6953 loss_class_dn_0: 0.0924 loss_bbox_dn_0: 0.2639 loss_giou_dn_0: 0.9066 loss_class_dn_1: 0.08262 loss_bbox_dn_1: 0.2183 loss_giou_dn_1: 0.7623 loss_class_dn_2: 0.07817 loss_bbox_dn_2: 0.2098 loss_giou_dn_2: 0.7176 loss_class_dn_3: 0.07777 loss_bbox_dn_3: 0.2078 loss_giou_dn_3: 0.704 loss_class_dn_4: 0.07676 loss_bbox_dn_4: 0.2062 loss_giou_dn_4: 0.6944 time: 1.6959 data_time: 0.0946 lr: 8.75e-05 max_mem: 37231M [03/22 08:05:47] d2.utils.events INFO: eta: 1 day, 17:12:36 iter: 1299 total_loss: 11.57 loss_class: 0.151 loss_bbox: 0.1392 loss_giou: 0.4934 loss_class_0: 0.159 loss_bbox_0: 0.1593 loss_giou_0: 0.553 loss_class_1: 0.1499 loss_bbox_1: 0.15 loss_giou_1: 0.5234 loss_class_2: 0.1464 loss_bbox_2: 0.1463 loss_giou_2: 0.5119 loss_class_3: 0.1481 loss_bbox_3: 0.1442 loss_giou_3: 0.5118 loss_class_4: 0.1507 loss_bbox_4: 0.1403 loss_giou_4: 0.4994 loss_class_enc: 0.1634 loss_bbox_enc: 0.1957 loss_giou_enc: 0.613 loss_class_dn: 0.075 loss_bbox_dn: 0.2064 loss_giou_dn: 0.6271 loss_class_dn_0: 0.09269 loss_bbox_dn_0: 0.2514 loss_giou_dn_0: 0.8512 loss_class_dn_1: 0.08065 loss_bbox_dn_1: 0.2203 loss_giou_dn_1: 0.6998 loss_class_dn_2: 0.07598 loss_bbox_dn_2: 0.2114 loss_giou_dn_2: 0.6474 loss_class_dn_3: 0.07463 loss_bbox_dn_3: 0.2077 loss_giou_dn_3: 0.6355 loss_class_dn_4: 0.07414 loss_bbox_dn_4: 0.2067 loss_giou_dn_4: 0.6281 time: 1.6962 data_time: 0.0859 lr: 8.75e-05 max_mem: 37231M [03/22 08:07:12] d2.utils.events INFO: eta: 1 day, 17:11:34 iter: 1349 total_loss: 12.33 loss_class: 0.1466 loss_bbox: 0.1473 loss_giou: 0.503 loss_class_0: 0.1629 loss_bbox_0: 0.1595 loss_giou_0: 0.5554 loss_class_1: 0.1482 loss_bbox_1: 0.1578 loss_giou_1: 0.5266 loss_class_2: 0.1479 loss_bbox_2: 0.155 loss_giou_2: 0.5224 loss_class_3: 0.1461 loss_bbox_3: 0.1507 loss_giou_3: 0.5095 loss_class_4: 0.1477 loss_bbox_4: 0.1473 loss_giou_4: 0.5087 loss_class_enc: 0.1568 loss_bbox_enc: 0.2008 loss_giou_enc: 0.6514 loss_class_dn: 0.07703 loss_bbox_dn: 0.2034 loss_giou_dn: 0.6469 loss_class_dn_0: 0.09315 loss_bbox_dn_0: 0.2687 loss_giou_dn_0: 0.8816 loss_class_dn_1: 0.08248 loss_bbox_dn_1: 0.2203 loss_giou_dn_1: 0.7219 loss_class_dn_2: 0.08016 loss_bbox_dn_2: 0.2099 loss_giou_dn_2: 0.6748 loss_class_dn_3: 0.07731 loss_bbox_dn_3: 0.206 loss_giou_dn_3: 0.6599 loss_class_dn_4: 0.07658 loss_bbox_dn_4: 0.2042 loss_giou_dn_4: 0.6504 time: 1.6969 data_time: 0.0881 lr: 8.75e-05 max_mem: 37231M [03/22 08:08:38] d2.utils.events INFO: eta: 1 day, 17:11:04 iter: 1399 total_loss: 11.55 loss_class: 0.1602 
loss_bbox: 0.1432 loss_giou: 0.4626 loss_class_0: 0.1594 loss_bbox_0: 0.1692 loss_giou_0: 0.5184 loss_class_1: 0.1526 loss_bbox_1: 0.1607 loss_giou_1: 0.4916 loss_class_2: 0.152 loss_bbox_2: 0.1517 loss_giou_2: 0.4711 loss_class_3: 0.1561 loss_bbox_3: 0.1495 loss_giou_3: 0.4661 loss_class_4: 0.1578 loss_bbox_4: 0.1452 loss_giou_4: 0.4616 loss_class_enc: 0.1608 loss_bbox_enc: 0.196 loss_giou_enc: 0.6032 loss_class_dn: 0.07718 loss_bbox_dn: 0.2039 loss_giou_dn: 0.6076 loss_class_dn_0: 0.08962 loss_bbox_dn_0: 0.266 loss_giou_dn_0: 0.8226 loss_class_dn_1: 0.07749 loss_bbox_dn_1: 0.2255 loss_giou_dn_1: 0.6809 loss_class_dn_2: 0.07645 loss_bbox_dn_2: 0.2091 loss_giou_dn_2: 0.6347 loss_class_dn_3: 0.07424 loss_bbox_dn_3: 0.2051 loss_giou_dn_3: 0.6125 loss_class_dn_4: 0.07407 loss_bbox_dn_4: 0.2041 loss_giou_dn_4: 0.6101 time: 1.6977 data_time: 0.0915 lr: 8.75e-05 max_mem: 37231M [03/22 08:10:03] d2.utils.events INFO: eta: 1 day, 17:10:11 iter: 1449 total_loss: 10.9 loss_class: 0.1418 loss_bbox: 0.1314 loss_giou: 0.462 loss_class_0: 0.1515 loss_bbox_0: 0.1515 loss_giou_0: 0.5158 loss_class_1: 0.1394 loss_bbox_1: 0.1448 loss_giou_1: 0.4923 loss_class_2: 0.138 loss_bbox_2: 0.1346 loss_giou_2: 0.4728 loss_class_3: 0.1367 loss_bbox_3: 0.1329 loss_giou_3: 0.4679 loss_class_4: 0.1388 loss_bbox_4: 0.1323 loss_giou_4: 0.4668 loss_class_enc: 0.1557 loss_bbox_enc: 0.1784 loss_giou_enc: 0.5894 loss_class_dn: 0.06912 loss_bbox_dn: 0.175 loss_giou_dn: 0.5736 loss_class_dn_0: 0.08398 loss_bbox_dn_0: 0.2439 loss_giou_dn_0: 0.802 loss_class_dn_1: 0.07102 loss_bbox_dn_1: 0.1915 loss_giou_dn_1: 0.6337 loss_class_dn_2: 0.06881 loss_bbox_dn_2: 0.1801 loss_giou_dn_2: 0.5972 loss_class_dn_3: 0.06929 loss_bbox_dn_3: 0.1772 loss_giou_dn_3: 0.5819 loss_class_dn_4: 0.06879 loss_bbox_dn_4: 0.1754 loss_giou_dn_4: 0.5758 time: 1.6976 data_time: 0.0873 lr: 8.75e-05 max_mem: 37231M [03/22 08:11:27] d2.utils.events INFO: eta: 1 day, 17:05:53 iter: 1499 total_loss: 10.86 loss_class: 0.1457 loss_bbox: 0.1242 loss_giou: 0.4622 loss_class_0: 0.1538 loss_bbox_0: 0.1396 loss_giou_0: 0.4953 loss_class_1: 0.139 loss_bbox_1: 0.1326 loss_giou_1: 0.4795 loss_class_2: 0.1385 loss_bbox_2: 0.1276 loss_giou_2: 0.4813 loss_class_3: 0.1369 loss_bbox_3: 0.1246 loss_giou_3: 0.4704 loss_class_4: 0.1427 loss_bbox_4: 0.1231 loss_giou_4: 0.4631 loss_class_enc: 0.1502 loss_bbox_enc: 0.1657 loss_giou_enc: 0.5851 loss_class_dn: 0.07162 loss_bbox_dn: 0.1792 loss_giou_dn: 0.5779 loss_class_dn_0: 0.08523 loss_bbox_dn_0: 0.2413 loss_giou_dn_0: 0.807 loss_class_dn_1: 0.07456 loss_bbox_dn_1: 0.1873 loss_giou_dn_1: 0.6452 loss_class_dn_2: 0.07139 loss_bbox_dn_2: 0.1781 loss_giou_dn_2: 0.5966 loss_class_dn_3: 0.07137 loss_bbox_dn_3: 0.1782 loss_giou_dn_3: 0.5822 loss_class_dn_4: 0.07086 loss_bbox_dn_4: 0.1783 loss_giou_dn_4: 0.5793 time: 1.6966 data_time: 0.1289 lr: 8.75e-05 max_mem: 37231M [03/22 08:12:52] d2.utils.events INFO: eta: 1 day, 17:07:49 iter: 1549 total_loss: 11.32 loss_class: 0.1406 loss_bbox: 0.1275 loss_giou: 0.4916 loss_class_0: 0.1473 loss_bbox_0: 0.1474 loss_giou_0: 0.5348 loss_class_1: 0.137 loss_bbox_1: 0.1365 loss_giou_1: 0.5148 loss_class_2: 0.1365 loss_bbox_2: 0.1306 loss_giou_2: 0.4984 loss_class_3: 0.1366 loss_bbox_3: 0.1295 loss_giou_3: 0.4984 loss_class_4: 0.135 loss_bbox_4: 0.1278 loss_giou_4: 0.4906 loss_class_enc: 0.1452 loss_bbox_enc: 0.1738 loss_giou_enc: 0.5899 loss_class_dn: 0.0688 loss_bbox_dn: 0.1755 loss_giou_dn: 0.6077 loss_class_dn_0: 0.08525 loss_bbox_dn_0: 0.247 loss_giou_dn_0: 0.8162 loss_class_dn_1: 0.07323 
loss_bbox_dn_1: 0.1916 loss_giou_dn_1: 0.6669 loss_class_dn_2: 0.06991 loss_bbox_dn_2: 0.1824 loss_giou_dn_2: 0.6279 loss_class_dn_3: 0.06738 loss_bbox_dn_3: 0.1789 loss_giou_dn_3: 0.6169 loss_class_dn_4: 0.06743 loss_bbox_dn_4: 0.1759 loss_giou_dn_4: 0.6096 time: 1.6968 data_time: 0.1027 lr: 8.75e-05 max_mem: 37231M [03/22 08:14:18] d2.utils.events INFO: eta: 1 day, 17:06:26 iter: 1599 total_loss: 11.09 loss_class: 0.1415 loss_bbox: 0.1307 loss_giou: 0.4606 loss_class_0: 0.1435 loss_bbox_0: 0.146 loss_giou_0: 0.5006 loss_class_1: 0.1344 loss_bbox_1: 0.142 loss_giou_1: 0.4731 loss_class_2: 0.1332 loss_bbox_2: 0.1373 loss_giou_2: 0.4632 loss_class_3: 0.136 loss_bbox_3: 0.1324 loss_giou_3: 0.4605 loss_class_4: 0.1386 loss_bbox_4: 0.1318 loss_giou_4: 0.4586 loss_class_enc: 0.1483 loss_bbox_enc: 0.1649 loss_giou_enc: 0.5796 loss_class_dn: 0.07154 loss_bbox_dn: 0.182 loss_giou_dn: 0.5868 loss_class_dn_0: 0.09042 loss_bbox_dn_0: 0.24 loss_giou_dn_0: 0.8159 loss_class_dn_1: 0.0752 loss_bbox_dn_1: 0.1958 loss_giou_dn_1: 0.6645 loss_class_dn_2: 0.07324 loss_bbox_dn_2: 0.1886 loss_giou_dn_2: 0.6191 loss_class_dn_3: 0.07183 loss_bbox_dn_3: 0.1827 loss_giou_dn_3: 0.5945 loss_class_dn_4: 0.07116 loss_bbox_dn_4: 0.1819 loss_giou_dn_4: 0.5886 time: 1.6975 data_time: 0.0993 lr: 8.75e-05 max_mem: 37231M [03/22 08:15:44] d2.utils.events INFO: eta: 1 day, 17:04:28 iter: 1649 total_loss: 10.66 loss_class: 0.1405 loss_bbox: 0.1271 loss_giou: 0.4609 loss_class_0: 0.1521 loss_bbox_0: 0.1425 loss_giou_0: 0.5033 loss_class_1: 0.1409 loss_bbox_1: 0.132 loss_giou_1: 0.4801 loss_class_2: 0.1398 loss_bbox_2: 0.1304 loss_giou_2: 0.4727 loss_class_3: 0.1353 loss_bbox_3: 0.1315 loss_giou_3: 0.4537 loss_class_4: 0.1358 loss_bbox_4: 0.1278 loss_giou_4: 0.4525 loss_class_enc: 0.1439 loss_bbox_enc: 0.1813 loss_giou_enc: 0.6004 loss_class_dn: 0.06718 loss_bbox_dn: 0.1707 loss_giou_dn: 0.5771 loss_class_dn_0: 0.0846 loss_bbox_dn_0: 0.2364 loss_giou_dn_0: 0.7837 loss_class_dn_1: 0.07291 loss_bbox_dn_1: 0.1843 loss_giou_dn_1: 0.6231 loss_class_dn_2: 0.07064 loss_bbox_dn_2: 0.1742 loss_giou_dn_2: 0.596 loss_class_dn_3: 0.06621 loss_bbox_dn_3: 0.1719 loss_giou_dn_3: 0.5832 loss_class_dn_4: 0.06592 loss_bbox_dn_4: 0.1711 loss_giou_dn_4: 0.5856 time: 1.6981 data_time: 0.0894 lr: 8.75e-05 max_mem: 37231M [03/22 08:17:07] d2.utils.events INFO: eta: 1 day, 17:01:49 iter: 1699 total_loss: 10.47 loss_class: 0.1324 loss_bbox: 0.1162 loss_giou: 0.4514 loss_class_0: 0.1389 loss_bbox_0: 0.134 loss_giou_0: 0.5216 loss_class_1: 0.1264 loss_bbox_1: 0.1235 loss_giou_1: 0.4942 loss_class_2: 0.1264 loss_bbox_2: 0.1205 loss_giou_2: 0.4714 loss_class_3: 0.1275 loss_bbox_3: 0.1181 loss_giou_3: 0.4553 loss_class_4: 0.1308 loss_bbox_4: 0.1176 loss_giou_4: 0.4559 loss_class_enc: 0.137 loss_bbox_enc: 0.1753 loss_giou_enc: 0.6108 loss_class_dn: 0.07162 loss_bbox_dn: 0.1666 loss_giou_dn: 0.5837 loss_class_dn_0: 0.08588 loss_bbox_dn_0: 0.2239 loss_giou_dn_0: 0.7905 loss_class_dn_1: 0.07493 loss_bbox_dn_1: 0.1789 loss_giou_dn_1: 0.6184 loss_class_dn_2: 0.07033 loss_bbox_dn_2: 0.1681 loss_giou_dn_2: 0.5879 loss_class_dn_3: 0.07015 loss_bbox_dn_3: 0.1679 loss_giou_dn_3: 0.5859 loss_class_dn_4: 0.07162 loss_bbox_dn_4: 0.1668 loss_giou_dn_4: 0.5841 time: 1.6975 data_time: 0.0834 lr: 8.75e-05 max_mem: 37231M [03/22 08:18:32] d2.utils.events INFO: eta: 1 day, 17:00:45 iter: 1749 total_loss: 10.69 loss_class: 0.1604 loss_bbox: 0.113 loss_giou: 0.4269 loss_class_0: 0.1496 loss_bbox_0: 0.1385 loss_giou_0: 0.4928 loss_class_1: 0.1495 loss_bbox_1: 0.1258 loss_giou_1: 
0.46 loss_class_2: 0.1498 loss_bbox_2: 0.1208 loss_giou_2: 0.443 loss_class_3: 0.1576 loss_bbox_3: 0.1172 loss_giou_3: 0.4347 loss_class_4: 0.1597 loss_bbox_4: 0.1122 loss_giou_4: 0.4198 loss_class_enc: 0.1479 loss_bbox_enc: 0.1834 loss_giou_enc: 0.6324 loss_class_dn: 0.07317 loss_bbox_dn: 0.1723 loss_giou_dn: 0.5653 loss_class_dn_0: 0.08305 loss_bbox_dn_0: 0.2344 loss_giou_dn_0: 0.7828 loss_class_dn_1: 0.07378 loss_bbox_dn_1: 0.1818 loss_giou_dn_1: 0.6282 loss_class_dn_2: 0.0719 loss_bbox_dn_2: 0.174 loss_giou_dn_2: 0.5831 loss_class_dn_3: 0.07162 loss_bbox_dn_3: 0.1714 loss_giou_dn_3: 0.5731 loss_class_dn_4: 0.07229 loss_bbox_dn_4: 0.1717 loss_giou_dn_4: 0.5649 time: 1.6976 data_time: 0.0813 lr: 8.75e-05 max_mem: 37231M
[03/22 08:20:00] d2.utils.events INFO: eta: 1 day, 17:01:39 iter: 1799 total_loss: 11.11 loss_class: 0.1443 loss_bbox: 0.1223 loss_giou: 0.4518 loss_class_0: 0.1507 loss_bbox_0: 0.145 loss_giou_0: 0.5009 loss_class_1: 0.1505 loss_bbox_1: 0.1303 loss_giou_1: 0.4726 loss_class_2: 0.144 loss_bbox_2: 0.127 loss_giou_2: 0.4583 loss_class_3: 0.1405 loss_bbox_3: 0.1261 loss_giou_3: 0.4561 loss_class_4: 0.1384 loss_bbox_4: 0.1243 loss_giou_4: 0.4533 loss_class_enc: 0.144 loss_bbox_enc: 0.178 loss_giou_enc: 0.5839 loss_class_dn: 0.07304 loss_bbox_dn: 0.1702 loss_giou_dn: 0.6079 loss_class_dn_0: 0.08546 loss_bbox_dn_0: 0.2411 loss_giou_dn_0: 0.8054 loss_class_dn_1: 0.07644 loss_bbox_dn_1: 0.1882 loss_giou_dn_1: 0.6505 loss_class_dn_2: 0.07327 loss_bbox_dn_2: 0.1737 loss_giou_dn_2: 0.6141 loss_class_dn_3: 0.07276 loss_bbox_dn_3: 0.172 loss_giou_dn_3: 0.6108 loss_class_dn_4: 0.07187 loss_bbox_dn_4: 0.1699 loss_giou_dn_4: 0.6066 time: 1.6990 data_time: 0.1577 lr: 8.75e-05 max_mem: 37231M
[03/22 08:21:25] d2.utils.events INFO: eta: 1 day, 16:59:42 iter: 1849 total_loss: 10.51 loss_class: 0.1226 loss_bbox: 0.1197 loss_giou: 0.4523 loss_class_0: 0.1313 loss_bbox_0: 0.1348 loss_giou_0: 0.4956 loss_class_1: 0.1247 loss_bbox_1: 0.1248 loss_giou_1: 0.47 loss_class_2: 0.1188 loss_bbox_2: 0.1222 loss_giou_2: 0.4543 loss_class_3: 0.1184 loss_bbox_3: 0.1216 loss_giou_3: 0.4552 loss_class_4: 0.1218 loss_bbox_4: 0.1191 loss_giou_4: 0.4509 loss_class_enc: 0.1282 loss_bbox_enc: 0.1594 loss_giou_enc: 0.5732 loss_class_dn: 0.06765 loss_bbox_dn: 0.159 loss_giou_dn: 0.5663 loss_class_dn_0: 0.08217 loss_bbox_dn_0: 0.2211 loss_giou_dn_0: 0.7629 loss_class_dn_1: 0.07152 loss_bbox_dn_1: 0.1766 loss_giou_dn_1: 0.6082 loss_class_dn_2: 0.06748 loss_bbox_dn_2: 0.1639 loss_giou_dn_2: 0.5758 loss_class_dn_3: 0.06634 loss_bbox_dn_3: 0.1611 loss_giou_dn_3: 0.5643 loss_class_dn_4: 0.0659 loss_bbox_dn_4: 0.1582 loss_giou_dn_4: 0.5618 time: 1.6992 data_time: 0.0958 lr: 8.75e-05 max_mem: 37231M
[03/22 08:22:50] d2.utils.events INFO: eta: 1 day, 16:56:51 iter: 1899 total_loss: 10.43 loss_class: 0.1397 loss_bbox: 0.1248 loss_giou: 0.4304 loss_class_0: 0.1442 loss_bbox_0: 0.1459 loss_giou_0: 0.4851 loss_class_1: 0.1391 loss_bbox_1: 0.1339 loss_giou_1: 0.4651 loss_class_2: 0.1415 loss_bbox_2: 0.1281 loss_giou_2: 0.4532 loss_class_3: 0.137 loss_bbox_3: 0.1283 loss_giou_3: 0.4452 loss_class_4: 0.1395 loss_bbox_4: 0.1265 loss_giou_4: 0.4357 loss_class_enc: 0.1436 loss_bbox_enc: 0.1772 loss_giou_enc: 0.5704 loss_class_dn: 0.07317 loss_bbox_dn: 0.174 loss_giou_dn: 0.5593 loss_class_dn_0: 0.08799 loss_bbox_dn_0: 0.2437 loss_giou_dn_0: 0.7942 loss_class_dn_1: 0.07689 loss_bbox_dn_1: 0.1917 loss_giou_dn_1: 0.6259 loss_class_dn_2: 0.07038 loss_bbox_dn_2: 0.1791 loss_giou_dn_2: 0.5812 loss_class_dn_3: 0.06998 loss_bbox_dn_3: 0.1765 loss_giou_dn_3: 0.5621 loss_class_dn_4: 0.07011 loss_bbox_dn_4: 0.1745 loss_giou_dn_4: 0.5607 time: 1.6989 data_time: 0.0877 lr: 8.75e-05 max_mem: 37231M
[03/22 08:24:16] d2.utils.events INFO: eta: 1 day, 17:00:04 iter: 1949 total_loss: 9.643 loss_class: 0.1323 loss_bbox: 0.1145 loss_giou: 0.3965 loss_class_0: 0.1362 loss_bbox_0: 0.1432 loss_giou_0: 0.4812 loss_class_1: 0.1277 loss_bbox_1: 0.1288 loss_giou_1: 0.4482 loss_class_2: 0.1273 loss_bbox_2: 0.1211 loss_giou_2: 0.4171 loss_class_3: 0.1286 loss_bbox_3: 0.1182 loss_giou_3: 0.4144 loss_class_4: 0.1332 loss_bbox_4: 0.1143 loss_giou_4: 0.401 loss_class_enc: 0.132 loss_bbox_enc: 0.1791 loss_giou_enc: 0.5527 loss_class_dn: 0.06615 loss_bbox_dn: 0.1606 loss_giou_dn: 0.5112 loss_class_dn_0: 0.0805 loss_bbox_dn_0: 0.2121 loss_giou_dn_0: 0.7337 loss_class_dn_1: 0.06833 loss_bbox_dn_1: 0.173 loss_giou_dn_1: 0.5823 loss_class_dn_2: 0.06538 loss_bbox_dn_2: 0.165 loss_giou_dn_2: 0.5313 loss_class_dn_3: 0.06437 loss_bbox_dn_3: 0.1613 loss_giou_dn_3: 0.5253 loss_class_dn_4: 0.06525 loss_bbox_dn_4: 0.1607 loss_giou_dn_4: 0.5183 time: 1.6998 data_time: 0.1020 lr: 8.75e-05 max_mem: 37231M
[03/22 08:25:40] fvcore.common.checkpoint INFO: Saving checkpoint to ./output/dino_r50_4scale_12ep/model_0001999.pth
[03/22 08:25:41] detectron2 INFO: Run evaluation without EMA.
[03/22 08:25:41] d2.data.datasets.coco WARNING: Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.
[03/22 08:25:41] d2.data.datasets.coco INFO: Loaded 693 images in COCO format from datasets/coab/annotations/test.json
[03/22 08:25:41] d2.data.build INFO: Distribution of instances among all 1 categories:
|  category  | #instances   |
|:----------:|:-------------|
|   object   | 14776        |
[03/22 08:25:41] d2.data.common INFO: Serializing 693 elements to byte tensors and concatenating them all ...
[03/22 08:25:41] d2.data.common INFO: Serialized dataset takes 4.52 MiB
[03/22 08:25:41] d2.evaluation.evaluator INFO: Start inference on 693 batches
[03/22 08:25:42] d2.evaluation.evaluator INFO: Inference done 11/693. Dataloading: 0.0191 s/iter. Inference: 0.0637 s/iter. Eval: 0.0004 s/iter. Total: 0.0832 s/iter. ETA=0:00:56
[03/22 08:25:47] d2.evaluation.evaluator INFO: Inference done 69/693. Dataloading: 0.0247 s/iter. Inference: 0.0614 s/iter. Eval: 0.0004 s/iter. Total: 0.0865 s/iter. ETA=0:00:54
[03/22 08:25:52] d2.evaluation.evaluator INFO: Inference done 125/693. Dataloading: 0.0249 s/iter. Inference: 0.0608 s/iter. Eval: 0.0021 s/iter. Total: 0.0878 s/iter. ETA=0:00:49
[03/22 08:25:57] d2.evaluation.evaluator INFO: Inference done 183/693. Dataloading: 0.0251 s/iter. Inference: 0.0608 s/iter. Eval: 0.0015 s/iter. Total: 0.0875 s/iter. ETA=0:00:44
[03/22 08:26:02] d2.evaluation.evaluator INFO: Inference done 242/693. Dataloading: 0.0250 s/iter. Inference: 0.0607 s/iter. Eval: 0.0012 s/iter. Total: 0.0870 s/iter. ETA=0:00:39
[03/22 08:26:07] d2.evaluation.evaluator INFO: Inference done 301/693. Dataloading: 0.0249 s/iter. Inference: 0.0606 s/iter. Eval: 0.0011 s/iter. Total: 0.0866 s/iter. ETA=0:00:33
[03/22 08:26:12] d2.evaluation.evaluator INFO: Inference done 360/693. Dataloading: 0.0248 s/iter. Inference: 0.0606 s/iter. Eval: 0.0009 s/iter. Total: 0.0865 s/iter. ETA=0:00:28
[03/22 08:26:17] d2.evaluation.evaluator INFO: Inference done 414/693. Dataloading: 0.0252 s/iter. Inference: 0.0607 s/iter. Eval: 0.0014 s/iter. Total: 0.0874 s/iter. ETA=0:00:24
[03/22 08:26:22] d2.evaluation.evaluator INFO: Inference done 473/693. Dataloading: 0.0252 s/iter. Inference: 0.0606 s/iter. Eval: 0.0013 s/iter. Total: 0.0871 s/iter. ETA=0:00:19
[03/22 08:26:27] d2.evaluation.evaluator INFO: Inference done 530/693. Dataloading: 0.0254 s/iter. Inference: 0.0605 s/iter. Eval: 0.0012 s/iter. Total: 0.0872 s/iter. ETA=0:00:14
[03/22 08:26:32] d2.evaluation.evaluator INFO: Inference done 588/693. Dataloading: 0.0255 s/iter. Inference: 0.0605 s/iter. Eval: 0.0011 s/iter. Total: 0.0872 s/iter. ETA=0:00:09
[03/22 08:26:37] d2.evaluation.evaluator INFO: Inference done 648/693. Dataloading: 0.0254 s/iter. Inference: 0.0604 s/iter. Eval: 0.0010 s/iter. Total: 0.0869 s/iter. ETA=0:00:03
[03/22 08:26:41] d2.evaluation.evaluator INFO: Total inference time: 0:00:59.807496 (0.086929 s / iter per device, on 1 devices)
[03/22 08:26:41] d2.evaluation.evaluator INFO: Total inference pure compute time: 0:00:41 (0.060475 s / iter per device, on 1 devices)
[03/22 08:26:42] d2.evaluation.coco_evaluation INFO: Preparing results for COCO format ...
[03/22 08:26:42] d2.evaluation.coco_evaluation INFO: Saving results to ./output/dino_r50_4scale_12ep/coco_instances_results.json
[03/22 08:26:43] d2.evaluation.coco_evaluation INFO: Evaluating predictions with unofficial COCO API...
[03/22 08:26:44] d2.evaluation.fast_eval_api INFO: Evaluate annotation type *bbox*
[03/22 08:26:44] d2.evaluation.fast_eval_api INFO: COCOeval_opt.evaluate() finished in 0.72 seconds.
[03/22 08:26:44] d2.evaluation.fast_eval_api INFO: Accumulating evaluation results...
[03/22 08:26:44] d2.evaluation.fast_eval_api INFO: COCOeval_opt.accumulate() finished in 0.09 seconds.
[03/22 08:26:44] d2.evaluation.coco_evaluation INFO: Evaluation results for bbox:
|   AP   |  AP50  |  AP75  |  APs   |  APm   |  APl   |
|:------:|:------:|:------:|:------:|:------:|:------:|
| 56.953 | 81.160 | 65.181 | 30.727 | 59.015 | 61.911 |
[03/22 08:26:45] d2.evaluation.testing INFO: copypaste: Task: bbox
[03/22 08:26:45] d2.evaluation.testing INFO: copypaste: AP,AP50,AP75,APs,APm,APl
[03/22 08:26:45] d2.evaluation.testing INFO: copypaste: 56.9526,81.1597,65.1808,30.7273,59.0145,61.9109
[03/22 08:26:45] d2.utils.events INFO: eta: 1 day, 16:58:40 iter: 1999 total_loss: 10.29 loss_class: 0.1214 loss_bbox: 0.123 loss_giou: 0.4363 loss_class_0: 0.1368 loss_bbox_0: 0.1324 loss_giou_0: 0.4611 loss_class_1: 0.125 loss_bbox_1: 0.1214 loss_giou_1: 0.4421 loss_class_2: 0.122 loss_bbox_2: 0.1208 loss_giou_2: 0.4419 loss_class_3: 0.1206 loss_bbox_3: 0.1226 loss_giou_3: 0.4368 loss_class_4: 0.1202 loss_bbox_4: 0.1211 loss_giou_4: 0.4368 loss_class_enc: 0.129 loss_bbox_enc: 0.1544 loss_giou_enc: 0.537 loss_class_dn: 0.06884 loss_bbox_dn: 0.1719 loss_giou_dn: 0.551 loss_class_dn_0: 0.08121 loss_bbox_dn_0: 0.2271 loss_giou_dn_0: 0.7532 loss_class_dn_1: 0.072 loss_bbox_dn_1: 0.1797 loss_giou_dn_1: 0.604 loss_class_dn_2: 0.06933 loss_bbox_dn_2: 0.1738 loss_giou_dn_2: 0.5664 loss_class_dn_3: 0.06883 loss_bbox_dn_3: 0.1731 loss_giou_dn_3: 0.5586 loss_class_dn_4: 0.06712 loss_bbox_dn_4: 0.1723 loss_giou_dn_4: 0.553 time: 1.6991 data_time: 0.0783 lr: 8.75e-05 max_mem: 37231M
[03/22 08:28:09] d2.utils.events INFO: eta: 1 day, 16:56:03 iter: 2049 total_loss: 9.54 loss_class: 0.1217 loss_bbox: 0.1232 loss_giou: 0.4149 loss_class_0: 0.1248 loss_bbox_0: 0.1417 loss_giou_0: 0.4668 loss_class_1: 0.1198 loss_bbox_1: 0.1337 loss_giou_1: 0.4492 loss_class_2: 0.1215 loss_bbox_2: 0.1259 loss_giou_2: 0.4305 loss_class_3: 0.1179 loss_bbox_3: 0.1253 loss_giou_3: 0.424 loss_class_4: 0.1199 loss_bbox_4: 0.1225 loss_giou_4: 0.4172 loss_class_enc: 0.1242 loss_bbox_enc: 0.1659 loss_giou_enc: 0.5383 loss_class_dn: 0.06349 loss_bbox_dn: 0.163 loss_giou_dn: 0.52 loss_class_dn_0: 0.07829 loss_bbox_dn_0: 0.2271 loss_giou_dn_0: 0.7151 loss_class_dn_1: 0.0686 loss_bbox_dn_1: 0.1765 loss_giou_dn_1: 0.5608 loss_class_dn_2: 0.06576 loss_bbox_dn_2: 0.1674 loss_giou_dn_2: 0.5289 loss_class_dn_3: 0.06457 loss_bbox_dn_3: 0.165 loss_giou_dn_3: 0.5256 loss_class_dn_4: 0.06355 loss_bbox_dn_4: 0.1632 loss_giou_dn_4: 0.5227 time: 1.6988 data_time: 0.0890 lr: 8.75e-05 max_mem: 37231M
[03/22 08:29:07] d2.engine.hooks INFO: Overall training speed: 2081 iterations in 0:58:57 (1.6998 s / it)
[03/22 08:29:07] d2.engine.hooks INFO: Total training time: 1:00:02 (0:01:05 on hooks)
[03/22 08:29:07] d2.utils.events INFO: eta: 1 day, 16:57:54 iter: 2083 total_loss: 9.97 loss_class: 0.1355 loss_bbox: 0.1168 loss_giou: 0.4266 loss_class_0: 0.1365 loss_bbox_0: 0.1381 loss_giou_0: 0.4751 loss_class_1: 0.1273 loss_bbox_1: 0.1245 loss_giou_1: 0.4544 loss_class_2: 0.1323 loss_bbox_2: 0.1175 loss_giou_2: 0.4391 loss_class_3: 0.1337 loss_bbox_3: 0.1183 loss_giou_3: 0.4344 loss_class_4: 0.1358 loss_bbox_4: 0.1178 loss_giou_4: 0.4259 loss_class_enc: 0.1334 loss_bbox_enc: 0.1673 loss_giou_enc: 0.5563 loss_class_dn: 0.06892 loss_bbox_dn: 0.1693 loss_giou_dn: 0.5293 loss_class_dn_0: 0.07752 loss_bbox_dn_0: 0.2257 loss_giou_dn_0: 0.7245 loss_class_dn_1: 0.0679 loss_bbox_dn_1: 0.1816 loss_giou_dn_1: 0.5855 loss_class_dn_2: 0.06638 loss_bbox_dn_2: 0.1719 loss_giou_dn_2: 0.545 loss_class_dn_3: 0.06689 loss_bbox_dn_3: 0.17 loss_giou_dn_3: 0.5352 loss_class_dn_4: 0.0678 loss_bbox_dn_4: 0.1695 loss_giou_dn_4: 0.5295 time: 1.6991 data_time: 0.0769 lr: 8.75e-05 max_mem: 37231M
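
Note on tracking this run from the raw log: the "iter: ... total_loss: ..." entries from d2.utils.events and the numeric "copypaste:" rows from d2.evaluation.testing (e.g. the evaluation at iter 1999 above, 56.95 bbox AP) are the easiest lines to scrape if TensorBoard is not used. Below is a minimal parsing sketch; the log path is an assumption (detectron2 normally mirrors console output to <output_dir>/log.txt), and the regexes only cover the two line formats that appear in this log.

import re

# Assumed location; point this at wherever the log above was saved.
LOG_PATH = "./output/dino_r50_4scale_12ep/log.txt"

# Matches d2.utils.events lines, e.g. "iter: 1999 total_loss: 10.29 ... lr: 8.75e-05"
LOSS_RE = re.compile(r"iter: (\d+)\s+total_loss: ([\d.]+).*\blr: ([\d.e+-]+)")
# Matches the numeric copypaste row, e.g. "copypaste: 56.9526,81.1597,65.1808,30.7273,59.0145,61.9109"
AP_RE = re.compile(r"copypaste: ([\d.]+(?:,[\d.]+){5})\s*$")

losses = []   # (iteration, total_loss, lr)
ap_rows = []  # [AP, AP50, AP75, APs, APm, APl] per evaluation

with open(LOG_PATH) as f:
    for line in f:
        m = LOSS_RE.search(line)
        if m:
            losses.append((int(m.group(1)), float(m.group(2)), float(m.group(3))))
            continue
        m = AP_RE.search(line)
        if m:
            ap_rows.append([float(v) for v in m.group(1).split(",")])

print(f"{len(losses)} loss entries, {len(ap_rows)} evaluations parsed")
if losses:
    it, loss, lr = losses[-1]
    print(f"latest: iter={it} total_loss={loss} lr={lr}")
if ap_rows:
    print(f"latest bbox AP / AP50: {ap_rows[-1][0]} / {ap_rows[-1][1]}")

The tuples in losses and ap_rows can then be plotted or dumped to CSV to compare checkpoints (here, model_0001999.pth) against their evaluation results.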