
Reproducing issue on hybrid task cascade model #8008

Closed

jekim5418 opened this issue May 18, 2022 · 5 comments

@jekim5418

Hi,

Thanks for your contributions to object detection research!

To use the HTC model, I downloaded the mmdetection code and trained HTC with a ResNet-50 backbone.
However, the results were quite different from the paper and from your log:
the AP for small objects (APs) was 17.7, and for large objects (APl) it was 56.9.
I think this difference is significant.

Is there any way to resolve it?

Best,
Jungeun Kim

@jbwang1997
Collaborator

Hello @jekim5418. Could you follow the issue template and provide more information, such as your config and environment?

@jekim5418
Author

Sorry for the inconvenience.

Issue:

  1. What command or script did you run?
    I followed the guidelines you provided. This is the command I ran:
    tools/dist_train.sh ./configs/htc/htc_r50_fpn_20e_coco.py 8 --work-dir=./reproducing/htc_r50_fpn_20e_coco/
  2. Which config did you run?
    I ran htc_r50_fpn_20e_coco.py, located at configs/htc/htc_r50_fpn_20e_coco.py.
  3. Did you make any modifications to the code or config? Do you understand what you modified?
    No, I didn't change anything.
  4. What dataset did you use?
    I used the MS COCO train set for training, and the AP scores I mentioned were measured on the MS COCO validation set.

Environment:
sys.platform: linux
Python: 3.8.11 (default, Aug 3 2021, 15:09:35) [GCC 7.5.0]
CUDA available: True
GPU 0,1,2,3,4,5,6,7: RTX A6000
CUDA_HOME: /usr
NVCC: Cuda compilation tools, release 9.1, V9.1.8
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.8.0+cu111
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  • CuDNN 8.0.5
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.9.0+cu111
OpenCV: 4.5.3
MMCV: 1.5.0
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.1
MMDetection: 2.24.1+c72bc70

I installed PyTorch with this command: pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html

Result:
These are the last lines of the output, which show the validation results.
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.426
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.652
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.460
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.213
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.459
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.623
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.567
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=300 ] = 0.567
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.567
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.381
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.609
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.738

2022-05-03 12:52:38,732 - mmdet - INFO - Exp name: detectors_htc_r50_1x_coco.py
2022-05-03 12:52:38,733 - mmdet - INFO - Epoch(val) [12][625] bbox_mAP: 0.4890, bbox_mAP_50: 0.6770, bbox_mAP_75: 0.5320, bbox_mAP_s: 0.2990, bbox_mAP_m: 0.5300, bbox_mAP_l: 0.6460, bbox_mAP_copypaste: 0.489 0.677 0.532 0.299 0.530 0.646, segm_mAP: 0.4260, segm_mAP_50: 0.6520, segm_mAP_75: 0.4600, segm_mAP_s: 0.2130, segm_mAP_m: 0.4590, segm_mAP_l: 0.6230, segm_mAP_copypaste: 0.426 0.652 0.460 0.213 0.459 0.623


Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


Thanks,
Jungeun Kim

@jbwang1997
Copy link
Collaborator

It seems you tested the epoch-12 checkpoint. The results for htc_r50_fpn_20e_coco.py should be evaluated with the epoch-20 checkpoint.
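
For reference, a minimal sketch of how that evaluation could be run with mmdetection's standard test script (the work-dir path follows the training command above and is an assumption; adjust the GPU count as needed):

  # evaluate the final epoch-20 checkpoint on bbox and mask metrics
  tools/dist_test.sh \
      ./configs/htc/htc_r50_fpn_20e_coco.py \
      ./reproducing/htc_r50_fpn_20e_coco/epoch_20.pth \
      8 \
      --eval bbox segm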

@jekim5418
Author

jekim5418 commented May 19, 2022

Thanks for your comment.

Actually, I also ran DetectoRS, which is based on HTC.
Following your suggestion, I trained the model for 40 epochs and evaluated the epoch-20 checkpoint.
I just got the following results:

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.424
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.650
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.456
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.214
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.455
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.620
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.562
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=300 ] = 0.562
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.562
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.376
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.600
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.734

OrderedDict([('segm_mAP', 0.424), ('segm_mAP_50', 0.65), ('segm_mAP_75', 0.456), ('segm_mAP_s', 0.214), ('segm_mAP_m', 0.455), ('segm_mAP_l', 0.62), ('segm_mAP_copypaste', '0.424 0.650 0.456 0.214 0.455 0.620')])

To get this score, I ran this command: tools/dist_test.sh ./configs/detectors/detectors_htc_r50_1x_coco.py ./reproducing/detectors_htc_r50_1x_coco_max_epochs_40/epoch_20.pth 1 --eval segm --show-dir ./show_test_out/

The APs and APl scores are quite different from the paper and from the scores in mmdetection's log file.

Also, I found that in the DetectoRS config in the original official GitHub repository, the number of epochs is set to 40.
So I am also appending the evaluation results of DetectoRS at epoch 40 (a sketch of how the schedule is typically extended follows the log below).

2022-05-19 12:13:13,040 - mmdet - INFO -
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.417
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.642
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.446
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.204
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.447
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.608
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.553
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=300 ] = 0.553
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.553
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.369
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.590
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.718

2022-05-19 12:13:13,971 - mmdet - INFO - Exp name: detectors_htc_r50_1x_coco.py
2022-05-19 12:13:13,972 - mmdet - INFO - Epoch(val) [40][625] bbox_mAP: 0.4800, bbox_mAP_50: 0.6680, bbox_mAP_75: 0.5180, bbox_mAP_s: 0.2840, bbox_mAP_m: 0.5170, bbox_mAP_l: 0.6390, bbox_mAP_copypaste: 0.480 0.668 0.518 0.284 0.517 0.639, segm_mAP: 0.4170, segm_mAP_50: 0.6420, segm_mAP_75: 0.4460, segm_mAP_s: 0.2040, segm_mAP_m: 0.4470, segm_mAP_l: 0.6080, segm_mAP_copypaste: 0.417 0.642 0.446 0.204 0.447 0.608


Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
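
As a side note, here is a minimal sketch (assuming mmdetection 2.x config conventions; the LR step values are placeholders, not taken from the official config) of how the schedule length is usually extended:

  # in a new config inheriting from the 1x (12-epoch) DetectoRS config
  _base_ = './detectors_htc_r50_1x_coco.py'
  # extend training to 40 epochs
  runner = dict(type='EpochBasedRunner', max_epochs=40)
  # the LR decay steps generally need to move as well (placeholder values)
  lr_config = dict(step=[32, 38])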


@ZwwWayne
Collaborator

The overall AP should be correct. The small, medium, and large AP values differ because the values in the log and in the paper were calculated based on box areas rather than mask areas. This is a legacy issue caused by pycocotools, and it has been fixed in #4898. Therefore, the APs/APm/APl values you measured are correct and match the recommended practice.
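
To make the distinction concrete, here is a minimal sketch (not mmdetection's actual code; file paths are placeholders) of how the legacy box-area binning can be reproduced with pycocotools. COCOeval assigns each ground-truth object to the small/medium/large bins via the annotation's 'area' field, which for masks is the mask area; overwriting it with the box area recovers the legacy numbers:

  from pycocotools.coco import COCO
  from pycocotools.cocoeval import COCOeval

  coco_gt = COCO('annotations/instances_val2017.json')  # ground truth (placeholder path)
  coco_dt = coco_gt.loadRes('segm_results.json')        # detection results (placeholder path)

  # Legacy behavior: bin objects by box area (w * h) instead of mask area.
  for ann in coco_gt.dataset['annotations']:
      w, h = ann['bbox'][2], ann['bbox'][3]
      ann['area'] = w * h
  coco_gt.createIndex()  # rebuild the index after modifying annotations

  eval_segm = COCOeval(coco_gt, coco_dt, iouType='segm')
  eval_segm.evaluate()
  eval_segm.accumulate()
  eval_segm.summarize()  # APs/APm/APl are now binned by box area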

open-mmlab locked and limited conversation to collaborators on Oct 28, 2022
ZwwWayne converted this issue into discussion #9161 on Oct 28, 2022
