About the DB-Swin-B model: I tried to reimplement it and got 55.2 mAP #45

Open · JR-Wang opened this issue Sep 29, 2021 · 13 comments

JR-Wang commented Sep 29, 2021

I re-trained the DB-Swin-B model on the COCO dataset. The config file I used is configs/cbnet/htc_cbv2_swin_base_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_20e_coco.py, and the pretrained model I used was the Swin-Base model trained on ImageNet-22k with 384 input size. But the mAP I finally got is 55.2, lower than the result you provide in your README.md (58.4).
What's wrong with my reimplementation?

chuxiaojie (Collaborator) commented

We use the checkpoint named swin_base_patch4_window7_224_22k.pth from https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_base_patch4_window7_224_22k.pth
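For anyone hitting the same gap, here is a minimal sketch of pointing the config at that checkpoint. This is an assumption about the usual mmdetection-style setup this repo follows, not a repo-confirmed recipe; the local path is illustrative.

```python
# Hypothetical config fragment layered on top of the released config.
_base_ = './htc_cbv2_swin_base_patch4_window7_mstrain_400-1400_giou_4conv1f_adamw_20e_coco.py'

model = dict(
    # ImageNet-22k Swin-Base weights for the backbone; adjust the path
    # to wherever you saved the downloaded checkpoint.
    pretrained='checkpoints/swin_base_patch4_window7_224_22k.pth',
)
```

Equivalently, assuming mmdetection 2.x tooling, the path can be overridden at launch with `--cfg-options model.pretrained=checkpoints/swin_base_patch4_window7_224_22k.pth`.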

JR-Wang (Author) commented Sep 30, 2021

OK, I will try this pretrained model. Many thanks.

cailk commented Oct 11, 2021

> OK, I will try this pretrained model. Many thanks.

Hi, have you tried the new pretrained model to reproduce the Swin-B results? Does it work?

JR-Wang (Author) commented Oct 13, 2021

> Hi, have you tried the new pretrained model to reproduce the Swin-B results? Does it work?

I don't have enough V100s right now, so I haven't finished my training. I saw your question "Reproduce DB-Swin-Large models #47". Have you tried to train a Swin-L model? I found that when I use a V100 to train the Swin-B model, memory usage is almost 100%, so training Swin-L will obviously OOM.

cailk commented Oct 13, 2021

Not yet. Since I don't have enough GPUs either, I need to make sure which config will reproduce 59.6 AP on COCO val.

BTW, you are now training a DB-Swin-B model, right? Could you let me know the results as soon as you have finished training? Big thanks!

JR-Wang (Author) commented Nov 1, 2021

> We use the checkpoint named swin_base_patch4_window7_224_22k.pth

Hello, I changed the pre-trained model. This time I got 55.8 mAP, only 0.6 higher than last time. May I ask how many GPUs you used when training your model? And should the learning rate scale in proportion to the number of GPUs?
Another question: I found that when I use a V100 to train the Swin-B model, memory usage is almost 100%, so training Swin-L will obviously OOM. How do you train your Swin-L model? I have already enabled fp16 mode to reduce memory usage.
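On the learning-rate question, the usual convention is the linear scaling rule: scale the lr with total batch size. A minimal sketch, under the assumption (not taken from the repo) that the released config's AdamW lr of 1e-4 was tuned for a total batch of 16 (8 GPUs x 2 images):

```python
# Linear scaling rule sketch; all numbers are illustrative assumptions.
base_lr = 1e-4        # lr in the released config (assumed)
reference_batch = 16  # total batch the base lr was tuned for (assumed: 8 GPUs x 2)

num_gpus = 8
samples_per_gpu = 1
actual_batch = num_gpus * samples_per_gpu

# lr for this run, proportional to the actual total batch size
lr = base_lr * actual_batch / reference_batch
print(lr)  # 5e-05 for 8 GPUs x 1 image

optimizer = dict(type='AdamW', lr=lr, weight_decay=0.05)
```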

JR-Wang (Author) commented Nov 1, 2021

> BTW, you are now training a DB-Swin-B model, right? Could you let me know the results as soon as you have finished training?

I finished the training and got 55.8 mAP, only 0.6 higher than last time. I wonder if I should scale down the learning rate, since I don't have enough GPUs.

cailk commented Nov 1, 2021

> I finished the training and got 55.8 mAP, only 0.6 higher than last time. I wonder if I should scale down the learning rate, since I don't have enough GPUs.

I have finished training Swin-Large, which reaches 58.8 mAP. That is pretty close to the result reported in the paper (59.1 mAP). I used 8 V100 GPUs with batchsize=1. I also trained a CBNet-Swin-Base; however, by my own mistake, I loaded the wrong Swin-Base pre-trained checkpoint. I guess the CBNet-Swin-Base results are also reproducible if you keep every hyper-parameter unchanged.

JR-Wang (Author) commented Nov 1, 2021

> I have finished training Swin-Large, which reaches 58.8 mAP. I used 8 V100 GPUs with batchsize=1.

Did you hit OOM when training Swin-L? I also tried to train Swin-L with one sample per GPU and got OOM on a V100. Could you share your config file for Swin-L? Many thanks.

cailk commented Nov 1, 2021

> Did you hit OOM when training Swin-L? Could you share your config file for Swin-L?

I just use the original Swin-Large config file in this repo (swin-large) and only set use_checkpoint=True. I use an 8*32G V100 machine with batchsize=1.
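For context, use_checkpoint=True turns on gradient checkpointing in the Swin backbone: activations are recomputed during the backward pass instead of being stored, trading extra compute for a large memory saving, which is what lets Swin-L fit on a 32 GB V100. A minimal sketch of the override, assuming the mmdetection-style config layout this repo follows:

```python
# Hypothetical override on top of the repo's Swin-L config;
# only the fields shown here are changed.
model = dict(
    backbone=dict(
        # Recompute activations in the backward pass instead of caching them.
        use_checkpoint=True,
    ),
)
data = dict(samples_per_gpu=1)  # batchsize=1 per GPU, as described above
```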

JR-Wang (Author) commented Nov 1, 2021

> I just use the original Swin-Large config file in this repo (swin-large) and only set use_checkpoint=True. I use an 8*32G V100 machine with batchsize=1.

I also use 32G V100s. Maybe I should check whether my fp16 setting is right. Thank you for your reply.
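As a checkable reference, in stock mmdetection 2.x mixed-precision training is enabled by a single top-level entry in the config. Whether CBNetV2 inherits this exact mechanism is an assumption; verify against the repo's own configs.

```python
# Enables mmdetection's Fp16OptimizerHook; a static loss scale of 512
# is a commonly used default ('dynamic' also works in later 2.x versions).
fp16 = dict(loss_scale=512.)
```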

foolhard commented

> I re-trained the DB-Swin-B model on the COCO dataset. The pretrained model I used was the Swin-Base model trained on ImageNet-22k with 384 input size. But the mAP I finally got is 55.2.

How do you use a Swin-Base model pretrained on ImageNet-22k in CBNetV2? How do you modify the config file?

cailk commented May 14, 2022

> How do you use a Swin-Base model pretrained on ImageNet-22k in CBNetV2? How do you modify the config file?

Hi, sorry for the late reply. I have checked my implementation, and the pre-training checkpoint should be swin_base_patch4_window7_224_22k. BTW, you can contact me by email for the config file.
