Experimental results #50
Hi @lshssel, Hmmm, batch_size 1 will be more difficult to fine-tune, but let's try. The main idea is that you want the improvement to saturate and then reduce the learning rate. Continuing to train after the improvement has saturated doesn't help at all (U_R just goes down and AP50 doesn't go up), but if you apply the lr_drop too soon (before AP50 starts to saturate), then K_AP50 is 'frozen' too soon and doesn't improve enough. Best,
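As a rough illustration of the lr_drop timing described above, here is a minimal PyTorch sketch of a one-step LR schedule (the repo's actual training loop may differ; the lr_drop value below is purely illustrative):

```python
# Minimal sketch: Deformable-DETR-style training loops typically drop the LR
# once, at epoch `lr_drop`, via StepLR. The point above is to choose lr_drop
# so that the drop lands roughly where K_AP50 starts to saturate.
import torch

model = torch.nn.Linear(10, 2)  # stand-in for the detector
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)

lr_drop = 35  # hypothetical value; tune so the drop hits the AP50 plateau
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=lr_drop, gamma=0.1)

for epoch in range(41):
    # train_one_epoch(model, optimizer, ...)  # placeholder for the real epoch loop
    scheduler.step()  # lr is multiplied by 0.1 once epoch lr_drop is reached
```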
Thanks for the reply, I'll try later.
Hi @lshssel,
One 2080Ti; for all experiments, batch_size=1. Looking forward to your suggestions!
Hi @lshssel, Best,
I evaluated with the t1.3 checkpoint0040 (t1.2 has been removed). For that run, obj_temp = 1.3 and obj_loss_coef = 8e-4 during training.
obj_temp = 1.1: K_AP50 = 57.1914, U_R50 = 19.2453
obj_loss_coef = 4e-4: K_AP50 = 57.9826, U_R50 = 19.2624
So t1.3 is probably the best result that a 2080Ti can show.
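In case it helps others reproduce this, here is a hypothetical sweep sketch over the two hyperparameters mentioned above; the flag names (--obj_temp, --obj_loss_coef) and the script name are assumptions taken from this thread, so check the repo's training script for the real interface:

```python
# Hypothetical sweep over the hyperparameters discussed in this thread.
# The script name and flag names are assumptions; verify against the repo.
import itertools
import subprocess

obj_temps = [1.1, 1.3]
obj_loss_coefs = [4e-4, 8e-4]

for temp, coef in itertools.product(obj_temps, obj_loss_coefs):
    subprocess.run(
        [
            "python", "main_open_world.py",   # assumed entry point
            "--obj_temp", str(temp),
            "--obj_loss_coef", str(coef),
            "--batch_size", "1",              # single 2080Ti setup from this thread
        ],
        check=True,
    )
```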
Hi @lshssel, I want to add this to the readme. Would you mind providing all the hyperparameters you changed? |
Hi,
Hi @WangPingA, When you train a model with a different batch size, your results will vary, because the gradient updates will not be the same. Variations of ±2 seem reasonable. lshssel also ran experiments with a 2080Ti and got the results reported above. If you are interested in applications, then perhaps my recent work, FOMO, will interest you; it is much less compute-heavy to train and has relatively strong open-world performance by leveraging a foundation object detection model. An easy upgrade there is to switch owl-vit to owlv2. Best,
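For the owl-vit → owlv2 swap mentioned above, here is a minimal zero-shot detection sketch using the Hugging Face transformers OWLv2 classes (generic OWLv2 usage, not FOMO's code; the image URL and text queries are just for illustration):

```python
# Minimal OWLv2 zero-shot detection example with Hugging Face transformers.
import requests
import torch
from PIL import Image
from transformers import Owlv2Processor, Owlv2ForObjectDetection

processor = Owlv2Processor.from_pretrained("google/owlv2-base-patch16-ensemble")
model = Owlv2ForObjectDetection.from_pretrained("google/owlv2-base-patch16-ensemble")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"  # illustrative image
image = Image.open(requests.get(url, stream=True).raw)
texts = [["a photo of a cat", "a photo of a dog"]]  # illustrative text queries

inputs = processor(text=texts, images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert raw outputs to boxes/scores/labels in the original image coordinates.
target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
results = processor.post_process_object_detection(
    outputs=outputs, target_sizes=target_sizes, threshold=0.1
)[0]
for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    print(f"{texts[0][label]}: {score:.2f} at {box.tolist()}")
```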
Hi,
Because our group only assigned me a 2080Ti, training took a long time: for MOWODB's task 1, it took 43 hours.
Unfortunately, wandb crashed at the 35th epoch, so its curves also stop there.
However, the program kept running without errors, the file "checkpoint0040.pth" was generated at the end, and training task 2 from this file runs smoothly.
Below are the wandb graphs and hyperparameters. The results are not very good, so I may need to tune the parameters to get as close to the original performance as possible.
K_AP50 is 52.476, U_R50 is 21.042.
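A quick way to sanity-check the checkpoint before launching task 2 (generic PyTorch; the path and stored key names below are illustrative, and the repo's own scripts normally handle resuming via their own arguments):

```python
# Minimal sketch: inspect checkpoint0040.pth before reusing it for task 2.
# The path is illustrative; the stored keys ('model', 'epoch', ...) are typical
# for DETR-style training scripts but may differ in this repo.
import torch

ckpt = torch.load("exps/checkpoint0040.pth", map_location="cpu")
print("stored keys:", list(ckpt.keys()))
if "epoch" in ckpt:
    print("saved at epoch:", ckpt["epoch"])
```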