
Training details #14

Open
jjhooon opened this issue Nov 6, 2024 · 0 comments

jjhooon commented Nov 6, 2024

Thank you for sharing this wonderful work! I have some questions about the training details.

  1. In the paper, the entire model is trained on a single A6000 GPU for 40,000 iterations with a batch size of 16. What exactly does "iteration" mean here? Was it expressed as 40,000 iterations with a batch size of 16 because there are approximately 600,000 scenes? Since there are 8 million images across those 600,000 scenes, a batch size of 16 would imply far more than 40,000 iterations per epoch, which makes the meaning of this expression confusing. Also, the number of iterations does not appear in the config file.

  2. In addition, is the model trained for 20 epochs? I ask because the paper only discusses the number of iterations and does not mention epochs.

  3. My other question is about training time. When I run the model on four RTX 4090 GPUs with a batch size of 8, it takes much longer than the time reported in the paper; in my case, one epoch takes almost 3 days.

  4. When running the official code without modifying the training scheme, the model converges very quickly. Starting from the first validation at 500 steps, I obtain the following metric results. Is this normal?
     (screenshot of validation metric results, taken 2024-11-06 4:25 PM)
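To make the arithmetic behind question 1 concrete, here is a quick sanity check. The scene and image counts are the approximate figures quoted above, and "iteration" is assumed to mean one optimizer step on one batch:

```python
# Sanity check for the iteration counts in question 1.
# Counts are approximate figures quoted from the paper above;
# one "iteration" is assumed to be one optimizer step on one batch.

num_scenes = 600_000      # approximate number of scenes
num_images = 8_000_000    # approximate number of images
batch_size = 16

# If one epoch means one pass over scenes:
iters_per_epoch_scenes = num_scenes // batch_size   # 37,500 (close to 40,000)

# If one epoch means one pass over images:
iters_per_epoch_images = num_images // batch_size   # 500,000 (far above 40,000)

print(iters_per_epoch_scenes, iters_per_epoch_images)
```

So 40,000 iterations is roughly one pass over the scenes, but well under a single pass over the images, which is the source of the confusion.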
