What is 'Scale loss by nominal batch_size of 64'? #507
Hi, may I ask what the purpose of this 'scale loss by nominal batch_size' is, and where the value '64' comes from? It is from train.py, line 273 and line 274.
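For reference, the line being asked about does something along these lines (a paraphrased sketch of train.py, not the exact source, which may differ between versions):

```python
# Paraphrased sketch: scale the mini-batch loss so gradients are on the
# same scale as darknet's nominal 64-image batch.
loss *= batch_size / 64  # scale loss by nominal batch_size of 64
```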
64 is the nominal batch size of darknet. This way you can use different --batch-size --accumulate combinations to maintain a 64-image batch size even with smaller graphics cards.
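A minimal sketch of how --batch-size and --accumulate can combine to emulate a 64-image batch. This is a generic gradient-accumulation loop, not the repo's actual code; the model, optimizer, data, and mean-reduced toy loss below are placeholders for illustration only:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the real model, optimizer and dataloader,
# just to make the accumulation pattern concrete and runnable.
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
data = [(torch.randn(16, 10), torch.randn(16, 1)) for _ in range(8)]

nominal_batch_size = 64                        # darknet default
batch_size = 16                                # e.g. --batch-size 16
accumulate = nominal_batch_size // batch_size  # e.g. --accumulate 4

optimizer.zero_grad()
for i, (imgs, targets) in enumerate(data):
    loss = nn.functional.mse_loss(model(imgs), targets)  # toy loss, averaged over the mini-batch
    loss *= batch_size / nominal_batch_size              # scale loss by nominal batch_size of 64
    loss.backward()                                      # gradients keep accumulating
    if (i + 1) % accumulate == 0:                        # step once per 64 images seen
        optimizer.step()
        optimizer.zero_grad()
```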
Ahh I see, does that mean I should only use a combination of --batch-size and --accumulate that produces 64? I was going through the CUSTOM TRAINING EXAMPLE and the SINGLE-CLASS TRAINING EXAMPLE, and the --batch-size and --accumulate values you specified there don't multiply to 64.
@maxmx911 you can experiment with different batch sizes too, of course. If you comment out that line, then smaller batch sizes will produce faster training, but with a worse plateau. Larger batch sizes produce slower but less noisy training and tend to plateau at better results. The tutorials do have smaller batch sizes, since they use very tiny datasets, i.e. only 16 or 64 images in the dataset.
@glenn-jocher I'm sorry, I don't really understand; I'm still very new to this. Will it be the same with having
and also
@maxmx911 yes, your two lines are the same as the code is now. Your second line will train slower but will use less GPU memory, allowing you to train on larger images, for example.
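For illustration only (these values are not necessarily the ones quoted above): --batch-size 16 with --accumulate 4 and --batch-size 8 with --accumulate 8 both multiply to an effective 64-image batch; the second combination uses roughly half the GPU memory per forward pass but needs twice as many forward passes per optimizer step.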
Okay, so if I change the batch size to 12 by doing 4 * 3 = 12, as I'm training with a small dataset of 100+ images as well, do I need to change anything in this line?
@maxmx911 just leave it.
@glenn-jocher Is the purpose of dividing it by 64 that the original darknet is configured with a batch size of 64, so if I'm using any batch size other than 64, dividing by 64 makes my result look like it was trained with a batch size of 64? If that isn't so, can you explain a little about why it is divided by 64? Again, I'm sorry for being annoying; I'm new to this and I'm trying to make sense of it.
@maxmx911 it's simply the darknet default. Take it or leave it, it really depends on your own custom situation. Try it out both ways and use what works best.
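One way to see why the division by 64 mimics a 64-image batch, as a numerical sketch. It assumes the unscaled loss is averaged over the images in each mini-batch, which may differ from the repo's exact reduction:

```python
import torch

torch.manual_seed(0)
per_image_loss = torch.rand(64)   # stand-in per-image losses for one nominal 64-image batch

# --batch-size 64 --accumulate 1: one step, loss scaled by 64/64 = 1.
ref = per_image_loss.mean() * (64 / 64)

# --batch-size 16 --accumulate 4: four scaled mini-batch losses summed
# into the gradient before the optimizer steps.
acc = sum(chunk.mean() * (16 / 64) for chunk in per_image_loss.split(16))

print(ref.item(), acc.item())     # equal up to floating-point rounding:
                                  # both are the mean loss over the same 64 images
```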