performance drop when training yolov3 on voc #1323
Comments
@Qbjun Hi,
@AlexeyAB glad to see your reply.
Yes. During training, the original repository uses letterbox() for resizing, while my repository uses resize(): #232 (comment)
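To make the letterbox() versus resize() difference concrete, here is a minimal sketch of the geometry only, not the actual Darknet code. The function names `letterbox_geometry` and `stretch_geometry` are hypothetical, and the integer rounding is an assumption. Letterboxing scales the image by a single factor so it fits inside the network input and pads the rest; plain resizing stretches each axis independently.

```python
def letterbox_geometry(img_w, img_h, net_w, net_h):
    """Aspect-preserving fit: return (new_w, new_h, pad_x, pad_y).

    Hypothetical helper for illustration; rounding is an assumption.
    """
    # Compare net_w/img_w with net_h/img_h without floating point,
    # then scale by the smaller ratio so the whole image fits.
    if net_w * img_h < net_h * img_w:
        new_w = net_w
        new_h = img_h * net_w // img_w
    else:
        new_h = net_h
        new_w = img_w * net_h // img_h
    # Center the scaled image; the remaining border is padding.
    pad_x = (net_w - new_w) // 2
    pad_y = (net_h - new_h) // 2
    return new_w, new_h, pad_x, pad_y


def stretch_geometry(img_w, img_h, net_w, net_h):
    """Plain resize: fill the whole input, ignoring aspect ratio."""
    return net_w, net_h, 0, 0


# A 500x375 image going into a 416x416 network input.
print(letterbox_geometry(500, 375, 416, 416))  # -> (416, 312, 0, 52)
print(stretch_geometry(500, 375, 416, 416))    # -> (416, 416, 0, 0)
```

With letterboxing the object shapes keep their proportions but part of the input is padding; with stretching the full input is used but objects are distorted. A model trained one way and evaluated the other sees a different distribution of box shapes, which is one plausible source of an mAP gap.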
@AlexeyAB Well, I guess I have to check the code again; I think I made some mistakes when reading it. I'll train on VOC again and post the result later. Thanks again for the reply~
Hi, @AlexeyAB! I am trying to find a thread on multi-scale training, and I am not sure how the original darknet does it. I understand that when you specify the height and width in the cfg file, for example 608x608, the images fed to the network are scaled down but their aspect ratio is retained by using letterbox. Does YOLO v3 just feed on images of different sizes through this process, and is that why it is called multi-scale training? Moreover, will the accuracy suffer if I feed an image that is smaller than the configured height and width, or will the letterbox take care of it? Thank you!
@kenrubiooo Hi, There are 2 features for multi-scale training in Darknet Yolo:
Letterbox takes care of it.
Thank you for the swift reply! I am actually a bit confused about the network resizing when 'random=1'. What sizes does the network get resized to if the height and width are set to one fixed dimension? At first I thought that for 'random=1' I needed to set 3 heights and widths, e.g. height = 1024, 512, 320, but I am not sure that is right. Can you clarify this? Thank you very much!
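On the 'random=1' question, a rough sketch may help. It assumes the scheme I recall from the original darknet training loop: every few iterations the whole network is resized to a randomly chosen square size that is a multiple of 32, here between 320 and 608. The function name `pick_network_size`, the range, and the stride are assumptions for illustration; this fork may use a different range. Under that assumption the cfg keeps a single width and height, and the trainer picks the other sizes itself rather than reading a list from the cfg.

```python
import random


def pick_network_size(rng, min_cells=10, max_cells=19, stride=32):
    """Pick a random square input size that is a multiple of `stride`.

    Hypothetical helper; the defaults (320..608 in steps of 32) are an
    assumption about the original darknet behavior, not this fork's code.
    """
    return rng.randint(min_cells, max_cells) * stride


rng = random.Random(0)
# Simulate the sizes chosen over several resize events during training.
sizes = [pick_network_size(rng) for _ in range(5)]
print(sizes)
```

Each chosen size is then used for both width and height until the next resize, so the same images are seen at several scales over the course of training.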
Hi, Alexey. I used the cfg file yolov3-voc.cfg provided in the repo and trained yolov3 on VOC. I prepared the dataset and the labels following the instructions on pjreddie's official site. After training on your version of Darknet, I got 78% mAP. However, after training with the same cfg on the official darknet, I get about 82% mAP. I wonder what difference causes such a drop in performance.
I did find some differences, such as the strategy for multi-scale training, but changing that didn't bring the performance back either.
I didn't change yolov3-voc.cfg except for uncommenting the training batch and subdivisions lines and changing the batch to 96 (in order to fill my GPUs). By the way, we use our own code to compute mAP.
Thanks