Fix issue with newer pytorch versions #6

Open
wants to merge 2 commits into
base: master

jhairgallardo

This fixes the accuracy drop that REMIND shows when run with newer PyTorch versions.

The issue occurs with the 'step_lr_per_class' setting. In newer PyTorch versions, the learning rate is not reset automatically after every class, so the per-class schedulers do not each keep their own learning rate. Instead, each scheduler starts from the most recent learning rate, which makes the learning rate decrease far too quickly. With the proposed changes, the learning rate for each scheduler is reset manually, which fixes the issue.
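To illustrate the mechanism described above, here is a toy model in plain Python with no PyTorch dependency. It is not the code from this pull request: `SharedOptimizer`, `ChainedStepLR`, and `run` are hypothetical names, the values 0.1, 0.5, and 2 are arbitrary, and the assumption that a scheduler scales the optimizer's current learning rate (rather than recomputing it from its own starting value) is a simplification made for illustration.

```python
# Toy model in plain Python (no PyTorch): per-class step schedulers that
# all write to one shared optimizer learning rate. All names are hypothetical.


class SharedOptimizer:
    """Stand-in for an optimizer that holds a single learning rate."""

    def __init__(self, lr):
        self.lr = lr


class ChainedStepLR:
    """Step scheduler that multiplies the optimizer's *current* learning
    rate by gamma on every `step_size`-th call to step()."""

    def __init__(self, optimizer, step_size, gamma):
        self.optimizer = optimizer
        self.step_size = step_size
        self.gamma = gamma
        self.steps = 0
        self.own_lr = optimizer.lr  # rate this scheduler should resume from

    def step(self, reset):
        if reset:
            # Manual reset: restore this scheduler's own rate before decaying.
            self.optimizer.lr = self.own_lr
        self.steps += 1
        if self.steps % self.step_size == 0:
            self.optimizer.lr *= self.gamma
        # Remember the rate this scheduler ended on.
        self.own_lr = self.optimizer.lr
        return self.optimizer.lr


def run(num_classes, steps_per_class, reset):
    """Train one class after another and return the final rate per class."""
    opt = SharedOptimizer(lr=0.1)
    schedulers = [
        ChainedStepLR(opt, step_size=2, gamma=0.5) for _ in range(num_classes)
    ]
    final_lrs = []
    for sched in schedulers:
        for _ in range(steps_per_class):
            sched.step(reset=reset)
        final_lrs.append(opt.lr)
    return final_lrs


print("no reset:  ", run(3, 4, reset=False))
print("with reset:", run(3, 4, reset=True))
```

Without the reset, each class ends on a smaller learning rate than the class before it, because every scheduler keeps decaying the same shared value. With the reset, every class ends on the same learning rate, which is the behavior the per-class setting is meant to provide.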

Results on ImageNet CLS IID:

| Code | PyTorch version (GPU) | Metric | 100 | 200 | 300 | 400 | 500 | 600 | 700 | 800 | 900 | 1000 | Omega |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Original | 1.3.1 | seen_classes_top5 | 94.00 | 87.57 | 83.33 | 79.54 | 76.62 | 75.80 | 74.37 | 72.97 | 71.90 | 70.68 | 0.855 |
| Original | 1.12.1 | seen_classes_top5 | 94.10 | 47.44 | 31.60 | 23.70 | 18.97 | 15.80 | 13.56 | 11.85 | 10.53 | 9.48 | 0.297 |
| With changes | 1.12.1 | seen_classes_top5 | 94.10 | 87.26 | 82.51 | 79.88 | 77.08 | 75.56 | 74.01 | 72.92 | 72.15 | 70.77 | 0.854 |

Running the original code on a newer PyTorch version (1.12.1) yields poor accuracy: Omega drops from 0.855 to 0.297. With the proposed changes, PyTorch 1.12.1 gives results similar to the original code on PyTorch 1.3.1 (Omega 0.854 vs. 0.855).

I have tested the code changes with the following packages and versions:

  • Python 3.8.13
  • PyTorch (GPU) 1.12.1
  • torchvision 0.13.1
  • NumPy 1.21.5
  • FAISS (CPU) 1.5.2
  • CUDA 10.2 (also works with CUDA 11.3)
  • scikit-learn 1.0.2
  • SciPy 1.7.3
  • NVIDIA GPU

@tyler-hayes
Owner

Thank you so much for this fix @jhairgallardo!

Do you know if the pull request code also reproduces results in the original PyTorch version (1.3.1)? If so, then I can merge the request.

@jhairgallardo
Author

jhairgallardo commented Apr 29, 2023

Hi @tyler-hayes!
I have run the pull request code with PyTorch 1.3.1. Here are the results:

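| Code | PyTorch version (GPU) | Metric | 100 | 200 | 300 | 400 | 500 | 600 | 700 | 800 | 900 | 1000 | Omega |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Original | 1.3.1 | seen_classes_top5 | 94.00 | 87.57 | 83.33 | 79.54 | 76.62 | 75.80 | 74.37 | 72.97 | 71.90 | 70.68 | 0.855 |
| With changes | 1.3.1 | seen_classes_top5 | 94.02 | 87.59 | 83.16 | 79.53 | 77.11 | 75.73 | 74.48 | 72.74 | 71.76 | 71.00 | 0.855 |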

As you can see, the code with the proposed changes also reproduces the original results with PyTorch 1.3.1.
Let me know if you have more questions.
