GPU memory usage #3

The GPU memory usage reported in your paper is about 10 GB, but on my machine it is about 18 GB when I train the model. Is there some setting in this repo that differs from your paper?

![image](https://user-images.githubusercontent.com/68363091/140509631-eaa89da0-90ec-476c-a9c6-f7a439eef39f.png)

Comments
Could you try running distributed training using only 1 GPU? The reason might be that the model is loaded onto a single GPU multiple times.
Make sure you run the code using the script given in the README.
Thanks for your reply. I have tried training with only 1 GPU by the command
If you train on multiple GPUs, is the GPU memory usage roughly the same for every GPU? My model is trained on a TITAN X, which has only 12 GB of memory. Maybe you can print out the real GPU memory consumption using PyTorch APIs; sometimes PyTorch will allocate more GPU memory than needed.
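As a minimal sketch (these are standard PyTorch APIs, independent of this repo), the gap between what the model's tensors actually use and what the driver reports can be inspected like this:

```python
import torch

# Memory currently occupied by live tensors on the default device.
allocated = torch.cuda.memory_allocated() / 1024 ** 3

# Memory held by PyTorch's caching allocator. This is roughly what
# nvidia-smi shows (plus CUDA context overhead) and is usually larger
# than `allocated`, because freed blocks are cached for reuse.
reserved = torch.cuda.memory_reserved() / 1024 ** 3

print(f"allocated: {allocated:.2f} GB, reserved: {reserved:.2f} GB")

# Peak values and a full per-device breakdown are also available:
print(torch.cuda.memory_summary())
```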
Actually, the memory allocated is about 10 GB, but I don't know why the reported GPU memory usage is about 18 GB.
Maybe PyTorch pre-allocates GPU memory for future use, and this memory is not freed automatically. Potential solutions include explicitly limiting GPU memory usage or calling torch.cuda.empty_cache() to free the cache.
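For illustration, a sketch of both workarounds (standard PyTorch calls; the 0.6 fraction is an arbitrary example, not a value from the paper):

```python
import torch

# Option 1: cap this process's share of device 0's memory.
# (torch.cuda.set_per_process_memory_fraction is available in
# PyTorch >= 1.8; 0.6 here is only an example fraction.)
torch.cuda.set_per_process_memory_fraction(0.6, device=0)

# Option 2: return unused cached blocks to the driver so that
# nvidia-smi no longer counts them. Tensors still in use are
# unaffected, and this does not shrink what the model truly needs.
torch.cuda.empty_cache()
```

Note that empty_cache() only releases the allocator's cache; it will not help if the 18 GB is genuinely allocated by live tensors.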
Thanks for the help. I tried
Hello, may I ask whether the GPU OOM problem has been solved?