Memory Leak #639
Was able to repro, don't need an answer anymore.
@tcapelle Thanks for the issue! Basically what's happening is that because the params in the model have requires_grad=True, the manual parameter update is itself recorded by autograd, so each step's graph is kept alive and memory keeps growing. Two ways to avoid it: (1) make the params not require grad, OR (2) use detach in the SGD update.
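For concreteness, here is a minimal sketch of both workarounds in a functorch-style functional training step; the model, data, and learning rate are made up for illustration and are not taken from the linked report.

```python
import torch
import torch.nn as nn
from functorch import make_functional, grad

model = nn.Linear(4, 1)
fmodel, params = make_functional(model)

def loss_fn(params, x, y):
    return ((fmodel(params, x) - y) ** 2).mean()

x, y = torch.randn(32, 4), torch.randn(32, 1)
lr = 1e-2

# Workaround (1): make the params not require grad, so the manual SGD update
# below is no longer recorded by PyTorch autograd.
params = tuple(p.detach() for p in params)

for step in range(100):
    grads = grad(loss_fn)(params, x, y)
    # Workaround (2): detach the updated params so the graph built during one
    # step is freed instead of being chained into the next step.
    params = tuple((p - lr * g).detach() for p, g in zip(params, grads))
```

Either workaround on its own is enough; this mirrors what a stateful optimizer effectively does by updating parameters outside of autograd tracking.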
I was looking for something like this (zero_grad or detach) but didn't know where to do it. A training tutorial should showcase this, then. I can work out an example to add with @soumik12345.
@samdow I was wondering if solution (1) could be an optional behaviour for make_functional.
Ohh, I would also prefer this as the default.
@vfdev-5 thanks for the idea! I really like that and it shouldn't be difficult to do if we plumb it through the make_functional call.
To @tcapelle's point, I think I'm personally on the side of making this the default, but here's what I'm wrestling with: if the params don't require grad by default, users writing a manual training loop won't silently hit the memory leak. However, if a user does use PyTorch's autograd, they'll get a hard-to-decipher error message that "____ does not require grad and does not have a grad_fn" when they believe it should work just like PyTorch. Either way leads to a debugging situation that isn't ideal: either the memory leak if we default to True, or the hard-to-decipher error message if we default to False. Thoughts on this are welcome! cc @zou3519
Having an argument to toggle this would be great. I'm not sure what the default should be, but I would lean towards not changing it yet. Right now make_functional preserves requires_grad because that was helpful in some other use cases (e.g. maml, lennard-jones) where users wanted to use functorch transforms in the middle of their model and then backprop through the entire thing using regular PyTorch autograd.
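As a minimal sketch of that kind of use case (model, shapes, and data here are made up), a functorch transform runs in the middle of the computation and regular PyTorch autograd then backprops onto the functional params, which relies on them still requiring grad:

```python
import torch
import torch.nn as nn
from functorch import make_functional, vmap

model = nn.Linear(4, 2)
fmodel, params = make_functional(model)  # params keep requires_grad=True

x = torch.randn(8, 4)
# A functorch transform (vmap over the batch) in the middle of the computation...
out = vmap(fmodel, in_dims=(None, 0))(params, x)
loss = out.pow(2).mean()
# ...then regular PyTorch autograd backprops through the whole thing.
loss.backward()
print(params[0].grad.shape)  # gradients land on the functional params
```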
@samdow sorry for the delay. Yes, I can send a PR for that. I agree that the default value should be True so that we do not alter the current behaviour and just add an option.
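Until such an option exists, a minimal sketch of what the opt-in (non-default) behaviour would amount to; the helper name make_functional_detached is hypothetical, not part of functorch:

```python
import torch.nn as nn
from functorch import make_functional

def make_functional_detached(model: nn.Module):
    # Hypothetical wrapper, not a functorch API: same as make_functional, but
    # the returned params do not require grad, so manual in-loop updates are
    # not tracked by autograd and the memory leak above cannot occur.
    fmodel, params = make_functional(model)
    return fmodel, tuple(p.detach() for p in params)
```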
I will close this now. |
Hello!
I am thrilled with the functorch package, and have been playing with it lately.
With @soumik12345, we found a memory leak after training an NN. We documented our findings here:
http://wandb.me/functorch-intro
We are probably doing something wrong, but the memory increases after each epoch.
As the GPU is pretty monstrous we didn't notice this straight away, but it clearly fills up progressively. The stateful PyTorch training loop does not produce this.
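For reference, here is a minimal self-contained sketch (assuming a CUDA device; model, data, and sizes are illustrative, not the notebook linked above) of the pattern that shows the growth: because the params require grad, each manual update keeps the previous step's graph alive.

```python
import torch
import torch.nn as nn
from functorch import make_functional, grad

model = nn.Linear(64, 1).cuda()
fmodel, params = make_functional(model)  # params require grad (current default)

def loss_fn(params, x, y):
    return ((fmodel(params, x) - y) ** 2).mean()

x = torch.randn(512, 64, device="cuda")
y = torch.randn(512, 1, device="cuda")

for epoch in range(5):
    grads = grad(loss_fn)(params, x, y)
    # Each p requires grad, so this update is itself recorded by autograd,
    # chaining every epoch's graph together and growing GPU memory.
    params = tuple(p - 1e-2 * g for p, g in zip(params, grads))
    print(epoch, torch.cuda.memory_allocated())  # keeps increasing
```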