avoid calling `gc.collect` and `cuda.empty_cache` #34514
Conversation
Yes, this seems like a good speed fix! cc @LysandreJik @ArthurZucker for core maintainer review
Smart! Should a helper method be made that only runs on CPU? Both the `gc.collect` call and the torch device checks could be moved into the `backend_empty_cache` method (or another method that wraps both).
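The wrapper suggested above could look roughly like the following. This is a minimal sketch, not the PR's actual implementation: the function name `cleanup_test_memory` and the `device` string parameter are assumptions made for illustration.

```python
import gc


def cleanup_test_memory(device: str) -> bool:
    """Hypothetical helper wrapping gc.collect and the device check.

    Skips all cleanup when the tests run on CPU, which is where the
    speedup in this PR comes from. Returns True if cleanup ran.
    """
    if device == "cpu":
        # Nothing to free on CPU; skipping gc.collect saves test time.
        return False

    gc.collect()
    try:
        import torch  # torch may not be installed in every environment

        if device == "cuda" and torch.cuda.is_available():
            torch.cuda.empty_cache()
    except ImportError:
        pass
    return True
```

Callers (e.g. test `tearDown` methods) would then invoke this single helper instead of sprinkling `gc.collect()` and `torch.cuda.empty_cache()` through the test files.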
Yes, a helper method is nice. Will update.
Force-pushed from 82e3add to 6620320
Updated. So far it doesn't call
Force-pushed from 6620320 to 4403c5a
Force-pushed from 1f47700 to 18d6d5d
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
* update * update * update * update * update --------- Co-authored-by: ydshieh <[email protected]>
What does this PR do?
Let's avoid calling `gc.collect` and `cuda.empty_cache` while the tests are running on CPU.

Running the GPT2 tests: 60 seconds on main, 20 seconds on this PR.
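The pattern the PR describes can be sketched as a test teardown that skips memory cleanup entirely on CPU. This is an illustrative sketch, not the repository's actual test code; the `TORCH_DEVICE` constant and the class name are assumptions.

```python
import gc
import unittest

# Assumption: mirrors the suite's configured test device ("cpu" or "cuda").
TORCH_DEVICE = "cpu"


class ExampleModelTest(unittest.TestCase):
    """Sketch of the PR's idea: only pay for cleanup on accelerators."""

    def tearDown(self):
        if TORCH_DEVICE == "cpu":
            # gc.collect() / empty_cache() only pay off on accelerators,
            # so returning early here is what speeds up CPU test runs.
            return
        gc.collect()
        # On a CUDA device, torch.cuda.empty_cache() would also run here.

    def test_addition(self):
        self.assertEqual(1 + 1, 2)
```

With hundreds of tests in a model suite, dropping an unconditional `gc.collect()` per test is consistent with the roughly 3x wall-clock improvement reported above.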