Cuda Memory Leak #7300
Comments
Looks like
There's likely an implementation bug. Assigning this to myself for further investigation.
Okay, this is caused by a legacy design issue with the CUDA runtime. Basically, Taichi used to use separate memory managers for different data types, for instance The stupid part is that these separate managers don't share the same memory pool - memory released from
It's gonna work, because the memory returned by Let's leave this issue open, and I'll start with unifying
No longer an issue after: #7795
…cator (taichi-dev#7531) Issue: taichi-dev#7300
…LlvmRuntimeExecutor (taichi-dev#7544) Issue: taichi-dev#7300
I noticed a bug introduced by PR #7008, which was supposed to fix #6924.
Consider this code snippet
This should work, because the 2GB of temp memory should be freed immediately after use.
However it doesn't, and we get a crash:
I think I figured out the reason: in #7008, we made it so that temporary memory is allocated from the pre-allocated CUDA memory. However, when it is freed, we are still using
CUDADriver::get_instance().mem_free(...);
This means all of the pre-allocated memory we hand out for temporary use gets returned to the driver instead of to the pool. Since we haven't implemented a way to de-allocate memory from the pre-allocated pool, I think we should just revert #7008.
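To make the mismatch concrete, here is a minimal sketch of what "de-allocating back into the pre-allocated pool" could look like. It is not Taichi's implementation: `PreallocatedPool`, `free_temporary`, and the free-list reuse policy are hypothetical names and choices made for illustration, and host memory stands in for the device buffer so the sketch runs without a GPU. The point is only that a free has to be routed to the allocator that produced the pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a pre-allocated device memory pool.
// A host-side byte buffer models the device buffer.
class PreallocatedPool {
 public:
  explicit PreallocatedPool(std::size_t capacity)
      : buffer_(capacity), head_(0) {}

  // Returns a chunk of at least `size` bytes, or nullptr if the pool is full.
  void *allocate(std::size_t size) {
    // Reuse a previously released chunk when one is large enough.
    for (std::size_t i = 0; i < free_list_.size(); ++i) {
      if (free_list_[i].size >= size) {
        void *ptr = free_list_[i].ptr;
        free_list_.erase(free_list_.begin() + static_cast<std::ptrdiff_t>(i));
        return ptr;
      }
    }
    // Otherwise bump-allocate from the untouched tail of the buffer.
    if (size > buffer_.size() - head_) return nullptr;
    void *ptr = buffer_.data() + head_;
    head_ += size;
    return ptr;
  }

  // True if `ptr` points into this pool's buffer.
  bool owns(const void *ptr) const {
    auto p = reinterpret_cast<std::uintptr_t>(ptr);
    auto begin = reinterpret_cast<std::uintptr_t>(buffer_.data());
    return p >= begin && p < begin + buffer_.size();
  }

  // Hands a chunk back to the pool so a later allocate() can reuse it.
  void release(void *ptr, std::size_t size) {
    assert(owns(ptr));
    free_list_.push_back({ptr, size});
  }

 private:
  struct Chunk {
    void *ptr;
    std::size_t size;
  };
  std::vector<std::uint8_t> buffer_;
  std::size_t head_;
  std::vector<Chunk> free_list_;
};

// Routes a free to the allocator that produced the pointer. Returns true if
// the pool took the memory back; false means the pointer did not come from
// the pool, so the caller should free it through the driver instead.
bool free_temporary(PreallocatedPool &pool, void *ptr, std::size_t size) {
  if (pool.owns(ptr)) {
    pool.release(ptr, size);
    return true;
  }
  return false;
}
```

With something like this in place, a temporary buffer carved out of the pool could be released and reused by the next allocation, while only pointers that did not come from the pool would fall through to the driver's free.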
cc @jim19930609