
[FEA] Set RMM async allocator as default #4515

Closed
rongou opened this issue Jan 12, 2022 · 1 comment · Fixed by #4606
Assignees
Labels
feature request New feature or request

Comments

rongou (Collaborator) commented Jan 12, 2022

Is your feature request related to a problem? Please describe.
Currently the RMM arena allocator is the default, but in some circumstances it can still run into OOM errors due to memory fragmentation. The async allocator, which relies on cudaMallocAsync and cudaFreeAsync, can remap physical pages when running out of GPU memory, and is thus more resistant to memory fragmentation.

Describe the solution you'd like
Set the async allocator as the default.
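For context, users can already opt in to the async allocator today without waiting for the default to change. The snippet below is a sketch assuming the RAPIDS Accelerator's `spark.rapids.memory.gpu.pool` setting, which this issue appears to concern; the exact config name and invocation are assumptions, not taken from this issue:

```shell
# Hypothetical opt-in: select the RMM async allocator explicitly.
# ASYNC is backed by cudaMallocAsync/cudaFreeAsync (CUDA 11.2+).
spark-submit \
  --conf spark.rapids.memory.gpu.pool=ASYNC \
  my-app.jar
```

Setting the default to ASYNC would make this the out-of-the-box behavior instead of an opt-in.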

Describe alternatives you've considered
Continuing to improve the arena allocator is possible, but since it doesn't remap memory, it probably won't ever be as fragmentation-resistant as the async allocator.

Additional context
Set this at the beginning of 22.04 for more testing.

@rongou rongou added feature request New feature or request ? - Needs Triage Need team to review and classify labels Jan 12, 2022
@rongou rongou self-assigned this Jan 12, 2022
rongou (Collaborator, Author) commented Jan 14, 2022

Since cudaMallocAsync requires CUDA 11.2, we'll do a version check and only switch to the async allocator if the CUDA driver version is 11.2 or above.
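The gate reduces to a comparison against the integer that cudaDriverGetVersion reports, which encodes CUDA 11.2 as 11020 (major*1000 + minor*10). A minimal sketch of that check; the function names here are illustrative, not taken from the eventual fix:

```python
def async_allocator_supported(driver_version: int) -> bool:
    """cudaDriverGetVersion encodes CUDA 11.2 as 11020 (major*1000 + minor*10)."""
    return driver_version >= 11020

def choose_allocator(driver_version: int) -> str:
    """Hypothetical wiring: fall back to arena on drivers older than 11.2."""
    return "async" if async_allocator_supported(driver_version) else "arena"
```

With this gate, an 11.1 driver (reported as 11010) keeps the arena allocator, while 11.2 and newer (11020 and up) switch to async.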
