[FEA] Allocate UCX bounce buffers outside of RMM if ASYNC allocator is enabled #3603

Closed
abellina opened this issue Sep 22, 2021 · 0 comments
Labels
feature request New feature or request shuffle things that impact the shuffle plugin

Comments

@abellina
Collaborator

Depends on: rapidsai/cudf#9270

With the ASYNC allocator enabled (#3447), UCX currently fails when it attempts to use nv_peer_mem. Memory allocated with cudaMallocAsync does not support GPUDirect RDMA, and in this mode the pool we allocate bounce buffers from uses that allocator exclusively.

Once rapidsai/cudf#9270 is merged, we should be able to reserve the bounce buffer amount from the RMM pool and allocate the bounce buffers directly, bypassing RMM. This would allow us to use nv_peer_mem and UCX as we do today with the other RMM allocators.
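
To make the reservation idea concrete, here is a rough sketch of the bookkeeping only. It does not call RMM, cuDF, or UCX. `PoolBudget`, `plan_bounce_buffers`, and the sizes used are hypothetical names and numbers chosen for illustration, and the actual interface added by rapidsai/cudf#9270 may look quite different. The assumption is that the reserved region would then be obtained with a single direct device allocation (for example a plain cudaMalloc) and sliced into fixed-size bounce buffers.

```python
from dataclasses import dataclass


@dataclass
class PoolBudget:
    """Hypothetical model: GPU memory split between the RMM pool and a
    region that is allocated directly, outside of RMM."""
    total_bytes: int
    reserved_bytes: int = 0

    def reserve(self, nbytes: int) -> None:
        # Carve the reservation out of what the RMM pool may use.
        if nbytes < 0 or self.reserved_bytes + nbytes > self.total_bytes:
            raise ValueError("reservation exceeds the GPU memory budget")
        self.reserved_bytes += nbytes

    @property
    def pool_limit_bytes(self) -> int:
        # What is left for the (ASYNC) RMM pool after the reservation.
        return self.total_bytes - self.reserved_bytes


def plan_bounce_buffers(budget: PoolBudget, buffer_size: int, buffer_count: int):
    """Reserve room for the bounce buffers and return (offset, size) slices
    of the single direct allocation that would back them."""
    budget.reserve(buffer_size * buffer_count)
    return [(i * buffer_size, buffer_size) for i in range(buffer_count)]


# Illustrative numbers only: an 8 GiB budget and 32 bounce buffers of 4 MiB.
budget = PoolBudget(total_bytes=8 * 1024**3)
slices = plan_bounce_buffers(budget, buffer_size=4 * 1024**2, buffer_count=32)
```

In this sketch the RMM pool would be capped at `budget.pool_limit_bytes`, so the directly allocated bounce buffers and the pool never compete for the same bytes.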

@abellina abellina added feature request New feature or request ? - Needs Triage Need team to review and classify shuffle things that impact the shuffle plugin labels Sep 22, 2021
@Salonijain27 Salonijain27 added this to the Sep 27 - Oct 1 milestone Sep 24, 2021
@Salonijain27 Salonijain27 removed the ? - Needs Triage Need team to review and classify label Sep 28, 2021
@rongou rongou closed this as completed Oct 1, 2021
3 participants