
[FEA] [Java] Add a way to allocate via cudaMalloc for device memory buffers #9270

Closed
abellina opened this issue Sep 22, 2021 · 3 comments · Fixed by #9311
Assignees: rongou
Labels: feature request (New feature or request), Java (Affects Java cuDF API)

Comments

@abellina (Contributor) commented:

Given #9201, we are starting to experiment with the CUDA Async Allocator. cudaMallocAsync does not support GPUDirect RDMA, so we would like to be able to allocate some memory (i.e. bounce buffers) directly using cudaMalloc.

One approach would be to expose a DeviceMemoryBuffer.cudaAllocate or DeviceMemoryBuffer.cudaMalloc that bypasses RMM, in order to retain GPUDirect RDMA functionality in certain cases (RAPIDS Spark with UCX).
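For context, a minimal CUDA C++ sketch contrasting the two allocation paths this issue discusses: cudaMalloc returns ordinary device memory that can be registered for GPUDirect RDMA, while cudaMallocAsync draws from the driver's stream-ordered pool, which (per this issue) cannot be used with GPUDirect RDMA. Buffer sizes and error handling are illustrative only, not part of the proposal.

```cpp
#include <cuda_runtime_api.h>
#include <cstdio>

int main() {
  constexpr size_t bounce_buffer_size = 32 << 20;  // 32 MiB, illustrative

  // cudaMalloc: plain device allocation, eligible for GPUDirect RDMA registration.
  void* bounce_buffer = nullptr;
  if (cudaMalloc(&bounce_buffer, bounce_buffer_size) != cudaSuccess) {
    std::fprintf(stderr, "cudaMalloc failed\n");
    return 1;
  }

  // cudaMallocAsync: stream-ordered allocation from the async pool (CUDA 11.2+);
  // per this issue, memory from this pool does not support GPUDirect RDMA.
  cudaStream_t stream;
  cudaStreamCreate(&stream);
  void* other = nullptr;
  cudaMallocAsync(&other, 1 << 20, stream);

  cudaFreeAsync(other, stream);
  cudaStreamSynchronize(stream);
  cudaStreamDestroy(stream);
  cudaFree(bounce_buffer);
  return 0;
}
```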

@jrhemstad (Contributor) commented:

Why bypass RMM? You can use more than one resource at a time. So just use cuda_memory_resource to allocate your bounce buffer and cuda_async_resource for your other allocations.
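A rough C++ sketch of what this suggests, with two RMM resources living side by side: rmm::mr::cuda_memory_resource (backed by cudaMalloc) for the bounce buffer and rmm::mr::cuda_async_memory_resource (backed by cudaMallocAsync) for everything else. Exact headers and allocate/deallocate signatures may differ across RMM versions; sizes are illustrative.

```cpp
#include <rmm/mr/device/cuda_memory_resource.hpp>
#include <rmm/mr/device/cuda_async_memory_resource.hpp>
#include <cstddef>

int main() {
  // cudaMalloc-backed resource: allocations keep GPUDirect RDMA eligibility.
  rmm::mr::cuda_memory_resource bounce_mr;

  // cudaMallocAsync-backed resource (CUDA 11.2+) for regular allocations.
  rmm::mr::cuda_async_memory_resource async_mr;

  // Bounce buffer from the plain CUDA resource...
  constexpr std::size_t bounce_size = 32 << 20;  // illustrative
  void* bounce = bounce_mr.allocate(bounce_size);

  // ...everything else from the async resource.
  void* scratch = async_mr.allocate(1 << 20);

  async_mr.deallocate(scratch, 1 << 20);
  bounce_mr.deallocate(bounce, bounce_size);
  return 0;
}
```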

@abellina (Contributor, Author) commented:

> Why bypass RMM? You can use more than one resource at a time. So just use cuda_memory_resource to allocate your bounce buffer and cuda_async_resource for your other allocations.

Everything we have right now assumes the RMM pool is a singleton and, on top of that, that there is a single resource. Because of that, it seemed simpler to limit the RMM pool to the amount of free memory minus the amount needed for GPUDirect RDMA, and to allocate those buffers directly.

That said, we do want to make Rmm.java more generic (#9209). It would mean more work, but it may make it easier for others to consume, and it would fit more or less with what you are saying. The downstream app (spark-rapids in this case) could compose the allocators it needs under Rmm.
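A minimal C++ sketch of the workaround described above: size the RMM pool to free memory minus a reservation for GPUDirect RDMA, and allocate the bounce buffers directly with cudaMalloc outside the pool. The reservation size, alignment, and pool_memory_resource constructor details are assumptions and may vary by RMM version.

```cpp
#include <rmm/mr/device/cuda_memory_resource.hpp>
#include <rmm/mr/device/pool_memory_resource.hpp>
#include <cuda_runtime_api.h>
#include <cstddef>

int main() {
  // Illustrative reservation for GPUDirect RDMA bounce buffers (assumption).
  constexpr std::size_t rdma_reserve = 256 << 20;  // 256 MiB

  std::size_t free_bytes = 0, total_bytes = 0;
  cudaMemGetInfo(&free_bytes, &total_bytes);

  // Pool sized to (free memory - RDMA reservation), rounded down to 256 B.
  std::size_t pool_size = ((free_bytes - rdma_reserve) / 256) * 256;

  rmm::mr::cuda_memory_resource upstream;
  rmm::mr::pool_memory_resource<rmm::mr::cuda_memory_resource> pool(&upstream, pool_size);

  // Bounce buffers allocated directly via cudaMalloc, bypassing the pool.
  void* bounce = nullptr;
  cudaMalloc(&bounce, rdma_reserve);

  cudaFree(bounce);
  return 0;
}
```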

@rongou rongou self-assigned this Sep 24, 2021
@beckernick removed the Needs Triage (Need team to review and classify) label Sep 24, 2021
@abellina changed the title from "[FEA] [Java] Add a direct (bypassing RMM) way to allocate DeviceMemoryBuffer" to "[FEA] [Java] Add a way to allocate via cudaMalloc for device memory buffers" Sep 29, 2021
@abellina (Contributor, Author) commented:

Note that this was implemented via the cuda_memory_resource in RMM.
