
[FEA] [Java] Add a way to allocate via cudaMalloc for device memory buffers #9270

Closed
abellina opened this issue Sep 22, 2021 · 3 comments · Fixed by #9311
Assignees: rongou
Labels: feature request (New feature or request), Java (Affects Java cuDF API)

Comments

@abellina (Contributor) commented:

Given #9201, we are starting to experiment with the CUDA Async Allocator. cudaMallocAsync does not support GPUDirect RDMA, so we would like to be able to allocate some memory (i.e. bounce buffers) directly using cudaMalloc.

One approach would be to expose a DeviceMemoryBuffer.cudaAllocate or DeviceMemoryBuffer.cudaMalloc that bypasses RMM, in order to retain GPUDirect RDMA functionality in certain cases (RAPIDS Spark with UCX).
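For context, a minimal CUDA C++ sketch contrasting the two allocation paths this issue discusses: cudaMalloc returns ordinary device memory that can be registered for GPUDirect RDMA, while cudaMallocAsync draws from the driver's stream-ordered pool, which (per this issue) cannot be used with GPUDirect RDMA. Buffer sizes and error handling are illustrative only, not part of the proposal.

```cpp
#include <cuda_runtime_api.h>
#include <cstdio>

int main() {
  constexpr size_t bounce_buffer_size = 32 << 20;  // 32 MiB, illustrative

  // cudaMalloc: plain device allocation, eligible for GPUDirect RDMA registration.
  void* bounce_buffer = nullptr;
  if (cudaMalloc(&bounce_buffer, bounce_buffer_size) != cudaSuccess) {
    std::fprintf(stderr, "cudaMalloc failed\n");
    return 1;
  }

  // cudaMallocAsync: stream-ordered allocation from the async pool (CUDA 11.2+);
  // per this issue, memory from this pool does not support GPUDirect RDMA.
  cudaStream_t stream;
  cudaStreamCreate(&stream);
  void* other = nullptr;
  cudaMallocAsync(&other, 1 << 20, stream);

  cudaFreeAsync(other, stream);
  cudaStreamSynchronize(stream);
  cudaStreamDestroy(stream);
  cudaFree(bounce_buffer);
  return 0;
}
```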

@jrhemstad (Contributor) commented:

Why bypass RMM? You can use more than one resource at a time. So just use cuda_memory_resource to allocate your bounce buffer and cuda_async_resource for your other allocations.
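A rough C++ sketch of what this suggests, with two RMM resources living side by side: rmm::mr::cuda_memory_resource (backed by cudaMalloc) for the bounce buffer and rmm::mr::cuda_async_memory_resource (backed by cudaMallocAsync) for everything else. Exact headers and allocate/deallocate signatures may differ across RMM versions; sizes are illustrative.

```cpp
#include <rmm/mr/device/cuda_memory_resource.hpp>
#include <rmm/mr/device/cuda_async_memory_resource.hpp>
#include <cstddef>

int main() {
  // cudaMalloc-backed resource: allocations keep GPUDirect RDMA eligibility.
  rmm::mr::cuda_memory_resource bounce_mr;

  // cudaMallocAsync-backed resource (CUDA 11.2+) for regular allocations.
  rmm::mr::cuda_async_memory_resource async_mr;

  // Bounce buffer from the plain CUDA resource...
  constexpr std::size_t bounce_size = 32 << 20;  // illustrative
  void* bounce = bounce_mr.allocate(bounce_size);

  // ...everything else from the async resource.
  void* scratch = async_mr.allocate(1 << 20);

  async_mr.deallocate(scratch, 1 << 20);
  bounce_mr.deallocate(bounce, bounce_size);
  return 0;
}
```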

@abellina (Contributor, Author) commented:

> Why bypass RMM? You can use more than one resource at a time. So just use cuda_memory_resource to allocate your bounce buffer and cuda_async_resource for your other allocations.

Everything we have right now assumes the RMM pool is a singleton and, on top of that, that there is a single resource. Because of that, it seemed simpler to limit the RMM pool to the amount of free memory minus the amount needed for GPUDirect RDMA, and to allocate those buffers directly.

That said, we do want to make Rmm.java more generic (#9209). It would mean more work, but it may make it easier for others to consume, and it would fit more or less with what you are saying. The downstream app (spark-rapids in this case) could compose the allocators it needs under Rmm.
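A minimal C++ sketch of the workaround described above: size the RMM pool to free memory minus a reservation for GPUDirect RDMA, and allocate the bounce buffers directly with cudaMalloc outside the pool. The reservation size, alignment, and pool_memory_resource constructor details are assumptions and may vary by RMM version.

```cpp
#include <rmm/mr/device/cuda_memory_resource.hpp>
#include <rmm/mr/device/pool_memory_resource.hpp>
#include <cuda_runtime_api.h>
#include <cstddef>

int main() {
  // Illustrative reservation for GPUDirect RDMA bounce buffers (assumption).
  constexpr std::size_t rdma_reserve = 256 << 20;  // 256 MiB

  std::size_t free_bytes = 0, total_bytes = 0;
  cudaMemGetInfo(&free_bytes, &total_bytes);

  // Pool sized to (free memory - RDMA reservation), rounded down to 256 B.
  std::size_t pool_size = ((free_bytes - rdma_reserve) / 256) * 256;

  rmm::mr::cuda_memory_resource upstream;
  rmm::mr::pool_memory_resource<rmm::mr::cuda_memory_resource> pool(&upstream, pool_size);

  // Bounce buffers allocated directly via cudaMalloc, bypassing the pool.
  void* bounce = nullptr;
  cudaMalloc(&bounce, rdma_reserve);

  cudaFree(bounce);
  return 0;
}
```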

@rongou rongou self-assigned this Sep 24, 2021
@beckernick removed the Needs Triage (Need team to review and classify) label Sep 24, 2021
@abellina changed the title from "[FEA] [Java] Add a direct (bypassing RMM) way to allocate DeviceMemoryBuffer" to "[FEA] [Java] Add a way to allocate via cudaMalloc for device memory buffers" Sep 29, 2021
@abellina (Contributor, Author) commented:

Note that this was implemented via the cuda_memory_resource in RMM.
