[DOC] Document memory resource lifetime requirements when taking ownership of device buffers in Python #1492

wence- · 2024-03-04T11:16:34Z

An rmm::device_buffer holds a (effectively) raw pointer to the memory resource used for its allocation and deallocation. On the C++ side, it is the responsibility of the constructor a device_buffer to ensure that the memory resource lives at least as long as the buffer.

When we expose a device_buffer to Python through the cython DeviceBuffer class (specifically, when we use DeviceBuffer.c_from_unique_ptr which is used in many places in cudf), we have no way of inspecting the provenance of the memory resource used to allocate that buffer, so we have no way of keeping it alive.

Right now, the way we handle this in the cython wrappers is to throw our hands up in the air, grab the current_device_resource, cross our fingers and hope.

This has worked so far because usually no-one switches out a memory resource from under us, and generally speaking in cudf we're taking ownership of buffers that have ultimately been allocated on the python side, but it's not actually safe.

Unless we're willing to make the memory resource slot in a device buffer hold a smart pointer, and I believe there are good reasons why we do not do that at the moment, the best we can do is document the requirements on the Python side and then obey them when using RMM from cudf.

The text was updated successfully, but these errors were encountered:

vyasr · 2024-03-04T17:52:38Z

For a cudf-specific example and discussion, see rapidsai/cudf#14229

jrhemstad · 2024-03-04T23:53:21Z

When we expose a device_buffer to Python through the cython DeviceBuffer class (specifically, when we use DeviceBuffer.c_from_unique_ptr which is used in many places in cudf), we have no way of inspecting the provenance of the memory resource used to allocate that buffer

You can't inspect the provenance after the fact, no, but you control what resource is used to construct the buffer in the first place, yes?

A sketch of what I mean:

struct buffer_that_owns_resource{
   device_buffer b_;
   std::shared_ptr<device_memory_resource> mr_;
};


std::shared_ptr<device_memory_resource> mr{...};
device_buffer b = cudf::some_function(..., mr.get());

auto owning = buffer_that_owns_resource{std::move(b), mr};

This enables you to have the effective equivalent of if device_buffer owned its mr.

wence- · 2024-03-05T07:54:32Z

You can't inspect the provenance after the fact, no, but you control what resource is used to construct the buffer in the first place, yes?

We do, although we don't currently use what you propose (or a moral equivalent thereof) in cudf-python for (I presume) historical reasons.

However, since we're continuing to advocate for rmm as a cross-library allocator, we need to make these requirements (I think) clearer in the docs than they are currently.

On the C++ side, device_buffers store raw pointers for the memory resource that was used in their allocation. Consequently, it is unsafe to take ownership of a device_buffer in Python unless we controlled the provenance of the memory resource that was used in its allocation. The only way to do that is to pass the memory resource from Python into C++ and then use it when constructing the DeviceBuffer. Add discussion of this with some examples and a section on pitfalls if only relying on get_current_device_resource and set_current_device_resource. - Closes rapidsai#1492

…hip requirements in Python/C++ interfacing (#1552) On the C++ side, device_buffers store raw pointers for the memory resource that was used in their allocation. Consequently, it is unsafe to take ownership of a device_buffer in Python unless we controlled the provenance of the memory resource that was used in its allocation. The only way to do that is to pass the memory resource from Python into C++ and then use it when constructing the DeviceBuffer. Add discussion of this with some examples and a section on pitfalls if only relying on get_current_device_resource and set_current_device_resource. To allow Python users of `DeviceBuffer` objects to follow best practices, introduce explicit (defaulting to the current device resource) `mr` arguments in both `c_from_unique_ptr` and the `DeviceBuffer` constructor. - Closes #1492 Authors: - Lawrence Mitchell (https://github.com/wence-) Approvers: - Mark Harris (https://github.com/harrism) - Vyas Ramasubramani (https://github.com/vyasr) URL: #1552

wence- added doc Documentation ? - Needs Triage Need team to review and classify labels Mar 4, 2024

wence- mentioned this issue Mar 4, 2024

[FEA] Be consistent in handling of default parameter values in pylibcudf rapidsai/cudf#15198

Open

harrism added this to RMM Project Board Mar 4, 2024

github-project-automation bot moved this to Todo in RMM Project Board Mar 4, 2024

jrhemstad mentioned this issue Mar 6, 2024

[EPIC] Roadmap for cuda/memory_resource NVIDIA/cccl#1502

Open

12 tasks

This was referenced Apr 2, 2024

[BUG] When using rmm_cupy_allocator with Cupy, the memory resource is constructed twice. #1506

Closed

[FEA] Ensure that Cython wrappers for allocating objects correctly expose and use memory resources #1515

Open

harrism assigned wence- Apr 23, 2024

wence- removed the ? - Needs Triage Need team to review and classify label May 1, 2024

wence- mentioned this issue May 3, 2024

Allow specifying mr in DeviceBuffer construction, and document ownership requirements in Python/C++ interfacing #1552

Merged

3 tasks

rapids-bot bot closed this as completed in #1552 May 16, 2024

github-project-automation bot moved this from Todo to Done in RMM Project Board May 16, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[DOC] Document memory resource lifetime requirements when taking ownership of device buffers in Python #1492

[DOC] Document memory resource lifetime requirements when taking ownership of device buffers in Python #1492

wence- commented Mar 4, 2024

vyasr commented Mar 4, 2024

jrhemstad commented Mar 4, 2024 •

edited

Loading

wence- commented Mar 5, 2024

[DOC] Document memory resource lifetime requirements when taking ownership of device buffers in Python #1492

[DOC] Document memory resource lifetime requirements when taking ownership of device buffers in Python #1492

Comments

wence- commented Mar 4, 2024

vyasr commented Mar 4, 2024

jrhemstad commented Mar 4, 2024 • edited Loading

wence- commented Mar 5, 2024

jrhemstad commented Mar 4, 2024 •

edited

Loading