
[QST] Move cudf.Buffer to rmm #227

Open · shwina opened this issue Jan 7, 2020 · 28 comments
Labels: inactive-30d, inactive-90d, question (Further information is requested)

Comments

@shwina (Contributor)

shwina commented Jan 7, 2020

Question regarding moving cudf.Buffer to rmm:

  • rmm.DeviceBuffer is a Cython wrapper around the C++ class rmm::device_buffer.
  • cudf.Buffer more generally represents an untyped device memory allocation:
buf = Buffer(data=ptr, size=size, owner=python_obj)

# buf represents a device memory allocation
# with address `ptr`, of size `size` bytes,
# and keeps a reference to `python_obj`,
# which is the owner of that memory.
# For RMM-allocated memory, the owner
# is a `rmm.DeviceBuffer`.

A Buffer can be constructed from any object exposing __array_interface__ or __cuda_array_interface__, e.g., CuPy arrays, NumPy arrays, etc.
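A minimal sketch of that construction path (hedged; the import path and constructor reflect the cudf version under discussion and may differ in later releases):

import cupy as cp
from cudf.core.buffer import Buffer  # internal cudf class

arr = cp.arange(10, dtype="int32")  # any __cuda_array_interface__ exporter

# cudf reads the device pointer and size from
# arr.__cuda_array_interface__ and keeps `arr` alive as the owner.
buf = Buffer(arr)

# Equivalent explicit form, matching the snippet above:
ptr = arr.__cuda_array_interface__["data"][0]
buf = Buffer(data=ptr, size=arr.nbytes, owner=arr)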

Does it make sense for Buffer to be moved to RMM?

@shwina shwina added the doc Documentation label Jan 7, 2020
@shwina shwina changed the title [FEA] Move cudf.Buffer to rmm [QST] Move cudf.Buffer to rmm Jan 7, 2020
@shwina shwina added question Further information is requested and removed doc Documentation labels Jan 7, 2020
@jakirkham (Member)

The current cuDF implementation is here, for context.

@jrhemstad (Contributor)

Granted, it's at the Python level, where I'm not too concerned, but this kinda sounds like scope creep to me.

My hope would be to keep RMM as narrowly focused as we can.

@jakirkham (Member)

Maybe a little, though I am less worried about that personally. Where else would you imagine it living (if not RMM)?

@jrhemstad (Contributor)

jrhemstad commented Jan 7, 2020

I'm a big fan of "Do One Thing". Scope creep is how libraries become complicated behemoths that are difficult to maintain and change.

This issue is similar to why I pushed back on #220. RMM shouldn't be concerned about device memory allocated by anything other than RMM. Anything beyond that is the concern of another layer or library.

I don't have sufficient Python expertise to know where Buffer should live, but I'm not a big fan of just tacking it onto RMM simply because it is convenient.

@kkraus14 (Contributor)

kkraus14 commented Jan 9, 2020

From my view, this isn't tacking something onto rmm because it's convenient; if anything, it's a little inconvenient. I see it as managing memory that RAPIDS libraries could potentially use; it just so happens that the memory wasn't allocated by the RMM allocator.

@github-actions

This issue has been marked rotten due to no recent activity in the past 90d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

@github-actions

This issue has been marked stale due to no recent activity in the past 30d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be marked rotten if there is no activity in the next 60d.

@harrism (Member)

harrism commented Feb 16, 2021

Can you just replace cudf.Buffer with rmm.DeviceBuffer?

@shwina (Contributor, Author)

shwina commented Feb 17, 2021

Unfortunately no, because a cudf.Buffer could contain arbitrary device-memory-backed objects underneath it, such as CuPy/Numba arrays, whereas an rmm.DeviceBuffer is owning.
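To illustrate the distinction (a sketch; APIs as of the versions under discussion):

import cupy as cp
import rmm
from cudf.core.buffer import Buffer

# Owning: DeviceBuffer allocates its bytes via RMM and frees them
# when the object is garbage collected.
dbuf = rmm.DeviceBuffer(size=64)

# Non-owning: Buffer records a pointer/size pair and holds a
# reference to whatever object owns the memory (here, a CuPy array
# whose allocation RMM knows nothing about).
arr = cp.zeros(16, dtype="float32")
buf = Buffer(data=arr.__cuda_array_interface__["data"][0],
             size=arr.nbytes,
             owner=arr)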

@github-actions

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

@github-actions

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

@harrism (Member)

harrism commented Nov 18, 2021

Unfortunately no, because a cudf.Buffer could contain arbitrary device-memory-backed objects underneath it, such as CuPy/Numba arrays, whereas an rmm.DeviceBuffer is owning.

This sounds like what in C++ we call span.

@jakirkham (Member)

Yeah that seems similar.

Basically cudf.Buffer is a non-owning view onto the memory. It holds a reference to the object that owns the memory.

@vyasr (Contributor)

vyasr commented Nov 18, 2021

I hadn't seen this issue before, but I recently started thinking about this after a conversation with @shwina and @jrhemstad about GPU memory management in Python. It seems like what we're looking for is a generic gpumemoryview with similar semantics to Python's memoryview, i.e. a standard representation for a non-owning view into memory. This object would be agnostic to the source of the underlying allocation. Most libraries could operate on the view alone (similar to libcudf C++ algorithms) and Python's reference counting logic would handle issues of scope. If a library needed to allocate memory internally (e.g. upon creation of a cudf.DataFrame) it could create the required rmm buffers then immediately generate the view and store it under the dataframe's columns exactly as cudf.Buffer is used right now. This approach would allow us to think bigger than RAPIDS alone so that we could also naturally consume data owned by CuPy, PyTorch, TensorFlow, and others.

If we went this route there would be a number of open questions to address here. Such an object could work directly with __cuda_array_interface__, but that would differ from how Python currently works: a memoryview is designed to work with the Python buffer protocol at the C level, while numpy (and numpy-like array libraries) are what consume __array_interface__. On the other hand, developing a GPU analog to the buffer protocol is probably unnecessary given the degree to which the ecosystem has centralized around CAI, so that semantic inconsistency is probably preferable to trying to overengineer a C API. There's also the original question on this thread of where such an object might live. I would probably advocate for a standalone package at least to start with, but I could see a case for rolling it into CUDA Python or something related to it.
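A minimal sketch of what such a gpumemoryview might look like (hypothetical; every name here is illustrative, and it assumes nothing beyond __cuda_array_interface__):

class gpumemoryview:
    """Hypothetical non-owning view of device memory."""

    def __init__(self, obj):
        # Read the device pointer from the CUDA Array Interface.
        cai = obj.__cuda_array_interface__
        self.ptr, _ = cai["data"]
        # Keep a reference to the exporting object; Python reference
        # counting then keeps the underlying allocation alive.
        self.obj = obj

    @property
    def __cuda_array_interface__(self):
        # Re-export the owner's interface so CAI-aware libraries can
        # consume the view directly.
        return self.obj.__cuda_array_interface__

The owner could be an rmm.DeviceBuffer, a CuPy array, a PyTorch tensor, or anything else; the view neither knows nor cares.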

CC @gmarkall @leofang

@jakirkham (Member)

Yeah we already did this in ucx-py with Array. We could refactor this out into a separate thing we depend on.

cc @pentschev @madsbk @quasiben

@leofang (Member)

leofang commented Nov 19, 2021

I have been devising an interface for C/C++ libraries to share a memory pool, and I would need a way to expose this interface all the way to Python so that users can set it from Python. This sounds like a good idea worth exploring.

@jrhemstad (Contributor)

i.e. a standard representation for a non-owning view into memory.

I thought that was the purpose of CAI? What's the difference between gpumemoryview and CAI?

@jakirkham (Member)

It can help to have something in Cython, as it can expose both Python and C APIs.

@vyasr (Contributor)

vyasr commented Nov 19, 2021

That's true, but in my view the distinction is a little more fundamental. CAI is purely descriptive, so objects implementing CAI must always be converted into some internal representation, like cudf.Buffer, that can be operated on programmatically. A gpumemoryview would replace that step so that various libraries don't need to reinvent the wheel.

To @jrhemstad's (much earlier) point about scope creep, I agree that the idea of a generic view into memory is out of scope for RMM, because the underlying allocation need not come from RMM. CAI was developed so that different libraries don't need to implement a standard API in order to be interoperable. Having a standalone object representation of the allocation described by CAI would improve that interoperability and provide a standard interchange object rather than a format alone. As long as the object itself supported CAI, we could pass it to libraries that support CAI even if they were unaware of gpumemoryview, and it would just work; but libraries could optionally build around gpumemoryview (instead of, e.g., cudf.Buffer) and avoid needing to perform any preprocessing when provided one.
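Concretely (a sketch reusing the hypothetical gpumemoryview above): because the view re-exports CAI, CAI-aware libraries consume it unchanged:

import cupy as cp

arr = cp.arange(4, dtype="int32")
view = gpumemoryview(arr)  # hypothetical class sketched earlier

# A library that only knows CAI still works: cupy.asarray accepts
# any __cuda_array_interface__ exporter (zero-copy here).
roundtrip = cp.asarray(view)

# A view-aware library could instead accept `view` directly and use
# its pointer without any preprocessing.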

@jakirkham (Member)

Yep, I understand. The Array object in UCX-Py does both those things. We can rip it out into a standalone library.

@vyasr (Contributor)

vyasr commented Nov 19, 2021

@jakirkham Sorry, didn't mean that as a correction, more as an extended explanation to answer @jrhemstad's question.

@madsbk (Member)

madsbk commented Nov 22, 2021

Yep, I understand. The Array object in UCX-Py does both those things. We can rip it out into a standalone library.

I am very much in favor of a standalone library implementing something like Array and all its nice-to-have utility functions.

@harrism (Member)

harrism commented Nov 22, 2021

Happy my comment helped kickstart this discussion again! Sounds like there is agreement on a direction forward.

@github-actions

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

@vyasr vyasr self-assigned this Jan 5, 2022
@vyasr (Contributor)

vyasr commented Jan 5, 2022

I'm going to work on documenting next steps for this sometime soon.

@github-actions

github-actions bot commented Feb 4, 2022

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

@github-actions

github-actions bot commented May 5, 2022

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

@bdice (Contributor)

bdice commented Jan 20, 2025

There is some recent work in cuda.core on StridedMemoryView that seems relevant to this issue.

NVIDIA/cuda-python#87
NVIDIA/cuda-python#180
https://github.com/NVIDIA/cuda-python/issues?q=stridedmemoryview
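A hedged usage sketch of StridedMemoryView (the class lives under cuda.core's experimental namespace as of this writing, so the import path and signature may change):

import cupy as cp
from cuda.core.experimental.utils import StridedMemoryView

arr = cp.arange(8, dtype="float64")

# Build a non-owning, strided view from a __cuda_array_interface__
# (or DLPack) exporter; stream_ptr=-1 requests no stream ordering.
view = StridedMemoryView(arr, stream_ptr=-1)
print(view.ptr, view.shape, view.dtype)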
