[Python] Add bindings for Device and MemoryManager classes and related methods #41126

Closed
3 tasks done
jorisvandenbossche opened this issue Apr 10, 2024 · 1 comment

Comments

jorisvandenbossche commented Apr 10, 2024

Through its C++ foundations, PyArrow already supports representing non-CPU data (e.g. a pyarrow.Table object backed by buffers in GPU memory), but this is not yet well exposed in the Python bindings. Specifically, over the last few years several features were added on the C++ side (the Device and MemoryManager classes and, more recently, a set of Copy functions that take a target memory manager) that do not yet have equivalent bindings in PyArrow.

jorisvandenbossche added a commit to jorisvandenbossche/arrow that referenced this issue May 16, 2024
jorisvandenbossche added a commit that referenced this issue May 31, 2024
#41685)

### Rationale for this change

Add bindings for the C++ `arrow::Device` and `arrow::MemoryManager` classes.

### What changes are included in this PR?

Adds basic bindings in the form of the `pyarrow.Device` and `pyarrow.MemoryManager` classes, tested only on CPU for now.

Not included here are additional methods on the `MemoryManager` class (e.g. to allocate or copy buffers), and the bindings are not yet tested with CUDA. Both are planned as follow-ups; landing these basic bindings first should allow further enhancements to proceed in parallel.

### Are these changes tested?

Yes, for the CPU device only.

* GitHub Issue: #41126

Authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
jorisvandenbossche added this to the 17.0.0 milestone Jun 20, 2024
@jorisvandenbossche
Member Author

Closing this so that it is tracked in the 17.0 milestone, given that the main PR has been merged. The last item has its own sub-issue, which is still open.

jorisvandenbossche added a commit to jorisvandenbossche/arrow that referenced this issue Aug 8, 2024
jorisvandenbossche added a commit that referenced this issue Aug 9, 2024
)

### Rationale for this change

Add tests for the new Buffer properties introduced in #41685, this time verifying that they work out of the box with CUDA.

* GitHub Issue: #41126

Authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
jorisvandenbossche added a commit that referenced this issue Aug 21, 2024
…lasses (#42223)

### Rationale for this change

We have added bindings for the Device and MemoryManager classes (#41126), and as a next step we can expose the functionality to copy a full Array or RecordBatch to a specific memory manager.

### What changes are included in this PR?

This adds a `copy_to` method on pyarrow Array and RecordBatch.

### Are these changes tested?

Yes

* GitHub Issue: #42222

Authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>