[FEA] Python bindings to pack/unpack #7601
Comments
This issue has been labeled
Being worked on in #8153
Closes #7601

Adds a Python API for `pack`/`unpack`, so that we might be able to pack/unpack DataFrames in serialization:

- `PackedColumns` is a Python representation of the `cudf::packed_columns` struct containing the struct itself along with some Python metadata for the DataFrame being packed; supports Dask/pickle serialization
- `pack()` takes in a `Table` and returns a `PackedColumns`
- `unpack()` takes in a `PackedColumns` and returns a `Table`

cc @brandon-b-miller

Authors:
- Charles Blackmon-Luca (https://github.com/charlesbluca)

Approvers:
- Devavret Makkar (https://github.com/devavret)
- https://github.com/brandon-b-miller
- Karthikeyan (https://github.com/karthikeyann)
- https://github.com/jakirkham

URL: #8153
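To make the round trip concrete, here is a minimal CPU-only sketch of the same idea using only the Python standard library. It is not the cudf implementation and does not call any cudf API: the `PackedColumns` class, `pack()`, and `unpack()` below are hypothetical stand-ins that merely reuse the names from the description, and a "table" is assumed to be a plain dict of `array.array` columns.

```python
from array import array
from dataclasses import dataclass


@dataclass
class PackedColumns:
    """Hypothetical stand-in for illustration only (not the cudf class)."""
    metadata: list  # one (name, typecode, byte_offset, byte_length) per column
    data: bytes     # all column bytes in a single contiguous buffer


def pack(table: dict) -> PackedColumns:
    """Concatenate every column into one buffer and record where each lives."""
    metadata, chunks, offset = [], [], 0
    for name, column in table.items():
        raw = column.tobytes()
        metadata.append((name, column.typecode, offset, len(raw)))
        chunks.append(raw)
        offset += len(raw)
    return PackedColumns(metadata=metadata, data=b"".join(chunks))


def unpack(packed: PackedColumns) -> dict:
    """Rebuild the individual columns from the single buffer and its metadata."""
    table = {}
    for name, typecode, offset, nbytes in packed.metadata:
        column = array(typecode)
        column.frombytes(packed.data[offset:offset + nbytes])
        table[name] = column
    return table


# Round trip: many column buffers -> one buffer -> many column buffers.
table = {"a": array("q", [1, 2, 3]), "b": array("d", [0.5, 1.5, 2.5])}
packed = pack(table)
restored = unpack(packed)
```

The point of the design is that only one buffer plus a small metadata record has to be shipped or spilled, rather than one buffer per column.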
Is your feature request related to a problem? Please describe.
When shipping DataFrames over the wire or spilling them, it can be handy to pack them into a more compact single buffer first and then unpack them into multiple buffers at the other end.
Describe the solution you'd like
Recently this functionality was added at the C++ layer (#7096). It would be good to have bindings to this for Python and use this in relevant `serialize`/`deserialize` methods.
Describe alternatives you've considered
We could do this packing elsewhere, for example in Distributed (dask/distributed#3732), though that would not use the C++ implementation here. It also wouldn't solve this for other Python use cases.
Additional context
We've discussed potentially adding this to a config (#5311). Not sure if this is still needed given the newer C++ implementation.