Moving TestDeviceBuffer to pylibraft.common.device_ndarray
#1008
I really like this change - I think it's a solid improvement to the usability of the python bindings, while still not requiring cupy
Thanks @benfred! I'm thinking we should also provide a factory function that accepts the shape, order, and dtype and creates an empty numpy array (internally) to construct the device_ndarray:
dev_ndarr = device_ndarray.empty((50, 10), dtype=np.float32, order="C")
Obviously, there's a drawback to using the device_ndarray since it requires a host/device copy. However, I think that's okay because we are giving users a choice between ease of use (e.g. have the function automatically return results) and speed (e.g. users can still explicitly pass in the outputs to keep everything on device and avoid using the device_ndarray).
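To illustrate the factory idea, here is a minimal sketch of how an `empty` classmethod could wrap an internally created numpy array. `HostBackedArray` is a hypothetical stand-in, not pylibraft's class: it keeps a host copy where a real wrapper would copy into device memory, so it runs without a GPU. Only the `empty(shape, dtype, order)` call shape comes from the comment above; everything else is an assumption.

```python
import numpy as np


class HostBackedArray:
    """Hypothetical stand-in for a device array wrapper (not pylibraft's class).

    A real wrapper would copy the host data into device memory. This sketch
    keeps a host-side copy instead so that it runs without a GPU.
    """

    def __init__(self, host_array):
        # The copy stands in for the host-to-device transfer discussed above.
        # order="K" preserves the memory layout of the input array.
        self._data = np.asarray(host_array).copy(order="K")

    @classmethod
    def empty(cls, shape, dtype=np.float32, order="C"):
        # Create an uninitialized numpy array internally, then wrap it.
        return cls(np.empty(shape, dtype=dtype, order=order))

    @property
    def shape(self):
        return self._data.shape

    @property
    def dtype(self):
        return self._data.dtype

    def copy_to_host(self):
        # Stand-in for a device-to-host copy; returns a fresh numpy array.
        return self._data.copy(order="K")


out = HostBackedArray.empty((50, 10), dtype=np.float32, order="C")
print(out.shape, out.dtype)
```

The extra copy in the constructor is the cost mentioned in the comment: the factory trades one transfer for not having to build the buffer by hand.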
JFYI, I added the |
this looks good! I see your point about the extra h2d memory copy w/ device_ndarray.empty - but I agree that users can always provide pre-allocated arrays to get around this
Co-authored-by: Ben Frederickson <[email protected]>
@gpucibot merge
This allows the pylibraft APIs to process outputs in place or allocate and return new device memory for outputs in a way which benefits from __cuda_array_interface__ compliance. Ultimately, it allows us to stay decoupled from libraries like cupy while still affording the user the benefits of not having to explicitly allocate outputs up front.

Closes #1002
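
The "process outputs in place or allocate and return" behaviour described above can be sketched as an optional output argument. This is a hedged illustration only: `row_norms` and its `out` parameter are hypothetical names, and numpy arrays stand in for device memory so the example runs without a GPU. It is not taken from the pylibraft API.

```python
import numpy as np


def row_norms(x, out=None):
    """Hypothetical example: L2 norm of each row, with an optional output."""
    x = np.asarray(x, dtype=np.float64)
    if out is None:
        # Convenience path: allocate the output on the caller's behalf.
        out = np.empty(x.shape[0], dtype=np.float64)
    elif out.shape != (x.shape[0],):
        raise ValueError("out must have shape (n_rows,)")
    # Write the result into `out` in place, then return the same object.
    np.sqrt(np.einsum("ij,ij->i", x, x), out=out)
    return out


x = np.array([[3.0, 4.0], [6.0, 8.0]])

# Ease of use: the function allocates and returns the result.
allocated = row_norms(x)

# Speed: the caller supplies a pre-allocated buffer, which is filled in place.
preallocated = np.empty(2)
result = row_norms(x, out=preallocated)
```

Returning the buffer in both cases keeps the two call styles interchangeable for the caller.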