Add Cython/Python `copy_to_host` and `to_device` #268
Pad the size of the memoryview to work around a Cython issue with creating size 0 memoryviews, then trim the added length afterwards. This should allow us to pass size 0 memoryviews to `copy_to_host`.
This should avoid raising unnecessarily when the host and/or device pointers are `nullptr`.
Go ahead and pull these into Cython for simplicity. After all, we are basically just calling `cudaMemcpyAsync` at this point.
Creates an analogous method to `copy_to_host` on `DeviceNDArray`s.
Is that still not added or did I miss something?
Co-Authored-By: Peter Andreas Entschev <[email protected]>
049bba4 to 38c04b1
To better align with how other libraries handle streams, skip the synchronization step. Synchronization should happen anyway on the default stream (the one used if the user doesn't specify a stream). This also gives users who want to manage their own concurrency more control.
Add a `to_device` function that the static method uses.
Pushed some fixes. Please take a look and let me know your thoughts 🙂
One last change and the implementation looks good. Just needs some docstrings and this is ready to go.
Also make sure we return the exact same input provided to us (as opposed to a Cython memoryview).
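As a rough illustration of the "return the exact same input" point, here is a host-only Python sketch. The function name `copy_into` and its signature are hypothetical and are not part of this PR; the sketch also assumes byte-format buffers such as `bytes` and `bytearray`. It only shows the pattern of wrapping the caller's buffer in a view for writing while handing back the original object.

```python
def copy_into(src, out=None):
    """Copy the bytes of `src` into `out` and return `out` itself.

    Hypothetical host-only helper; no device memory is involved.
    """
    if out is None:
        # No destination supplied: allocate one of the right size.
        out = bytearray(len(src))

    # The view is an internal detail used only to write into `out`.
    view = memoryview(out)
    if len(view) != len(src):
        raise ValueError("destination size does not match source size")
    view[:] = src

    # Return the object the caller passed in, not the internal view.
    return out


buf = bytearray(3)
result = copy_into(b"abc", buf)
print(result is buf)
```

Returning the original object keeps its type intact, so a caller that passed a `bytearray` (or some other buffer object) gets that same object back instead of a memoryview wrapper.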
Added docstrings, merged with upstream, and did a little cleanup. Tests passed for me locally. Please let me know if there is anything else needed. Thanks all for the reviews! 😄
rerun tests
Exposes Cython/Python functions `copy_to_host` and `to_device` for copying between device and host, taking only a pointer and a `bytes`-like buffer. This should help users who have at least a device pointer and the number of bytes of their object to copy it over to host. Hopefully this will allow code constructing temporary device arrays to be removed. As a result, this should cut down on the performance penalty incurred by this construction.

Also adds a `copy_to_host` method to `DeviceBuffer` to match what Numba does with `DeviceNDArray`s. Further renames `frombytes` to `to_device` to match Numba. This should make it easier for Python users to leverage this functionality through duck-typing, thus avoiding handling `DeviceBuffer` and `DeviceNDArray` differently.

Due to changes at the Cython level, the `copy_to_host` function at the C++ level became redundant and has been dropped.
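The duck-typing benefit can be sketched without a GPU. In the snippet below, `HostBackedBuffer` is a hypothetical stand-in that keeps its bytes in host memory; it is not `DeviceBuffer`, and its method signatures are assumptions chosen only to mirror the `to_device` / `copy_to_host` naming described above. The helper `fetch_bytes` is likewise illustrative.

```python
class HostBackedBuffer:
    """Hypothetical stand-in for a device buffer (data stays on the host)."""

    def __init__(self, data):
        self._data = bytes(data)
        self.size = len(self._data)

    @staticmethod
    def to_device(b):
        # Mirrors the naming in this PR; a real implementation would
        # copy `b` into device memory instead of keeping it on the host.
        return HostBackedBuffer(b)

    def copy_to_host(self):
        # Return a fresh host-side copy of the stored bytes.
        return bytearray(self._data)


def fetch_bytes(obj):
    """Works with any object exposing a `copy_to_host()` method."""
    return bytes(obj.copy_to_host())


dbuf = HostBackedBuffer.to_device(b"abc")
print(fetch_bytes(dbuf))
```

Because `fetch_bytes` only relies on the presence of `copy_to_host`, the same call site could in principle accept any object that follows this naming, which is the point of aligning the method names across types.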