Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add frombytes to convert bytes-like to DeviceBuffer #253

Merged
merged 14 commits into from
Jan 27, 2020
Merged
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@

## New Features

- PR #253 Add `frombytes` to convert `bytes`-like to `DeviceBuffer`
- PR #252 Add `__sizeof__` method to `DeviceBuffer`

## Improvements
Expand Down
3 changes: 3 additions & 0 deletions python/rmm/_lib/device_buffer.pxd
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,9 @@ cdef class DeviceBuffer:
@staticmethod
cdef DeviceBuffer c_from_unique_ptr(unique_ptr[device_buffer] ptr)

@staticmethod
cdef DeviceBuffer c_frombytes(const unsigned char[::1] b,
uintptr_t stream=*)
cpdef bytes tobytes(self, uintptr_t stream=*)

cdef size_t c_size(self)
Expand Down
23 changes: 22 additions & 1 deletion python/rmm/_lib/device_buffer.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,8 @@ from cpython.bytes cimport PyBytes_FromStringAndSize, PyBytes_AS_STRING
from rmm._lib.lib cimport (cudaError_t, cudaSuccess,
cudaStream_t, cudaStreamSynchronize)

cimport cython


cdef class DeviceBuffer:

Expand Down Expand Up @@ -79,17 +81,36 @@ cdef class DeviceBuffer:
buf.c_obj = move(ptr)
return buf

@staticmethod
@cython.boundscheck(False)
cdef DeviceBuffer c_frombytes(const unsigned char[::1] b,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this handle memoryview objects in general? If so it would be nice if we could have an overloaded non-const version to handle things like bytearray or other non-readonly objects.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It handles memoryview objects as long as they are 1-D, contiguous, and of type unsigned char.

The const just means that we won't modify the underlying buffer (much like if we took const char* and were given char*). So any mutable object meeting the constraints above could be passed in as well. This just promises we won't modify the underlying data. IOW bytearray is permissible and we test for this below.

As to other memory types and layouts that don't meet these requirements, users could coerce them into something that meets these requirements (with a minimal amount of copying to ensure contiguous data). Decided not to do that for the user as I wanted them to know we are not preserving the type or layout of their data. Though they could take the resulting DeviceBuffer and layer this information back on top by constructing some other array (like CuPy or Numba) and adding this info to that object.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed with not implicitly making things contiguous at this level. I'm surprised that a bytearray is allowed for a const unsigned char[::1]. We previously ran into issues with using const vs non-const in handling numpy arrays that were marked read-only which is what prompted me to questioning an overload.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok cool.

Yeah I recall some cases where Cython didn't work as well with const memoryviews. Issue ( cython/cython#1772 ) comes to mind (though that's with fused types). Not sure if there were issues with const memoryviews on a specific type. Was this what you were thinking about?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: After discussion offline, we decided to add one more test using a read-only NumPy array, which has been included in commit ( 2234e5a ).

uintptr_t stream=0):
kkraus14 marked this conversation as resolved.
Show resolved Hide resolved
if b is None:
raise TypeError(
"Argument 'b' has incorrect type"
" (expected bytes, got NoneType)"
)

cdef uintptr_t p = <uintptr_t>&b[0]
cdef size_t s = len(b)
return DeviceBuffer(ptr=p, size=s, stream=stream)

@staticmethod
def frombytes(const unsigned char[::1] b, uintptr_t stream=0):
return DeviceBuffer.c_frombytes(b, stream)

cpdef bytes tobytes(self, uintptr_t stream=0):
cdef const device_buffer* dbp = self.c_obj.get()
cdef size_t s = dbp.size()
if s == 0:
return b""

cdef cudaStream_t c_stream = <cudaStream_t>stream
cdef bytes b = PyBytes_FromStringAndSize(NULL, s)
cdef void* p = <void*>PyBytes_AS_STRING(b)
cdef cudaStream_t c_stream
cdef cudaError_t err
with nogil:
c_stream = <cudaStream_t>stream
copy_to_host(dbp[0], p, c_stream)
err = cudaStreamSynchronize(c_stream)
if err != cudaSuccess:
Expand Down
45 changes: 45 additions & 0 deletions python/rmm/tests/test_rmm.py
Original file line number Diff line number Diff line change
Expand Up @@ -112,12 +112,57 @@ def test_rmm_device_buffer(size):
assert isinstance(s, bytes)
assert len(s) == len(b)

# Test conversion from bytes
b2 = rmm.DeviceBuffer.frombytes(s)
assert isinstance(b2, rmm.DeviceBuffer)
assert len(b2) == len(s)

# Test resizing
b.resize(2)
assert b.size == 2
assert b.capacity() >= b.size


@pytest.mark.parametrize(
"hb",
[
None,
"abc",
123,
b"",
np.ones((2,), "u2"),
np.ones((2, 2), "u1"),
np.ones(4, "u1")[::2],
b"abc",
bytearray(b"abc"),
memoryview(b"abc"),
np.asarray(memoryview(b"abc")),
np.arange(3, dtype="u1"),
],
)
def test_rmm_device_buffer_bytes_roundtrip(hb):
try:
mv = memoryview(hb)
except TypeError:
with pytest.raises(TypeError):
rmm.DeviceBuffer.frombytes(hb)
else:
if mv.format != "B":
with pytest.raises(ValueError):
rmm.DeviceBuffer.frombytes(hb)
elif len(mv.strides) != 1:
with pytest.raises(ValueError):
rmm.DeviceBuffer.frombytes(hb)
elif mv.strides[0] != 1:
with pytest.raises(ValueError):
rmm.DeviceBuffer.frombytes(hb)
else:
db = rmm.DeviceBuffer.frombytes(hb)
hb2 = db.tobytes()
mv2 = memoryview(hb2)
assert mv == mv2


def test_rmm_cupy_allocator():
cupy = pytest.importorskip("cupy")

Expand Down