Python bindings for cuda_async_memory_resource
#718
Conversation
cdef class CudaAsyncMemoryResource(DeviceMemoryResource):
    def __cinit__(self, device=None):
        self.c_obj.reset(
            new cuda_async_memory_resource()
What should failure look like here?
- Should we just let the C++ error propagate up and expose that directly?
- Do we want to wrap this call in a try..except and re-raise with more information?
- Do we want to call driverGetVersion() and duplicate the check for 11.2 in C++ and Python?
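To make the second option concrete, here is a minimal sketch: catch the RuntimeError that Cython surfaces from the C++ exception and re-raise it with a hint about the version requirement. The `make_async_resource` helper and the message wording are illustrative assumptions, not rmm's actual code.

```python
# Illustrative sketch only: `make_async_resource` stands in for the
# CudaAsyncMemoryResource constructor; the message text is hypothetical.

def make_async_resource(factory):
    """Call `factory` and re-raise failures with a more actionable message."""
    try:
        return factory()
    except RuntimeError as err:
        raise RuntimeError(
            "Failed to create cuda_async_memory_resource; cudaMallocAsync "
            "requires CUDA 11.2+ in both the runtime and the driver. "
            f"Original error: {err}"
        ) from err
```

The original exception is chained with `from err`, so the underlying CUDA error stays visible in the traceback.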
What does the C++ error look like if someone tries to create this on CUDA 11.0?
We may want to improve the message here: https://github.com/rapidsai/rmm/blob/branch-0.19/include/rmm/mr/device/cuda_async_memory_resource.hpp#L53 to say that it was compiled without support, instead of just the generic error message.
I think this part of the macro deals specifically with CUDA version < 11.2 -- @harrism any thoughts here on a possibly more informative error message? This will directly be propagated up to Python users.
"cudaMallocAsync not supported by the version of the CUDA Toolkit used for compilation"? I don't want to say "... used to compile RMM" since RMM is header-only.
Improved the error message based on your suggestion.
You changed the wrong error message. :)
Co-authored-by: Keith Kraus <[email protected]>
Thanks Ashwin! 😄 Had a couple of questions below 🙂
LGTM. Thanks Ashwin! 😄
python/rmm/tests/test_rmm.py
Outdated
@pytest.mark.skipif(
    rmm._cuda.gpu.runtimeGetVersion() < 11020,
I think technically we need to check both the runtime and driver version here. Someone could use a newer runtime with an older driver, for example, where the call would exist but would error at runtime.
I'm a bit confused but happy to make the change.
cudaMallocAsync depends on having both libcudart >= 11.2 and libcuda >= 11.2. If say you have libcudart == 11.2 and libcuda == 11.0, then https://github.com/rapidsai/rmm/blob/branch-0.19/include/rmm/mr/device/cuda_async_memory_resource.hpp#L49 would error at runtime that there isn't a new enough driver for the feature. If you had libcudart == 11.0 and libcuda == 11.2, then https://github.com/rapidsai/rmm/blob/branch-0.19/include/rmm/mr/device/cuda_async_memory_resource.hpp#L49 would error with an invalid DeviceAttribute since it doesn't exist in libcudart 11.0.
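Following this reasoning, a skip condition would need both versions to clear 11.2. The sketch below uses plain integer parameters in place of rmm._cuda.gpu.runtimeGetVersion() and driverGetVersion(); the helper name and version encoding are assumptions for illustration.

```python
CUDA_11_2 = 11020  # CUDA versions reported as 1000 * major + 10 * minor

def cuda_malloc_async_unsupported(runtime_version, driver_version):
    """Return True when cudaMallocAsync tests should be skipped.

    Both libcudart and libcuda must be >= 11.2; as described above, a
    mismatch in either direction only fails at runtime, so the guard
    must check both sides.
    """
    return runtime_version < CUDA_11_2 or driver_version < CUDA_11_2
```

A skipif mark would then pass this condition instead of checking the runtime version alone.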
cdef class CudaAsyncMemoryResource(DeviceMemoryResource):
    def __cinit__(self, device=None):
        self.c_obj.reset(
            new cuda_async_memory_resource()
You changed the wrong error message. :)
Co-authored-by: Mark Harris <[email protected]>
rerun tests
@gpucibot merge
rerun tests
for pxd_basename in files_to_preprocess:
    pxi_basename = os.path.splitext(pxd_basename)[0] + ".pxi"
    if CUDA_VERSION in cuda_version_to_pxi_dir:
        pxi_pathname = os.path.join(
            cwd,
            "rmm/_cuda",
            cuda_version_to_pxi_dir[CUDA_VERSION],
            pxi_basename,
        )
        pxd_pathname = os.path.join(cwd, "rmm/_cuda", pxd_basename)
        try:
            if filecmp.cmp(pxi_pathname, pxd_pathname):
                # files are the same, no need to copy
                continue
        except FileNotFoundError:
            # pxd_pathname doesn't exist yet
            pass
        shutil.copyfile(pxi_pathname, pxd_pathname)
Can we move the cuda version check outside of the loop and invert it to reduce nesting?
if CUDA_VERSION not in cuda_version_to_pxi_dir:
    raise TypeError(f"{CUDA_VERSION} is not supported.")
That would mean we always check, regardless of how many files we have to preprocess, so that might need to be accounted for. example: if len(files_to_preprocess) and CUDA_VERSION not in cuda_version_to_pxi_dir
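Putting both comments together, the loop might take the following shape. This is only a sketch reusing the names from the snippet above, wrapped in a hypothetical `preprocess` function for illustration; it is not the merged code.

```python
import filecmp
import os
import shutil

def preprocess(files_to_preprocess, cuda_version, cuda_version_to_pxi_dir, cwd):
    # Check once, up front, but only when there is actually work to do,
    # as suggested in the comment above.
    if files_to_preprocess and cuda_version not in cuda_version_to_pxi_dir:
        raise TypeError(f"{cuda_version} is not supported.")
    for pxd_basename in files_to_preprocess:
        pxi_basename = os.path.splitext(pxd_basename)[0] + ".pxi"
        pxi_pathname = os.path.join(
            cwd, "rmm/_cuda", cuda_version_to_pxi_dir[cuda_version], pxi_basename
        )
        pxd_pathname = os.path.join(cwd, "rmm/_cuda", pxd_basename)
        try:
            if filecmp.cmp(pxi_pathname, pxd_pathname):
                continue  # files are the same, no need to copy
        except FileNotFoundError:
            pass  # pxd_pathname doesn't exist yet
        shutil.copyfile(pxi_pathname, pxd_pathname)
```

Hoisting the membership test out of the loop body removes one level of nesting, and the `files_to_preprocess` guard preserves the current behavior of doing nothing (and raising nothing) when there are no files.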
Agreed that this is low hanging fruit to fix and we may as well tackle it now.
Woops, this merged before fixing this. Will raise an issue to tackle it in a follow up.
Ah sorry. I'll put in one tomorrow.
rerun tests
1 similar comment
rerun tests
Closes #701.