You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Is your feature request related to a problem? Please describe.
libcudf currently runs entirely on CUDA's default stream. Internally, this is managing using an rmm::cuda_stream_view stored as a global variable used everywhere. We would like to move towards enabling stream-ordered libcudf APIs. In order to enable this safely, we need to ensure that libcudf is properly using its default stream everywhere in the codebase.
Describe the solution you'd like
We should write a separate library that overrides existing CUDA functions (via dlsym) to throw an error if they are invoked using CUDA's default stream. We should then preload this library when running libcudf's test suite so that our unit tests can verify that we are properly passing cudf::default_stream_value through to everywhere that it needs to be in the code. We will need to make cudf's default stream configurable so that it can be set to a non-default value when running the tests.
Describe alternatives you've considered
None
The text was updated successfully, but these errors were encountered:
This PR ensures that cudf's default stream is properly passed to all kernel launches so that nothing implicitly runs on the CUDA default stream. It adds a small library that is built during the tests and overloads CUDA functions to throw an exception when usage of the default stream is detected. It also fixes all remaining usage of anything other than cudf's default stream (I fixed most of the issues in previous PRs, but I found a few others when finalizing this one).
Resolves#11929Resolves#11942
### Important notes for reviewers:
- **The changeset is deceptively large.** The vast majority of the changes are just a global find-and-replace of `cudf::get_default_stream()` for `cudf::default_stream_value`, as well as a few smaller fixes such as missing `CUDF_TEST_PROGRAM_MAIN` in a couple of tests and usage of `rmm::cuda_stream_default`. The meaningful changes are:
- The new default stream getter/setter in `default_stream.[hpp|cpp]`
- The addition of `cpp/tests/utilities/identify_stream_usage`
- The changes to the base testing fixture in `cpp/include/cudf_test/base_fixture.hpp` to inject the custom stream.
- The changes to CI in `ci/gpu/build.sh` to build and use the new library.
- This PR is a breaking change because it moves the default stream into the detail namespace. Going forward the default stream may only be accessed using the public accessor `cudf::get_default_stream()`. I have added a corresponding setter, but it is also in the detail namespace since I do not want to publicly support changing the default stream yet, only for the purpose of testing. Reviewers, please leave comments if you disagree with those choices.
- I have made getting and setting the default stream thread-safe, but there is still only a single stream. In multi-threaded applications we may want to support a stream per thread so that users could manually achieve PTDS with more fine-tuned control. Is this worthwhile? Even if it is, I'm inclined to wait for a subsequent PR to implement this unless someone feels strongly otherwise.
- I'm currently only overloading `cudaLaunchKernel`. I can add overloads for other functions as well, but I didn't want to go through the effort of overloading every possible API. If reviewers have a minimal set that they'd like to see overloaded, let me know. [I've included links to all the relevant pages of the CUDA runtime API in the identify_stream_usage.cu file](https://github.com/rapidsai/cudf/pull/11875/files#diff-0b2762207c27c080acd2114475c7a1c06377a7c18c4e9c3de60ecbdc82a4dc61R99) if someone wants to look through them.
Authors:
- Vyas Ramasubramani (https://github.com/vyasr)
Approvers:
- Jason Lowe (https://github.com/jlowe)
- Bradley Dice (https://github.com/bdice)
- Sevag H (https://github.com/sevagh)
- https://github.com/brandon-b-miller
- Jake Hemstad (https://github.com/jrhemstad)
- David Wendt (https://github.com/davidwendt)
URL: #11875
Is your feature request related to a problem? Please describe.
libcudf currently runs entirely on CUDA's default stream. Internally, this is managing using an
rmm::cuda_stream_view
stored as a global variable used everywhere. We would like to move towards enabling stream-ordered libcudf APIs. In order to enable this safely, we need to ensure that libcudf is properly using its default stream everywhere in the codebase.Describe the solution you'd like
We should write a separate library that overrides existing CUDA functions (via dlsym) to throw an error if they are invoked using CUDA's default stream. We should then preload this library when running libcudf's test suite so that our unit tests can verify that we are properly passing
cudf::default_stream_value
through to everywhere that it needs to be in the code. We will need to make cudf's default stream configurable so that it can be set to a non-default value when running the tests.Describe alternatives you've considered
None
The text was updated successfully, but these errors were encountered: