-
Notifications
You must be signed in to change notification settings - Fork 924
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEA] Ensure that streams passed to libcudf APIs are properly forwarded through #11943
Labels
2 - In Progress
Currently a work in progress
feature request
New feature or request
libcudf
Affects libcudf (C++/CUDA) code.
Milestone
Comments
vyasr
added
feature request
New feature or request
1 - On Deck
To be worked on next
labels
Oct 18, 2022
3 tasks
3 tasks
vyasr
added
2 - In Progress
Currently a work in progress
and removed
1 - On Deck
To be worked on next
labels
Mar 1, 2023
rapids-bot bot
pushed a commit
that referenced
this issue
Mar 14, 2023
This PR builds on #11875 and partially addresses #11943. This PR allows us to run all tests on a precise stream (the newly introduced `cudf::test::get_default_stream()`) and then verify that all CUDA APIs end up invoked on that stream. This implements the feature required in #11943, but to apply it universally across libcudf will require the API changes that will expose streams so I plan to make those changes incrementally after this PR is merged. The preload library is now compiled twice, once to overload `cudf::get_default_stream` and once to overload `cudf::test::get_default_stream`. For now there is still some manual coordination associated with determining which one should be used with a given test, but once #12451 is merged and we start running all tests via ctest instead of direct invocation of the test executables we can start encoding this information in the CMake configuration of the tests by associating the require environment variables directly with the test executable using `set_tests_properties`. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Robert Maynard (https://github.com/robertmaynard) - Ray Douglass (https://github.com/raydouglass) - Nghia Truong (https://github.com/ttnghia) URL: #12089
rapids-bot bot
pushed a commit
that referenced
this issue
Jun 5, 2023
For the purpose of verifying that streams are properly forwarded through all libcudf APIs, libcudf tests will be rewritten to use `cudf::test::get_default_stream()` (introduced in #12089) instead of `cudf::get_default_stream()`. By default, these are identical, so this change is typically a no-op, but when using the preload library features added in #12089 we will be able to use a custom (non CUDA-default) stream in tests and verify that it is the only stream used. This PR contains a subset of changes needed to existing test functionality without making any changes to libcudf public APIs. These changes are extracted from #12090. Contributes to #11943. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Nghia Truong (https://github.com/ttnghia) - Mark Harris (https://github.com/harrism) URL: #13506
I think we can close this now. The proposed approach has been implemented on the main branch of libcudf and #13629 demonstrates its usage across all of the copying APIs. We've decided not to use a feature branch but to instead modify libcudf APIs over time, so I'll open a new PR to describe ongoing work with APIs. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
2 - In Progress
Currently a work in progress
feature request
New feature or request
libcudf
Affects libcudf (C++/CUDA) code.
Is your feature request related to a problem? Please describe.
libcudf currently runs entirely on CUDA's default stream. Internally, this is managed using an
rmm::cuda_stream_view
stored as a global variable used everywhere. We would like to move towards enabling stream-ordered libcudf APIs. In order to enable this safely, we need to ensure that any API that accepts a stream correctly forwards this stream down rather than passingcudf::default_stream_value
through (or anything else likermm::cuda_stream_default
.Describe the solution you'd like
I would like to create and maintain a new feature branch of libcudf. On this branch, we can converting public libcudf APIs to take stream arguments a bit at a time. We should leverage the library written for #11942 to verify that the new streams are correctly being passed all the way down the call stack. To do so, requires the following:
cudf::test::get_default_stream
that returnscudf::get_default_stream
and is used everywhere in the tests where a stream argument is required.cudf::test::get_default_stream
inidentify_stream_usage
that, instead of returningcudf::get_default_stream
, will return a different stream (can be easily managed using a function-local static).identify_stream_usage
so that instead of throwing an error when the default stream is observed, it throws an error whenever anything other thancudf::test::get_default_stream()
is observed. This feature should be set as a compile-time flag so that we can compileidentify_stream_usage
twice, once where it expects to not see any of CUDA's default streams and once where it expects to only seecudf::test::get_default_stream()
.Once these steps are completed, we could work towards exposing streams by transitioning APIs one at a time on a feature branch. This approach requires maintaining an ongoing feature branch, but it saves us the need to do multiple rounds of converting tests to use detail functions and then back. It also allows us to accomplish even more precisely what #11942 intends since we can directly check for the usage of the desired stream rather than the current roundabout approach of checking for the default stream.
Describe alternatives you've considered
I considered trying to find some way to overload all cudf API calls in the tests to instead use some sort of patched version. The biggest issue here (aside from the technical challenges of finding a way to do it) is the fact that not every single cudf API actually accepts a stream, so the logic would need to be sophisticated enough to check the corresponding APIs. I do not think that level of complexity is desirable or warranted when the above solution would be fairly easy to maintain going forward.
Additional context
Addressing #9854 is closely related. Removing all default stream parameters from detail APIs (except for the detail APIs that will eventually become public APIs) will help prevent this kind of problem.
As we move towards more usage of streams, we are likely to write many APIs internally that create additional streams for work.
multibyte_split
is one example that already exists. As such, we may need to extend theidentify_stream_usage
library to include an allowlist and/or a blocklist of streams rather than just searching for a single stream (we may need both to say something like "expect stream X, but only error if detecting stream 0"). Alternatively (or perhaps in addition), we could add adisable_stream_checking
API that can be used on a per-test basis. Once we migrate to running all tests using ctest (rather than running the tests executables directly) we can just rely on configuring each test to skip stream validation if it needs toThe text was updated successfully, but these errors were encountered: