Fix stream usage in segmented_gather()
#9679
Apologies. I realize it's late to include this in I can move this to |
Codecov Report
@@ Coverage Diff @@
## branch-22.02 #9679 +/- ##
==============================================
Coverage ? 8.71%
==============================================
Files ? 119
Lines ? 24691
Branches ? 0
==============================================
Hits ? 2151
Misses ? 22540
Partials ? 0
|
Thank you for doing this work and finding this issue. |
Using the default stream should result in oversynchronization with non-default streams, not undersynchronization. |
I'll have to keep hunting for the list extraction problem. |
Rebased for |
Rerun tests |
@gpucibot merge |
This change has now been merged. The description was modified to indicate that corrupted output is unlikely. |
Fix `stream` arguments that were omitted in calls to functions where the argument has a default value and, in some cases, `mr` arguments that were omitted when creating returned objects. This cleanup is a follow-up to PR #9679. Almost all uses of the stream argument in libcudf functions are now cleaned up. Missing `mr` arguments might still need another cleanup pass.
Authors:
- Karthikeyan (https://github.com/karthikeyann)
Approvers:
- Vyas Ramasubramani (https://github.com/vyasr)
- https://github.com/nvdbaranec
URL: #9767
`detail::segmented_gather()` inadvertently uses `cuda_default_stream` in some parts of its implementation, while using the user-specified stream in others. This applies to the calls to `copy_range_in_place()`, `allocate_like()`, and `make_lists_column()`.

~~This might produce race conditions, which might explain NVIDIA/spark-rapids/issues/4060. It's a rare failure that's quite hard to reproduce.~~ This might lead to over-synchronization, though bad output is unlikely.

The commit here should sort this out by switching to the `detail` APIs corresponding to the calls above.
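
The pattern described above can be sketched in plain C++ without any CUDA dependency. This is only an illustration, not the libcudf implementation: `stream_view`, `default_stream`, `call_log`, `gather_before()`, `gather_after()`, and the simplified one-argument signatures are all hypothetical stand-ins, and only the helper names `allocate_like` and `copy_range_in_place` are borrowed from the description. Each helper just records the id of the stream it would have been enqueued on.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

namespace sketch {

// Hypothetical stand-in for a CUDA stream handle; id 0 plays the default stream.
struct stream_view {
  int id;
};
const stream_view default_stream{0};

// Each entry is the id of the stream a step was enqueued on.
using call_log = std::vector<int>;

namespace detail {
// Detail-style helpers: the stream is an explicit parameter with no default.
inline void allocate_like(call_log& log, stream_view stream) { log.push_back(stream.id); }
inline void copy_range_in_place(call_log& log, stream_view stream) { log.push_back(stream.id); }
}  // namespace detail

// Public-style wrappers (assumed shape): no stream parameter, so the work
// always lands on the default stream.
inline void allocate_like(call_log& log) { detail::allocate_like(log, default_stream); }
inline void copy_range_in_place(call_log& log) { detail::copy_range_in_place(log, default_stream); }

// Before the fix: the function's own work uses the caller's stream, but the
// helper calls go through the public wrappers and fall back to the default.
inline call_log gather_before(stream_view stream) {
  call_log log;
  log.push_back(stream.id);  // own work on the caller's stream
  allocate_like(log);        // default stream
  copy_range_in_place(log);  // default stream
  return log;
}

// After the fix: every helper call forwards the caller's stream.
inline call_log gather_after(stream_view stream) {
  call_log log;
  log.push_back(stream.id);
  detail::allocate_like(log, stream);
  detail::copy_range_in_place(log, stream);
  return log;
}

// True when every recorded step ran on the given stream.
inline bool uses_only(const call_log& log, stream_view stream) {
  return std::all_of(log.begin(), log.end(),
                     [stream](int id) { return id == stream.id; });
}

// Small self-check contrasting the two variants.
inline void run_demo() {
  const stream_view user_stream{7};
  assert(!uses_only(gather_before(user_stream), user_stream));
  assert(uses_only(gather_after(user_stream), user_stream));
}

}  // namespace sketch
```

Calling `sketch::run_demo()` checks that the "before" variant touches more than one stream while the "after" variant stays on the caller's stream throughout, which is the property the switch to the `detail` APIs is meant to restore.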