Fix multi-GPU hang on graph generation #1572
…ts integrated to raft)
…rs as a workaround for a NCCL bug
@rlratzel There is one more bug I am tracking, but you can run this on other machines to see how it works (and to check whether the current unresolved issue is DGX1-specific).
Codecov Report
@@            Coverage Diff             @@
##             branch-0.20   #1572   +/-   ##
==============================================
  Coverage             ?   59.98%
==============================================
  Files                ?       77
  Lines                ?     3369
  Branches             ?        0
==============================================
  Hits                 ?     2021
  Misses               ?     1348
  Partials             ?        0
Thanks! Nothing jumped out, so LGTM.
@gpucibot merge
## Description
This PR cleans up some `#include`s for Thrust. This is meant to help ease the transition to Thrust 1.17 when that is updated in rapids-cmake.

## Context
I opened a PR rapidsai/cudf#10489 that updates cuDF to Thrust 1.16. Notably, version 1.16 of Thrust reduced the number of internal header inclusions:

> [#1572](NVIDIA/thrust#1572) Removed several unnecessary header includes. Downstream projects may need to update their includes if they were relying on this behavior.

I spoke with @robertmaynard and he recommended making similar changes to clean up includes ("include what we use," in essence) to make sure we have compatibility with future versions of Thrust across all RAPIDS libraries.

This changeset also makes it more obvious where cugraph depends on `thrust/detail` headers.

Authors:
- Bradley Dice (https://github.com/bdice)

Approvers:
- Brad Rees (https://github.com/BradReesWork)
- Seunghwa Kang (https://github.com/seunghwak)

URL: #2310
Two bug fixes for multi-GPU graph creation.