Issues: NVIDIA/nccl
Issues list
NCCL Ignores Specified SOCKET_IFNAME Configuration on Worker Nodes in Multi-Node Setup
#1581
opened Jan 18, 2025 by rachid2198
NCCL_SOCKET_IFNAME has no effect during pytorch distributed training with multiple NICs
#1580
opened Jan 18, 2025 by hanruijiang
BusBW of 2-node tree-based Allreduce exceeds the theoretical limit
#1576
opened Jan 16, 2025 by JK-Jiagn
Potential group/collective lifetime management issue in profiler plugin
#1569
opened Jan 9, 2025 by wiryls
[Hopper/NVLINK4] Origin of failure of fabric manager manifested through NCCL-based codes
#1562
opened Jan 3, 2025 by vitduck
Broadcast: recvbuff (nil) is not a valid pointer (NCCL error)
#1558
opened Dec 27, 2024 by mobilejammer
Is it possible for NCCL to add a retry mechanism when a network flap happens?
#1557
opened Dec 27, 2024 by ProHuper
NCCL_GRAPH INFO log cannot be printed to the console or to a specified file
#1556
opened Dec 26, 2024 by lmhahatest
How does the frequency setting interface of NVML affect NCCL communication?
#1551
opened Dec 25, 2024 by lilaiyi