Specify torchvision in nv-ds-chat workflow (prevents errors with torch 2.6) #6982

Merged: 5 commits into master from loadams/ds-chat-fixes on Jan 30, 2025

Conversation

loadams
Collaborator

@loadams loadams commented Jan 29, 2025

Fixes #6984.

The workflow was pulling in the newly released torch 2.6, which caused CI failures. This PR keeps the workflow on torch 2.5 for now by specifying torchvision explicitly: installing torchvision later as a dependency was unintentionally pulling in torch 2.6 along with it.
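To illustrate why the two packages need to be pinned together, here is a minimal sketch that builds matching pip requirement strings. The helper name `pinned_requirements` and the lookup table are hypothetical and not part of this PR; the version pairs (torch 2.5 with torchvision 0.20, torch 2.6 with torchvision 0.21) follow the published torchvision compatibility table. The exact pins used in the workflow are not shown in this conversation.

```python
# Hypothetical lookup: torch minor release -> matching torchvision minor release.
# Pairs taken from the torchvision compatibility table; not from this PR's diff.
TORCHVISION_FOR_TORCH = {
    "2.5": "0.20",
    "2.6": "0.21",
}


def pinned_requirements(torch_minor: str) -> list[str]:
    """Return pip requirement strings that keep torch and torchvision in lockstep."""
    tv_minor = TORCHVISION_FOR_TORCH[torch_minor]
    # Pinning both packages in one install stops pip from upgrading torch
    # to satisfy a newer, unpinned torchvision.
    return [f"torch=={torch_minor}.*", f"torchvision=={tv_minor}.*"]


if __name__ == "__main__":
    # Arguments one could pass to `pip install` to stay on torch 2.5.
    print(" ".join(pinned_requirements("2.5")))
```

Passing both requirements to a single `pip install` lets the resolver see the torch constraint at the same time as torchvision, rather than discovering it after torchvision has already pulled in a newer torch.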

This PR also unsets NCCL_DEBUG to avoid a large printout if any errors occur.
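The workflow change itself is not shown here, but the same idea can be sketched in Python: run a command with `NCCL_DEBUG` removed from its environment so NCCL does not emit verbose logs. The helper `run_without_nccl_debug` is a hypothetical name used only for illustration.

```python
import os
import subprocess
import sys


def run_without_nccl_debug(cmd):
    """Run cmd in a copy of the current environment with NCCL_DEBUG removed."""
    env = dict(os.environ)
    # pop() with a default is a no-op when the variable is not set.
    env.pop("NCCL_DEBUG", None)
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=False)


if __name__ == "__main__":
    # The child process reports whether it can still see NCCL_DEBUG.
    probe = [sys.executable, "-c", "import os; print(os.environ.get('NCCL_DEBUG'))"]
    print(run_without_nccl_debug(probe).stdout.strip())
```

Copying the environment rather than mutating `os.environ` keeps the change scoped to the child process, which is roughly what `unset NCCL_DEBUG` does for the commands that follow it in a single shell step.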

@loadams loadams requested a review from tjruwase January 29, 2025 22:31
@loadams loadams changed the title Pin torch in nv-ds-chat workflow Specify torchvision in nv-ds-chat workflow (prevents errors with torch 2.6) Jan 30, 2025
@loadams loadams enabled auto-merge January 30, 2025 17:23
@loadams loadams added this pull request to the merge queue Jan 30, 2025
Merged via the queue into master with commit c963c21 Jan 30, 2025
11 checks passed
@loadams loadams deleted the loadams/ds-chat-fixes branch January 30, 2025 21:42
tjruwase pushed a commit that referenced this pull request Feb 6, 2025
…h 2.6) (#6982)

Signed-off-by: Olatunji Ruwase <[email protected]>
siqi654321 pushed a commit to siqi654321/DeepSpeed that referenced this pull request Feb 7, 2025
…h 2.6) (deepspeedai#6982)

Signed-off-by: siqi <[email protected]>
traincheck-team pushed a commit to traincheck-team/DeepSpeed that referenced this pull request Feb 9, 2025
…h 2.6) (deepspeedai#6982)

gyou2021 pushed a commit to gyou2021/DeepSpeed that referenced this pull request Feb 18, 2025
…h 2.6) (deepspeedai#6982)

Signed-off-by: gyou2021 <[email protected]>
gyou2021 pushed a commit to gyou2021/DeepSpeed that referenced this pull request Feb 28, 2025
…h 2.6) (deepspeedai#6982)

Signed-off-by: gyou2021 <[email protected]>
ys950902 pushed a commit to ys950902/DeepSpeed that referenced this pull request Mar 6, 2025
…h 2.6) (deepspeedai#6982)

Signed-off-by: yisheng <[email protected]>
Development

Successfully merging this pull request may close these issues.

nv-ds-chat CI test failure