[BE]: Update cudnn to 8.9.7.29 #120642
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/120642
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures, 4 Unrelated Failures
As of commit ece0cce with merge base 8bf9e99:
NEW FAILURES - The following jobs have failed:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
809a556 to 1b638fb
@pytorchbot merge
Merge failed
Reason: Approvers from one of the following sets are needed:
@pytorchbot merge
@pytorchbot merge
Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
Merge failed
Reason: 3 jobs have failed, first few of them are: trunk, linux-binary-manywheel, linux-binary-libtorch-cxx11-abi
Details for Dev Infra team: raised by workflow job
@pytorchbot merge
OK, the sad thing is that there is no nvidia-cudnn-cu11==8.9.7.29 package on PyPI, only 8.9.6 at the time of writing.
Let me ping cuDNN about it, as 8.9.7 is among the more well-tested releases and should be hosted at this point.
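A quick way to confirm whether a given version has been published is to look for its wheel in the package index listing. The sketch below is not part of this PR: the helper name `index_has_wheel` is hypothetical, and it assumes the listing contains standard wheel filenames, in which dashes in the distribution name are replaced by underscores.

```shell
# Hypothetical helper: succeeds if the index listing read from stdin
# mentions a wheel for package $1 at exact version $2.
index_has_wheel() {
  # Wheel filenames use underscores where the project name has dashes,
  # e.g. nvidia-cudnn-cu11 -> nvidia_cudnn_cu11-<version>-...
  prefix="$(printf '%s' "$1" | tr '-' '_')-$2-"
  # -F: match the prefix literally, so dots in the version are not wildcards.
  grep -qF -- "$prefix"
}

# Example usage (needs network access; assumes the PyPI simple index):
#   curl -fsSL https://pypi.org/simple/nvidia-cudnn-cu11/ \
#     | index_has_wheel nvidia-cudnn-cu11 8.9.7.29 \
#     && echo "available" || echo "not published"
```

Matching on the filename prefix rather than the bare version string avoids false positives from other packages or from versions that merely share leading digits.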
Yes, @malfet is correct. Both were opened in ~Dec. 2023, and we haven't received any updates yet. Unsure if @rgommers would know more.
Not much more. There is an active discussion on Discourse about issues with responses to PyPI support requests in general. I just commented a few days ago on that thread about issues with limit size requests - no response or activity yet. I'll note that the situation for JAX that I linked to in that comment seems even worse; they have been deleting their old releases for a while now to make space for new releases (which of course broke some users who pinned those old versions).
For PyTorch itself I think we've avoided hard release blockers so far (correct me if I'm wrong @malfet), but it really isn't a good situation that important projects like cuDNN cannot upload releases. The canonical discussion on this is "What to do about GPUs? (and the built distributions that support them)"; that never really got resolved. I am considering writing a larger new post about it later this week, depending on the urgency, but I really don't know if it'll help given how politicized that issue has become.
There aren't many PyPI admins, and most are volunteers. And the one active PSF staff member doesn't have much time allocated to this; it looks like they're trying to get a new paid support engineer funded by the PSF, but that may take a while to materialize.
It would be nice to update to the latest cuDNN that we can support for 2.3. What's the latest version that is feasible to support?
I can update our index to 8.9.7.29 from NVIDIA's PyPI; that should unblock both this change and the release.
@@ -5,11 +5,11 @@ if [[ ${CUDNN_VERSION} == 8 ]]; then
     mkdir tmp_cudnn
     pushd tmp_cudnn
     if [[ ${CUDA_VERSION:0:4} == "12.1" ]]; then
-        CUDNN_NAME="cudnn-linux-x86_64-8.9.2.26_cuda12-archive"
+        CUDNN_NAME="cudnn-linux-x86_64-8.9.7.29_cuda12-archive"
Please update only 11.8 cudnn version. At this point we want to upgrade only cuda 11.8 cudnn.
@atalman why do we not want to update cuDNN for the CUDA 12 builds? This would cause a divergence.
@ptrblck Sorry, yes, the issue exists in cudnn 11 but not 12, my bad. I do see cudnn 8.9.7.29 available at https://pypi.org/project/nvidia-cudnn-cu12/#files.
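The archive names in the diff above follow a visible pattern: the full cuDNN version, then the CUDA major version. A minimal sketch of deriving the name from those two values, so that the CUDA 11 and CUDA 12 builds cannot drift apart by accident, might look like the following. The helper `cudnn_archive_name` is hypothetical and is not in the install script, and the pattern is assumed to hold for CUDA majors other than the 12.1 case shown in the diff.

```shell
# Hypothetical helper: build the cuDNN archive name from
#   $1 = full cuDNN version (e.g. 8.9.7.29)
#   $2 = CUDA version       (e.g. 12.1)
cudnn_archive_name() {
  # Strip everything from the first dot onward: 12.1 -> 12
  cuda_major="${2%%.*}"
  printf 'cudnn-linux-x86_64-%s_cuda%s-archive\n' "$1" "$cuda_major"
}

cudnn_archive_name 8.9.7.29 12.1
# prints cudnn-linux-x86_64-8.9.7.29_cuda12-archive
```

Keeping the cuDNN version in one variable and passing it to a helper like this is one way to update every CUDA build in a single edit.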
878e885 to b132d79
@Skylion007 @malfet Here is the additional work required for cudnn: https://github.com/pytorch/builder/blob/main/CUDA_UPGRADE_GUIDE.MD#upgrade-cudnn-version-only
* Windows CUDA 12.4 changes. Reference: #1376
* Update cudnn to 8.9.7.29 to align with pytorch/pytorch#120642
@pytorchmergebot rebase -b main
@pytorchbot started a rebase job onto refs/remotes/origin/main. Check the current status here
Successfully rebased
b132d79 to dfae8ee
Closing in favor of #123475
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here
Successfully rebased
dfae8ee to ece0cce
Closing as v9 is in :)
Update cudnn to 8.9.7.29. We just updated the cudnn frontend, so we might as well. This release mostly has improvements for the cudnn flash attention implementation, which we are interested in exploring, such as in #115663.