Update cuda to 11.7.1 for gcc12 #8025

Conversation

aandvalenzuela
Contributor

@aandvalenzuela
Contributor Author

enable gpu

@cmsbuild
Contributor

A new Pull Request was created by @aandvalenzuela (Andrea Valenzuela) for branch IB/CMSSW_12_5_X/g12.

@cmsbuild, @smuzaffar, @aandvalenzuela, @iarspider can you please review it and eventually sign? Thanks.
@perrotta, @dpiparo, @qliphy, @rappoccio you are the release manager for this.
cms-bot commands are listed here

@aandvalenzuela
Contributor Author

aandvalenzuela commented Aug 17, 2022

test parameters:

  • full_cmssw = true

@aandvalenzuela
Contributor Author

please test

@cmsbuild
Contributor

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-fe996c/26871/summary.html
COMMIT: 2dbe72a
CMSSW: CMSSW_12_5_X_2022-08-15-1100/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/8025/26871/install.sh to create a dev area with all the needed externals and cmssw changes.

External Build

I found a compilation error when building:

+ chmod -Rf a+rX,u+w,g-w,o-w .
+ '[' 11.7 '!=' 11.5 ']'
+ echo 'Incompatible CUDA version in cudnn.spec!'
Incompatible CUDA version in cudnn.spec!
+ exit 1
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.kAS1km (%prep)


RPM build errors:
line 36: It's not recommended to have unversioned Obsoletes: Obsoletes: external+cudnn+8.3.3.40-cf861dcbac75797183058e767dd7f28a
Macro expanded in comment on line 331: %{pkginstroot}/lib64
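
The `%prep` failure above comes from a guard that compares the CUDA version provided by the toolchain (11.7) with the one the cudnn spec was written for (11.5). A minimal Python sketch of that kind of check follows. It assumes the comparison uses only the major.minor part of the version; the function names are hypothetical and this is not the code in cudnn.spec.

```python
def major_minor(version: str) -> str:
    """Return the 'major.minor' prefix of a dotted version string."""
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError(f"expected at least major.minor, got {version!r}")
    return ".".join(parts[:2])


def check_cuda_compat(toolkit_version: str, expected_major_minor: str) -> None:
    """Raise if the toolkit's major.minor differs from the expected one.

    Hypothetical stand-in for the shell test seen in the build log:
        [ 11.7 != 11.5 ]  ->  echo 'Incompatible CUDA version ...'; exit 1
    """
    found = major_minor(toolkit_version)
    if found != expected_major_minor:
        raise RuntimeError(
            f"Incompatible CUDA version: toolkit is {found}, "
            f"spec expects {expected_major_minor}"
        )
```

With these definitions, `check_cuda_compat("11.7.1", "11.5")` raises, which mirrors the `exit 1` in the log, while a toolkit version starting with 11.5 passes silently.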


@cmsbuild
Contributor

Pull request #8025 was updated.

@aandvalenzuela
Contributor Author

enable gpu

@aandvalenzuela
Contributor Author

test parameters:

  • full_cmssw = true

@aandvalenzuela
Contributor Author

please test

@cmsbuild
Contributor

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-fe996c/26881/summary.html
COMMIT: 8650f6b
CMSSW: CMSSW_12_5_X_2022-08-15-1100/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/8025/26881/install.sh to create a dev area with all the needed externals and cmssw changes.

External Build

I found a compilation warning when building. See details on the summary page.

@aandvalenzuela
Contributor Author

aandvalenzuela commented Aug 17, 2022

Although this updates cudnn to 8.4.1.50 for CUDA 11.7 (https://developer.nvidia.com/rdp/cudnn-archive), the correct download URL uses 11.6 as the default CUDA version (https://developer.nvidia.com/compute/cudnn/secure/8.4.1/local_installers/11.6/cudnn-linux-x86_64-8.4.1.50_cuda11.6-archive.tar.xz).

@cmsbuild
Contributor

Pull request #8025 was updated.

@aandvalenzuela
Contributor Author

enable gpu

@aandvalenzuela
Contributor Author

test parameters:

  • full_cmssw = true

@aandvalenzuela
Contributor Author

please test


- Source: https://developer.download.nvidia.com/compute/redist/cudnn/v%{cudnnver_maj}/local_installers/%{cudaver}/%{archive}.tar.xz
+ Source: https://developer.download.nvidia.com/compute/redist/cudnn/v%{cudnnver_maj}/local_installers/%{default_cudaver}/%{archive}.tar.xz
Contributor

ah ok, so cudnn is not yet available for cuda 11.7
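
To make the `Source:` line in this review thread concrete, here is a rough Python sketch of how the `%{...}` macros would expand into a download URL. The macro values are assumptions inferred from the cudnn 8.4.1.50 archive URL quoted earlier in the conversation, not values read from cudnn.spec, and `expand_macros` is a hypothetical helper rather than an RPM API. Whether the resulting URL resolves is not checked.

```python
import re


def expand_macros(template: str, macros: dict) -> str:
    """Expand %{name} placeholders; an unknown name raises KeyError."""
    return re.sub(r"%\{(\w+)\}", lambda m: macros[m.group(1)], template)


TEMPLATE = (
    "https://developer.download.nvidia.com/compute/redist/cudnn/"
    "v%{cudnnver_maj}/local_installers/%{default_cudaver}/%{archive}.tar.xz"
)

# Assumed values (not taken from the spec file).
MACROS = {
    "cudnnver_maj": "8.4.1",
    "default_cudaver": "11.6",
    "archive": "cudnn-linux-x86_64-8.4.1.50_cuda11.6-archive",
}

url = expand_macros(TEMPLATE, MACROS)
print(url)
```

The point of the sketch is only that the path component comes from a separate default CUDA version macro, so the spec can build against CUDA 11.7 while still fetching the archive published under 11.6.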

@cmsbuild
Contributor

-1

Failed Tests: Build
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-fe996c/26887/summary.html
COMMIT: f42acb1
CMSSW: CMSSW_12_5_X_2022-08-15-1100/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/8025/26887/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-fe996c/26887/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-fe996c/26887/git-merge-result

Build

I found a compilation error when building:

>> Compiling alpaka/serial /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_5_X_2022-08-15-1100/src/HeterogeneousCore/AlpakaTest/plugins/alpaka/TestAlgo.dev.cc
>> Compiling alpaka/serial edm plugin /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_5_X_2022-08-15-1100/src/HeterogeneousCore/AlpakaTest/plugins/alpaka/TestAlpakaProducer.cc
>> Compiling alpaka/serial edm plugin /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_5_X_2022-08-15-1100/src/HeterogeneousCore/AlpakaTest/plugins/alpaka/TestAlpakaTranscriber.cc
In file included from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cuda/11.7.1-bca409a491b5acecabc9e23fe23d7aa0/include/cuda_runtime.h:83,
                 from <command-line>:
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cuda/11.7.1-bca409a491b5acecabc9e23fe23d7aa0/include/crt/host_config.h:132:2: error: #error -- unsupported GNU version! gcc versions later than 11 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
  132 | #error -- unsupported GNU version! gcc versions later than 11 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
      |  ^~~~~
gmake: *** [tmp/el8_amd64_gcc12/src/HeterogeneousCore/AlpakaTest/plugins/HeterogeneousCoreAlpakaTestPluginsPortableSerialSync/alpaka/TestAlgo.dev.cc.o] Error 1
>> Cuda Device Link tmp/el8_amd64_gcc12/src/HeterogeneousCore/AlpakaTest/plugins/HeterogeneousCoreAlpakaTestPluginsPortableSerialSync/HeterogeneousCoreAlpakaTestPluginsPortableSerialSync_cudadlink.o 
nvlink fatal   : Could not open input file 'tmp/el8_amd64_gcc12/src/HeterogeneousCore/AlpakaTest/plugins/HeterogeneousCoreAlpakaTestPluginsPortableSerialSync/alpaka/TestAlgo.dev.cc.o' (target: sm_60)
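
The build error above is a host-compiler version guard: according to the message, the CUDA 11.7.1 headers reject GCC versions later than 11 unless the `-allow-unsupported-compiler` nvcc flag is used. A small Python sketch of the same decision logic is shown below. The real check is a preprocessor `#error` in `crt/host_config.h`; the function names and the version-string parsing here are illustrative assumptions.

```python
import re

# Taken from the error text: GCC versions later than 11 are rejected.
MAX_SUPPORTED_GCC_MAJOR = 11


def gcc_major(version: str) -> int:
    """Extract the major number from a version string such as '12.1.0'."""
    match = re.match(r"(\d+)", version.strip())
    if match is None:
        raise ValueError(f"cannot parse GCC version: {version!r}")
    return int(match.group(1))


def host_compiler_ok(gcc_version: str, allow_unsupported: bool = False) -> bool:
    """Accept GCC up to the supported major, or anything when overridden.

    allow_unsupported models the effect of nvcc's
    '-allow-unsupported-compiler' flag mentioned in the error message.
    """
    if allow_unsupported:
        return True
    return gcc_major(gcc_version) <= MAX_SUPPORTED_GCC_MAJOR
```

Under this model a GCC 12 toolchain fails the check by default, which matches the el8_amd64_gcc12 result in the log. The override only skips the check; as the message itself warns, it does not make the compiler supported.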


@fwyzard
Contributor

fwyzard commented Aug 17, 2022

See https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html

CUDA 11.7 Update 1 adds official support for RHEL 9, but it does not support GCC 12, only GCC 11.

See https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#system-requirements .

@fwyzard
Contributor

fwyzard commented Aug 17, 2022

As a side note, could you please tag @cms-sw/heterogeneous-l2 when there are changes to externals related to GPUs (cuda, rocm, alpaka, and indirectly eigen) ?

@smuzaffar
Contributor

> As a side note, could you please tag @cms-sw/heterogeneous-l2 when there are changes to externals related to GPUs (cuda, rocm, alpaka, and indirectly eigen) ?

Sure, I will see if the bot can do this automatically.

@smuzaffar
Contributor

> See https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html
>
> CUDA 11.7 Update 1 adds official support for RHEL 9, but it does not support GCC 12, only GCC 11.
>
> See https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#system-requirements .

This PR was not intended to be merged; it was only meant to check whether CUDA 11.7 supports GCC 12.

@smuzaffar
Contributor

Closing this, as CUDA support for GCC 12 is still missing. Also, cudnn does not yet support CUDA 11.7.

@smuzaffar smuzaffar closed this Aug 18, 2022