Update to CUDA 11.4.1 #7197
Update to CUDA 11.4.1 (11.4.20210728):
* CUDA runtime version 11.4.108
* NVIDIA drivers version 470.57.02

Add support for GCC 11 and clang 12.

See https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html .
A new Pull Request was created by @fwyzard (Andrea Bocci) for branch IB/CMSSW_12_1_X/master. @cmsbuild, @smuzaffar, @mrodozov, @iarspider can you please review it and eventually sign? Thanks.
please test
I expect building CMSSW may fail, so let's check that before testing on other architectures.
However, it might be worth checking a GCC 11 build in parallel.
please test for CMSSW_12_1_X/slc7_amd64_gcc11
-1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9af979/17474/summary.html
External Build
I found compilation error when building:
+ for FILE in '$FILES'
++ basename src/common.cpp
+ /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/cuda/11.4.1-274998c7f34eda4ce17dd26ec4ac9687/bin/nvcc -DALPAKA_ACC_GPU_CUDA_ENABLED -DCUPLA_STREAM_ASYNC_ENABLED=1 -DALPAKA_DEBUG=0 -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/cuda/11.4.1-274998c7f34eda4ce17dd26ec4ac9687/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/tbb/v2021.3.0-13eaf94bcafc2deaec6244d3257cd1bc/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/boost/1.75.0-4f799e0d654b83bad9b3c6c2ddd3197e/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/alpaka/0.6.0-238663bf1f8bb79dfcb50509657d997a/include -Iinclude -std=c++17 -O3 --generate-line-info --source-in-ptx --display-error-number --expt-relaxed-constexpr --extended-lambda -gencode 'arch=compute_60,code=[sm_60,compute_60]' -gencode 'arch=compute_70,code=[sm_70,compute_70]' -gencode 'arch=compute_75,code=[sm_75,compute_75]' -Wno-deprecated-gpu-targets -Xcudafe --diag_suppress=esa_on_defaulted_function_ignored --cudart shared -Xcompiler '-std=c++17 -O2 -pthread -fPIC -Wall -Wextra' -x cu -c src/common.cpp -o build/cuda/common.cpp.o
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/alpaka/0.6.0-238663bf1f8bb79dfcb50509657d997a/include/alpaka/event/EventGenericThreads.hpp: In instantiation of 'void alpaka::traits::generic::currentThreadWaitForDevice(const TDev&) [with TDev = alpaka::DevCpu]':
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/alpaka/0.6.0-238663bf1f8bb79dfcb50509657d997a/include/alpaka/dev/cpu/Wait.hpp:33:40: required from here
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/alpaka/0.6.0-238663bf1f8bb79dfcb50509657d997a/include/alpaka/event/EventGenericThreads.hpp:280:20: error: '__T30' was not declared in this scope
  280 | auto vQueues(dev.getAllQueues());
      | ~~~^~~~~~~~~~~~~~~~~~~~~
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.v2wR40 (%build)
-1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9af979/17477/summary.html
External Build
I found compilation error when building:
+ for FILE in '$FILES'
++ basename src/common.cpp
+ /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc11/external/cuda/11.4.1-cb1437ae9d2977d557f82280af898abe/bin/nvcc -DALPAKA_ACC_GPU_CUDA_ENABLED -DCUPLA_STREAM_ASYNC_ENABLED=1 -DALPAKA_DEBUG=0 -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc11/external/cuda/11.4.1-cb1437ae9d2977d557f82280af898abe/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc11/external/tbb/v2021.3.0-1a57fe4de5dfa06c29ac0428de2ef8c3/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc11/external/boost/1.75.0-9158d15931ebbfc4a7cd9cab205ad21f/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc11/external/alpaka/0.6.0-13bcccea34afaea478bd71c830f774b2/include -Iinclude -std=c++17 -O3 --generate-line-info --source-in-ptx --display-error-number --expt-relaxed-constexpr --extended-lambda -gencode 'arch=compute_60,code=[sm_60,compute_60]' -gencode 'arch=compute_70,code=[sm_70,compute_70]' -gencode 'arch=compute_75,code=[sm_75,compute_75]' -Wno-deprecated-gpu-targets -Xcudafe --diag_suppress=esa_on_defaulted_function_ignored --cudart shared -Xcompiler '-std=c++17 -O2 -pthread -fPIC -Wall -Wextra' -x cu -c src/common.cpp -o build/cuda/common.cpp.o
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc11/external/alpaka/0.6.0-13bcccea34afaea478bd71c830f774b2/include/alpaka/event/EventGenericThreads.hpp: In instantiation of 'void alpaka::traits::generic::currentThreadWaitForDevice(const TDev&) [with TDev = alpaka::DevCpu]':
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc11/external/alpaka/0.6.0-13bcccea34afaea478bd71c830f774b2/include/alpaka/dev/cpu/Wait.hpp:33:36: required from here
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc11/external/alpaka/0.6.0-13bcccea34afaea478bd71c830f774b2/include/alpaka/event/EventGenericThreads.hpp:280:20: error: '__T30' was not declared in this scope
  280 | auto vQueues(dev.getAllQueues());
      | ~~~^~~~~~~~~~~~~~~~~~~~~
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.ZSflaL (%build)
I forgot, we should retest with Alpaka 0.6.1.
please test
-1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9af979/17492/summary.html
External Build
I found compilation error when building:
+ chmod -Rf a+rX,u+w,g-w,o-w .
+ '[' 11.4 '!=' 11.2 ']'
+ echo 'Incompatible CUDA version in cudnn.spec!'
Incompatible CUDA version in cudnn.spec!
+ exit 1
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.x4gVyd (%prep)
RPM build errors:
Macro %rpmbuild_libdir defined but not used within scope
Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.x4gVyd (%prep)
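For readers unfamiliar with this failure: the trace shows the %prep step comparing 11.4 against 11.2 and exiting, so the cuDNN spec appears to pin the CUDA major.minor version it was packaged for. A minimal sketch of that kind of guard follows. The function name `check_cuda_version` and its arguments are hypothetical, and the actual contents of cudnn.spec are not shown in this thread.

```shell
#!/bin/sh
# Hypothetical guard, modelled on the trace in the log above.
# $1: full CUDA toolkit version (e.g. 11.4.1)
# $2: major.minor version the dependent package expects (e.g. 11.2)
check_cuda_version() {
  actual="$1"
  expected="$2"
  # Keep only the major.minor part of the toolkit version.
  actual_mm=$(printf '%s' "$actual" | cut -d. -f1,2)
  if [ "$actual_mm" != "$expected" ]; then
    echo 'Incompatible CUDA version in cudnn.spec!'
    return 1
  fi
  return 0
}
```

Under these assumptions, `check_cuda_version 11.4.1 11.2` prints the message and returns a non-zero status, which is what makes the %prep step fail until the expected version in the spec is updated.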
please test
Pull request #7197 was updated.
please test for CMSSW_12_1_X/slc7_amd64_gcc11
Force-pushed: c238307 to f3af69f
please test with #7204 for slc7_aarch64_gcc9
please test with #7204 for CMSSW_12_1_X/slc7_pcc64le_gcc9
please test with #7204 for CMSSW_12_1_X/slc7_ppc64le_gcc9
@smuzaffar, keep in mind that we do not want to merge CUDA 11.4.x in the GCC 9 and GCC 10 builds.
-1
Failed Tests: Build
Build
I found compilation error when building:
Entering library rule at CUDADataFormats/StdDictionaries
>> Building LCG reflex dict from header file src/CUDADataFormats/StdDictionaries/src/classes.h
>> Compiling LCG dictionary: tmp/slc7_amd64_gcc11/src/CUDADataFormats/StdDictionaries/src/CUDADataFormatsStdDictionaries/a/CUDADataFormatsStdDictionaries_xr.cc
>> Building shared library tmp/slc7_amd64_gcc11/src/CUDADataFormats/StdDictionaries/src/CUDADataFormatsStdDictionaries/libCUDADataFormatsStdDictionaries.so
/cvmfs/cms-ib.cern.ch/nweek-02692/slc7_amd64_gcc11/external/gcc/11.1.0/bin/../lib/gcc/x86_64-unknown-linux-gnu/11.1.1/../../../../x86_64-unknown-linux-gnu/bin/ld: cannot find -lHeterogeneousCoreCUDAUtilities
collect2: error: ld returned 1 exit status
gmake: *** [tmp/slc7_amd64_gcc11/src/CUDADataFormats/StdDictionaries/src/CUDADataFormatsStdDictionaries/libCUDADataFormatsStdDictionaries.so] Error 1
Leaving library rule at CUDADataFormats/StdDictionaries
>> Leaving Package CUDADataFormats/StdDictionaries
>> Package CUDADataFormats/StdDictionaries built
>> Entering Package CondFormats/DQMObjects
-1
Failed Tests: Build
Build
I found compilation error when building:
/scratch/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_ppc64le_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/Core/Solve.h(72): warning #20011-D: calling a __host__ function("Eigen::PartialPivLU< ::Eigen::Matrix > ::cols() const") from a __host__ __device__ function("Eigen::Solve< ::Eigen::PartialPivLU< ::Eigen::Matrix > , ::Eigen::CwiseNullaryOp< ::Eigen::internal::scalar_identity_op , ::Eigen::Matrix > > ::rows const") is not allowed
/scratch/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_ppc64le_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(412): warning #20011-D: calling a __host__ function("Eigen::internal::FixedInt<(int)-1> ::operator ()(int) const") from a __host__ __device__ function("Eigen::internal::partial_lu_impl ::unblocked_lu") is not allowed
/scratch/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_ppc64le_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(412): error: identifier "Eigen::fix<(int)-1> " is undefined in device code
/scratch/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_ppc64le_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(422): warning #20011-D: calling a __host__ function("Eigen::internal::FixedInt<(int)-1> ::operator ()(int) const") from a __host__ __device__ function("Eigen::internal::partial_lu_impl ::unblocked_lu") is not allowed
/scratch/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_ppc64le_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(422): error: identifier "Eigen::fix<(int)-1> " is undefined in device code
-1
Failed Tests: Build
Build
I found compilation error when building:
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/Core/Solve.h(72): warning #20011-D: calling a __host__ function("Eigen::PartialPivLU< ::Eigen::Matrix > ::cols() const") from a __host__ __device__ function("Eigen::Solve< ::Eigen::PartialPivLU< ::Eigen::Matrix > , ::Eigen::CwiseNullaryOp< ::Eigen::internal::scalar_identity_op , ::Eigen::Matrix > > ::rows const") is not allowed
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(412): warning #20011-D: calling a __host__ function("Eigen::internal::FixedInt<(int)-1> ::operator ()(int) const") from a __host__ __device__ function("Eigen::internal::partial_lu_impl ::unblocked_lu") is not allowed
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(412): error: identifier "Eigen::fix<(int)-1> " is undefined in device code
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(422): warning #20011-D: calling a __host__ function("Eigen::internal::FixedInt<(int)-1> ::operator ()(int) const") from a __host__ __device__ function("Eigen::internal::partial_lu_impl ::unblocked_lu") is not allowed
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/slc7_amd64_gcc900/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(422): error: identifier "Eigen::fix<(int)-1> " is undefined in device code
-1
Failed Tests: Build
The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see more details here:
Build
I found compilation error when building:
/home/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_aarch64_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/Core/Solve.h(72): warning #20011-D: calling a __host__ function("Eigen::PartialPivLU< ::Eigen::Matrix > ::cols() const") from a __host__ __device__ function("Eigen::Solve< ::Eigen::PartialPivLU< ::Eigen::Matrix > , ::Eigen::CwiseNullaryOp< ::Eigen::internal::scalar_identity_op , ::Eigen::Matrix > > ::rows const") is not allowed
/home/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_aarch64_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(412): warning #20011-D: calling a __host__ function("Eigen::internal::FixedInt<(int)-1> ::operator ()(int) const") from a __host__ __device__ function("Eigen::internal::partial_lu_impl ::unblocked_lu") is not allowed
/home/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_aarch64_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(412): error: identifier "Eigen::fix<(int)-1> " is undefined in device code
/home/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_aarch64_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(422): warning #20011-D: calling a __host__ function("Eigen::internal::FixedInt<(int)-1> ::operator ()(int) const") from a __host__ __device__ function("Eigen::internal::partial_lu_impl ::unblocked_lu") is not allowed
/home/cmsbuild/jenkins_b/workspace/ib-run-pr-tests/testBuildDir/slc7_aarch64_gcc9/external/eigen/f612df273689a19d25b45ca4f8269463207c4fee-db62d720cadbdace8308c32508ea54ae/include/eigen3/Eigen/src/LU/PartialPivLU.h(422): error: identifier "Eigen::fix<(int)-1> " is undefined in device code
Let's get this in for GCC 11.