[TEST] Revert "TF: Use consistent abseil version in TensorFlow" #8685

Closed

iarspider
Contributor

Reverts #8675

Compilation on ARM fails with:

ERROR: /data/cmsbld/jenkins_b/workspace/build-any-ib/w/BUILD/el9_aarch64_gcc11/external/tensorflow-sources/2.12.0-044978ac9e50a16db08877520775e250/tensorflow-2.12.0/tensorflow/core/kernels/BUILD:4491:18: Compiling tensorflow/core/kernels/depthtospace_op_gpu.cu.cc failed: (Exit 4): crosstool_wrapper_driver_is_not_gcc failed: error executing command 
  (cd /data/cmsbld/jenkins_b/workspace/build-any-ib/w/BUILD/el9_aarch64_gcc11/external/tensorflow-sources/2.12.0-044978ac9e50a16db08877520775e250/build/f3ae9d2dafcfe5643f514015f93d74cb/execroot/org_tensorflow && \
  exec env - \
    CUDA_TOOLKIT_PATH=/data/cmsbld/jenkins_b/workspace/build-any-ib/w/el9_aarch64_gcc11/external/cuda/11.8.0-a2db08b624dc4f754031142a3cdb3a9d \
    GCC_HOST_COMPILER_PATH=/data/cmsbld/jenkins_b/workspace/build-any-ib/w/el9_aarch64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/bin/gcc \
    LD_LIBRARY_PATH=<...>
    PATH=<...>
        PWD=/proc/self/cwd \
    PYTHON_BIN_PATH=/data/cmsbld/jenkins_b/workspace/build-any-ib/w/el9_aarch64_gcc11/external/python3/3.9.14-e432d7f95f9e22c05899c3205f44ed54/bin/python3 \
    PYTHON_LIB_PATH=/data/cmsbld/jenkins_b/workspace/build-any-ib/w/el9_aarch64_gcc11/external/python3/3.9.14-e432d7f95f9e22c05899c3205f44ed54/lib/python3.9/site-packages \
    TF2_BEHAVIOR=1 \
    TF_CUDA_COMPUTE_CAPABILITIES=compute_70,compute_72,compute_75 \
    TF_CUDA_PATHS=/data/cmsbld/jenkins_b/workspace/build-any-ib/w/el9_aarch64_gcc11/external/cuda/11.8.0-a2db08b624dc4f754031142a3cdb3a9d,/data/cmsbld/jenkins_b/workspace/build-any-ib/w/el9_aarch64_gcc11/external/cudnn/8.8.0.121-03351b77b26a1bb330695b75933b5212 \
    TF_CUDA_VERSION=11.8 \
    TF_SYSTEM_LIBS=absl_py,astor_archive,boringssl,com_github_grpc_grpc,com_google_protobuf,curl,cython,eigen_archive,flatbuffers,functools32_archive,gast_archive,gif,libjpeg_turbo,opt_einsum_archive,org_python_pypi_backports_weakref,org_sqlite,pasta,png,pybind11,six_archive,termcolor_archive,typing_extensions_archive,wrapt,zlib \
  external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF bazel-out/aarch64-opt/bin/tensorflow/core/kernels/_objs/depth_space_ops_gpu/depthtospace_op_gpu.cu.pic.d '-frandom-seed=bazel-out/aarch64-opt/bin/tensorflow/core/kernels/_objs/depth_space_ops_gpu/depthtospace_op_gpu.cu.pic.o' -DEIGEN_MPL2_ONLY '-DEIGEN_MAX_ALIGN_BYTES=64' -DHAVE_SYS_UIO_H -DTF_USE_SNAPPY -iquote . -iquote bazel-out/aarch64-opt/bin -iquote external/com_google_absl -iquote bazel-out/aarch64-opt/bin/external/com_google_absl -iquote external/nsync -iquote bazel-out/aarch64-opt/bin/external/nsync -iquote external/eigen_archive -iquote bazel-out/aarch64-opt/bin/external/eigen_archive -iquote external/com_google_protobuf -iquote bazel-out/aarch64-opt/bin/external/com_google_protobuf -iquote external/gif -iquote bazel-out/aarch64-opt/bin/external/gif -iquote external/libjpeg_turbo -iquote bazel-out/aarch64-opt/bin/external/libjpeg_turbo -iquote external/com_googlesource_code_re2 -iquote bazel-out/aarch64-opt/bin/external/com_googlesource_code_re2 -iquote external/farmhash_archive -iquote bazel-out/aarch64-opt/bin/external/farmhash_archive -iquote external/fft2d -iquote bazel-out/aarch64-opt/bin/external/fft2d -iquote external/highwayhash -iquote bazel-out/aarch64-opt/bin/external/highwayhash -iquote external/zlib -iquote bazel-out/aarch64-opt/bin/external/zlib -iquote external/local_config_cuda -iquote bazel-out/aarch64-opt/bin/external/local_config_cuda -iquote external/snappy -iquote bazel-out/aarch64-opt/bin/external/snappy -iquote external/double_conversion -iquote bazel-out/aarch64-opt/bin/external/double_conversion -iquote external/local_config_rocm -iquote bazel-out/aarch64-opt/bin/external/local_config_rocm -iquote external/local_config_tensorrt -iquote bazel-out/aarch64-opt/bin/external/local_config_tensorrt -Ibazel-out/aarch64-opt/bin/external/local_config_cuda/cuda/_virtual_includes/cuda_headers_virtual 
-Ibazel-out/aarch64-opt/bin/external/local_config_tensorrt/_virtual_includes/tensorrt_headers -isystem external/nsync/public -isystem bazel-out/aarch64-opt/bin/external/nsync/public -isystem external/eigen_archive/include/eigen3 -isystem bazel-out/aarch64-opt/bin/external/eigen_archive/include/eigen3 -isystem external/com_google_protobuf/include -isystem bazel-out/aarch64-opt/bin/external/com_google_protobuf/include -isystem external/gif/include -isystem bazel-out/aarch64-opt/bin/external/gif/include -isystem external/libjpeg_turbo/include -isystem bazel-out/aarch64-opt/bin/external/libjpeg_turbo/include -isystem external/farmhash_archive/src -isystem bazel-out/aarch64-opt/bin/external/farmhash_archive/src -isystem external/zlib/include -isystem bazel-out/aarch64-opt/bin/external/zlib/include -isystem external/local_config_cuda/cuda -isystem bazel-out/aarch64-opt/bin/external/local_config_cuda/cuda -isystem external/local_config_cuda/cuda/cuda/include -isystem bazel-out/aarch64-opt/bin/external/local_config_cuda/cuda/cuda/include -isystem external/local_config_rocm/rocm -isystem bazel-out/aarch64-opt/bin/external/local_config_rocm/rocm -isystem external/local_config_rocm/rocm/rocm/include -isystem bazel-out/aarch64-opt/bin/external/local_config_rocm/rocm/rocm/include -isystem external/local_config_rocm/rocm/rocm/include/rocrand -isystem bazel-out/aarch64-opt/bin/external/local_config_rocm/rocm/rocm/include/rocrand -isystem external/local_config_rocm/rocm/rocm/include/roctracer -isystem bazel-out/aarch64-opt/bin/external/local_config_rocm/rocm/rocm/include/roctracer -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -fPIC -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -fno-omit-frame-pointer -no-canonical-prefixes -fno-canonical-system-headers -DNDEBUG -g0 -O2 -ffunction-sections -fdata-sections -Wno-all -Wno-extra -Wno-deprecated -Wno-deprecated-declarations -Wno-ignored-attributes 
-Wno-array-bounds -Wunused-result '-Werror=unused-result' -Wswitch '-Werror=switch' '-Wno-error=unused-but-set-variable' -DAUTOLOAD_DYNAMIC_KERNELS '-march=armv8-a' -mno-outline-atomics -Wno-sign-compare '-std=c++17' '-std=c++17' -x cuda '-DGOOGLE_CUDA=1' '--cuda-include-ptx=sm_70' '--cuda-gpu-arch=sm_70' '--cuda-include-ptx=sm_72' '--cuda-gpu-arch=sm_72' '--cuda-include-ptx=sm_75' '--cuda-gpu-arch=sm_75' '-Xcuda-fatbinary=--compress-all' -DEIGEN_AVOID_STL_ARRAY -Iexternal/gemmlowp -Wno-sign-compare '-ftemplate-depth=900' -fno-exceptions '-DGOOGLE_CUDA=1' '-DTENSORFLOW_USE_NVCC=1' -pthread '-nvcc_options=relaxed-constexpr' '-nvcc_options=ftz=true' -c tensorflow/core/kernels/depthtospace_op_gpu.cu.cc -o bazel-out/aarch64-opt/bin/tensorflow/core/kernels/_objs/depth_space_ops_gpu/depthtospace_op_gpu.cu.pic.o)
# Configuration: 99ba46c2452aa9a5d5b939673c664313b29f97d3839647b88f01d4b3a7436dae
# Execution platform: @local_execution_config_platform//:platform
external/eigen_archive/include/eigen3/unsupported/Eigen/CXX11/../../../Eigen/src/Core/util/IntegralConstant.h(187): warning #1835-D: attribute "__host__" does not apply here

external/com_google_absl/absl/status/status.h(796): warning #2810-D: ignoring return value type with "nodiscard" attribute

./tensorflow/tsl/platform/file_system.h(574): warning #611-D: overloaded virtual function "tsl::FileSystem::FilesExist" is only partially overridden in class "tsl::WrappedFileSystem"

./tensorflow/tsl/platform/file_system.h(574): warning #611-D: overloaded virtual function "tsl::FileSystem::CreateDir" is only partially overridden in class "tsl::WrappedFileSystem"

./tensorflow/tsl/platform/env.h(498): warning #611-D: overloaded virtual function "tsl::Env::RegisterFileSystem" is only partially overridden in class "tsl::EnvWrapper"

external/eigen_archive/include/eigen3/unsupported/Eigen/CXX11/../../../Eigen/src/Core/util/IntegralConstant.h(187): warning #1835-D: attribute "__host__" does not apply here

external/com_google_absl/absl/status/status.h(796): warning #2810-D: ignoring return value type with "nodiscard" attribute

./tensorflow/tsl/platform/file_system.h(574): warning #611-D: overloaded virtual function "tsl::FileSystem::FilesExist" is only partially overridden in class "tsl::WrappedFileSystem"

./tensorflow/tsl/platform/file_system.h(574): warning #611-D: overloaded virtual function "tsl::FileSystem::CreateDir" is only partially overridden in class "tsl::WrappedFileSystem"

./tensorflow/tsl/platform/env.h(498): warning #611-D: overloaded virtual function "tsl::Env::RegisterFileSystem" is only partially overridden in class "tsl::EnvWrapper"

external/eigen_archive/include/eigen3/unsupported/Eigen/CXX11/../../../Eigen/src/Core/util/IntegralConstant.h(187): warning #1835-D: attribute "__host__" does not apply here

external/com_google_absl/absl/status/status.h(796): warning #2810-D: ignoring return value type with "nodiscard" attribute

./tensorflow/tsl/platform/file_system.h(574): warning #611-D: overloaded virtual function "tsl::FileSystem::FilesExist" is only partially overridden in class "tsl::WrappedFileSystem"

./tensorflow/tsl/platform/file_system.h(574): warning #611-D: overloaded virtual function "tsl::FileSystem::CreateDir" is only partially overridden in class "tsl::WrappedFileSystem"

./tensorflow/tsl/platform/env.h(498): warning #611-D: overloaded virtual function "tsl::Env::RegisterFileSystem" is only partially overridden in class "tsl::EnvWrapper"

external/eigen_archive/include/eigen3/unsupported/Eigen/CXX11/../../../Eigen/src/Core/util/IntegralConstant.h(187): warning #1835-D: attribute "__host__" does not apply here

external/com_google_absl/absl/status/status.h(796): warning #2810-D: ignoring return value type with "nodiscard" attribute

./tensorflow/tsl/platform/errors.h(143): warning #2810-D: ignoring return value type with "nodiscard" attribute

./tensorflow/tsl/platform/statusor_internals.h(142): warning #2810-D: ignoring return value type with "nodiscard" attribute

./tensorflow/tsl/platform/statusor_internals.h(152): warning #2810-D: ignoring return value type with "nodiscard" attribute

./tensorflow/tsl/platform/file_system.h(574): warning #611-D: overloaded virtual function "tsl::FileSystem::FilesExist" is only partially overridden in class "tsl::WrappedFileSystem"

./tensorflow/tsl/platform/file_system.h(574): warning #611-D: overloaded virtual function "tsl::FileSystem::CreateDir" is only partially overridden in class "tsl::WrappedFileSystem"

./tensorflow/tsl/platform/env.h(498): warning #611-D: overloaded virtual function "tsl::Env::RegisterFileSystem" is only partially overridden in class "tsl::EnvWrapper"

/data/cmsbld/jenkins_b/workspace/build-any-ib/w/el9_aarch64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/bin/../lib/gcc/aarch64-redhat-linux-gnu/11.4.1/include/arm_neon.h(38): error: identifier "__Int8x8_t" is undefined

/data/cmsbld/jenkins_b/workspace/build-any-ib/w/el9_aarch64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/bin/../lib/gcc/aarch64-redhat-linux-gnu/11.4.1/include/arm_neon.h(39): error: identifier "__Int16x4_t" is undefined

/data/cmsbld/jenkins_b/workspace/build-any-ib/w/el9_aarch64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/bin/../lib/gcc/aarch64-redhat-linux-gnu/11.4.1/include/arm_neon.h(40): error: identifier "__Int32x2_t" is undefined

/data/cmsbld/jenkins_b/workspace/build-any-ib/w/el9_aarch64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/bin/../lib/gcc/aarch64-redhat-linux-gnu/11.4.1/include/arm_neon.h(41): error: identifier "__Int64x1_t" is undefined

/data/cmsbld/jenkins_b/workspace/build-any-ib/w/el9_aarch64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/bin/../lib/gcc/aarch64-redhat-linux-gnu/11.4.1/include/arm_neon.h(42): error: identifier "__Float16x4_t" is undefined

/data/cmsbld/jenkins_b/workspace/build-any-ib/w/el9_aarch64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/bin/../lib/gcc/aarch64-redhat-linux-gnu/11.4.1/include/arm_neon.h(43): error: identifier "__Float32x2_t" is undefined

/data/cmsbld/jenkins_b/workspace/build-any-ib/w/el9_aarch64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/bin/../lib/gcc/aarch64-redhat-linux-gnu/11.4.1/include/arm_neon.h(44): error: identifier "__Poly8x8_t" is undefined

/data/cmsbld/jenkins_b/workspace/build-any-ib/w/el9_aarch64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/bin/../lib/gcc/aarch64-redhat-linux-gnu/11.4.1/include/arm_neon.h(45): error: identifier "__Poly16x4_t" is undefined

<... for every NEON intrinsic ...>

@cmsbuild
Contributor

cmsbuild commented Sep 6, 2023

A new Pull Request was created by @iarspider for branch IB/CMSSW_13_3_X/master.

@cmsbuild, @smuzaffar, @aandvalenzuela, @iarspider can you please review it and eventually sign? Thanks.
@rappoccio, @antoniovilela, @sextonkennedy you are the release manager for this.
cms-bot commands are listed here

@iarspider
Contributor Author

please test for el8_aarch64_gcc11

@smuzaffar
Contributor

why not just apply the third_party/absl/com_google_absl_fix_mac_and_nvcc_build.patch patch?

@iarspider
Contributor Author

> why not just apply the third_party/absl/com_google_absl_fix_mac_and_nvcc_build.patch patch?

I am not sure if it will apply as it is (the issue is supposedly fixed).

@smuzaffar
Contributor

Maybe not the complete patch, but its absl/base/config.h hunk contains a fix for aarch64 (maybe we need that):

 #ifdef ABSL_INTERNAL_HAVE_ARM_NEON
 #error ABSL_INTERNAL_HAVE_ARM_NEON cannot be directly set
-#elif defined(__ARM_NEON)
+#elif defined(__ARM_NEON) && !defined(__CUDACC__)
 #define ABSL_INTERNAL_HAVE_ARM_NEON 1
 #endif
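
For context, a minimal sketch of the pattern that hunk relies on: the NEON code path is gated on a macro that is set only when `__ARM_NEON` is defined and `__CUDACC__` (defined by nvcc for CUDA sources) is not, so a CUDA compilation pass never includes `arm_neon.h`. `DEMO_HAVE_ARM_NEON` and `sum4` are hypothetical names for illustration, not absl's actual code, and this is not a claim that the guard alone is enough for the TensorFlow build.

```cpp
#include <cassert>

// Hypothetical stand-in for ABSL_INTERNAL_HAVE_ARM_NEON (not the real absl macro):
// set only when the target has NEON and the file is not being compiled as CUDA.
#if defined(__ARM_NEON) && !defined(__CUDACC__)
#define DEMO_HAVE_ARM_NEON 1
#endif

#ifdef DEMO_HAVE_ARM_NEON
#include <arm_neon.h>  // never included during a CUDA compilation pass
#endif

// Sums four floats: NEON when the guard allows it, a scalar loop otherwise.
inline float sum4(const float* v) {
  assert(v != nullptr);
#ifdef DEMO_HAVE_ARM_NEON
  float32x4_t q = vld1q_f32(v);
  float32x2_t h = vadd_f32(vget_low_f32(q), vget_high_f32(q));  // {v0+v2, v1+v3}
  h = vpadd_f32(h, h);  // pairwise add: both lanes now hold the total
  return vget_lane_f32(h, 0);
#else
  float s = 0.0f;
  for (int i = 0; i < 4; ++i) s += v[i];
  return s;
#endif
}
```

Callers use `sum4` the same way on every target; only the guard decides which branch is compiled.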

@smuzaffar
Contributor

cms-externals/tensorflow#12 was done to make sure we use a consistent version of absl. If reverting cms-externals/tensorflow#12 works for aarch64, then try updating cmsdist/absl to use the same commit as used by TF.

@iarspider
Contributor Author

Will do.

@cmsbuild
Contributor

cmsbuild commented Sep 6, 2023

-1

Failed Tests: Build
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4443f2/34636/summary.html
COMMIT: 73d3fea
CMSSW: CMSSW_13_3_X_2023-08-31-2300/el8_aarch64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/8685/34636/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4443f2/34636/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4443f2/34636/git-merge-result

Build

I found compilation error when building:

>> Building edm plugin tmp/el8_aarch64_gcc11/src/RecoMET/METPUSubtraction/plugins/RecoMETMETPUSubtraction_plugins/libRecoMETMETPUSubtraction_plugins.so
lto-wrapper: warning: using serial compilation of 4 LTRANS jobs
/data/cmsbld/jenkins_a/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/bin/../lib/gcc/aarch64-redhat-linux-gnu/11.4.1/../../../../aarch64-redhat-linux-gnu/bin/ld.bfd: tmp/el8_aarch64_gcc11/src/RecoMET/METPUSubtraction/plugins/RecoMETMETPUSubtraction_plugins/cc4qAELI.ltrans0.ltrans.o: in function `DeepMETProducer::DeepMETProducer(edm::ParameterSet const&, tensorflow::SessionCache const*)':
:(.text+0x75d8): undefined reference to `tensorflow::TensorShapeBase::TensorShapeBase(absl::lts_20230125::Span)'
/data/cmsbld/jenkins_a/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/bin/../lib/gcc/aarch64-redhat-linux-gnu/11.4.1/../../../../aarch64-redhat-linux-gnu/bin/ld.bfd: :(.text+0x75fc): undefined reference to `tensorflow::TensorShapeBase::TensorShapeBase(absl::lts_20230125::Span)'
collect2: error: ld returned 1 exit status
gmake: *** [tmp/el8_aarch64_gcc11/src/RecoMET/METPUSubtraction/plugins/RecoMETMETPUSubtraction_plugins/libRecoMETMETPUSubtraction_plugins.so] Error 1
Leaving library rule at src/RecoMET/METPUSubtraction/plugins
Entering library rule at RecoMET/METPUSubtraction
>> Compiling  /data/cmsbld/jenkins_a/workspace/ib-run-pr-tests/CMSSW_13_3_X_2023-08-31-2300/src/RecoMET/METPUSubtraction/src/DeepMETHelper.cc
>> Compiling  /data/cmsbld/jenkins_a/workspace/ib-run-pr-tests/CMSSW_13_3_X_2023-08-31-2300/src/RecoMET/METPUSubtraction/src/MvaMEtUtilities.cc


@smuzaffar
Contributor

please test for el8_aarch64_gcc11

@cmsbuild
Contributor

cmsbuild commented Sep 6, 2023

Pull request #8685 was updated.

@cmsbuild
Contributor

cmsbuild commented Sep 6, 2023

-1

Failed Tests: Build
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4443f2/34637/summary.html
COMMIT: c6c0249
CMSSW: CMSSW_13_3_X_2023-08-31-2300/el8_aarch64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/8685/34637/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4443f2/34637/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4443f2/34637/git-merge-result

Build

I found compilation error when building:

>> Building edm plugin tmp/el8_aarch64_gcc11/src/RecoMET/METPUSubtraction/plugins/RecoMETMETPUSubtraction_plugins/libRecoMETMETPUSubtraction_plugins.so
lto-wrapper: warning: using serial compilation of 4 LTRANS jobs
/data/cmsbld/jenkins_a/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/bin/../lib/gcc/aarch64-redhat-linux-gnu/11.4.1/../../../../aarch64-redhat-linux-gnu/bin/ld.bfd: tmp/el8_aarch64_gcc11/src/RecoMET/METPUSubtraction/plugins/RecoMETMETPUSubtraction_plugins/ccKVMYdx.ltrans0.ltrans.o: in function `DeepMETProducer::DeepMETProducer(edm::ParameterSet const&, tensorflow::SessionCache const*)':
:(.text+0x75d8): undefined reference to `tensorflow::TensorShapeBase::TensorShapeBase(absl::lts_20230125::Span)'
/data/cmsbld/jenkins_a/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/bin/../lib/gcc/aarch64-redhat-linux-gnu/11.4.1/../../../../aarch64-redhat-linux-gnu/bin/ld.bfd: :(.text+0x75fc): undefined reference to `tensorflow::TensorShapeBase::TensorShapeBase(absl::lts_20230125::Span)'
collect2: error: ld returned 1 exit status
gmake: *** [tmp/el8_aarch64_gcc11/src/RecoMET/METPUSubtraction/plugins/RecoMETMETPUSubtraction_plugins/libRecoMETMETPUSubtraction_plugins.so] Error 1
Leaving library rule at src/RecoMET/METPUSubtraction/plugins
Entering library rule at RecoMET/METPUSubtraction
>> Compiling  /data/cmsbld/jenkins_a/workspace/ib-run-pr-tests/CMSSW_13_3_X_2023-08-31-2300/src/RecoMET/METPUSubtraction/src/DeepMETHelper.cc
>> Compiling  /data/cmsbld/jenkins_a/workspace/ib-run-pr-tests/CMSSW_13_3_X_2023-08-31-2300/src/RecoMET/METPUSubtraction/src/MvaMEtUtilities.cc


@iarspider
Contributor Author

The TensorFlow build was successful, but CMSSW failed to build (because both the absl version and the workaround were reverted).
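
The undefined reference in the test log names `absl::lts_20230125::Span`: the version tag is part of the type, so a library built against one absl version does not export the symbol that a caller compiled against another version asks for. A minimal sketch of that mechanism, with a hypothetical `demo` namespace and made-up tags `lts_old` and `lts_new` (not absl's headers):

```cpp
#include <type_traits>

// Hypothetical stand-in for absl; the version tags are made up for illustration.
namespace demo {
// Version the "library" was built against. Because the namespace is inline,
// demo::Span names demo::lts_old::Span.
inline namespace lts_old {
struct Span { int size; };
}
// Version the "caller" headers would select. Kept non-inline here so that
// demo::Span stays unambiguous inside a single translation unit.
namespace lts_new {
struct Span { int size; };
}
}  // namespace demo

// "Library" function: its signature, and so its mangled name, embeds lts_old.
constexpr int rank_of(demo::Span s) { return s.size; }

// The unversioned spelling resolves to the inline version...
static_assert(std::is_same<demo::Span, demo::lts_old::Span>::value,
              "demo::Span is the lts_old type");
// ...while the other version is an unrelated type with no conversion, so a
// caller holding lts_new::Span cannot use rank_of(lts_old::Span).
static_assert(!std::is_same<demo::lts_new::Span, demo::lts_old::Span>::value,
              "versions are distinct types");
static_assert(!std::is_convertible<demo::lts_new::Span, demo::Span>::value,
              "no implicit conversion between versions");
```

Inside one translation unit the mismatch shows up at compile time; across separately built libraries it surfaces only at link time, as in the log above.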

@smuzaffar
Contributor

Why haven't we tried #8685 (comment) yet?

@iarspider
Contributor Author

Working on it.

@smuzaffar
Contributor

> Working on it.

Are there any blockers for this? This should not have taken 2 days.

@cmsbuild
Contributor

cmsbuild commented Sep 8, 2023

Pull request #8685 was updated.

@iarspider
Contributor Author

please test for el8_aarch64_gcc11

@cmsbuild
Contributor

cmsbuild commented Sep 8, 2023

-1

Failed Tests: Build
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4443f2/34658/summary.html
COMMIT: 828d60f
CMSSW: CMSSW_13_3_X_2023-08-31-2300/el8_aarch64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/8685/34658/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4443f2/34658/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4443f2/34658/git-merge-result

Build

I found compilation error when building:

>> Building edm plugin tmp/el8_aarch64_gcc11/src/RecoMET/METPUSubtraction/plugins/RecoMETMETPUSubtraction_plugins/libRecoMETMETPUSubtraction_plugins.so
lto-wrapper: warning: using serial compilation of 4 LTRANS jobs
/data/cmsbld/jenkins_a/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/bin/../lib/gcc/aarch64-redhat-linux-gnu/11.4.1/../../../../aarch64-redhat-linux-gnu/bin/ld.bfd: tmp/el8_aarch64_gcc11/src/RecoMET/METPUSubtraction/plugins/RecoMETMETPUSubtraction_plugins/ccukJAuz.ltrans0.ltrans.o: in function `DeepMETProducer::DeepMETProducer(edm::ParameterSet const&, tensorflow::SessionCache const*)':
:(.text+0x75d8): undefined reference to `tensorflow::TensorShapeBase::TensorShapeBase(absl::lts_20230125::Span)'
/data/cmsbld/jenkins_a/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/bin/../lib/gcc/aarch64-redhat-linux-gnu/11.4.1/../../../../aarch64-redhat-linux-gnu/bin/ld.bfd: :(.text+0x75fc): undefined reference to `tensorflow::TensorShapeBase::TensorShapeBase(absl::lts_20230125::Span)'
collect2: error: ld returned 1 exit status
gmake: *** [tmp/el8_aarch64_gcc11/src/RecoMET/METPUSubtraction/plugins/RecoMETMETPUSubtraction_plugins/libRecoMETMETPUSubtraction_plugins.so] Error 1
Leaving library rule at src/RecoMET/METPUSubtraction/plugins
Entering library rule at RecoMET/METPUSubtraction
>> Compiling  /data/cmsbld/jenkins_a/workspace/ib-run-pr-tests/CMSSW_13_3_X_2023-08-31-2300/src/RecoMET/METPUSubtraction/src/DeepMETHelper.cc
>> Compiling  /data/cmsbld/jenkins_a/workspace/ib-run-pr-tests/CMSSW_13_3_X_2023-08-31-2300/src/RecoMET/METPUSubtraction/src/MvaMEtUtilities.cc


@cmsbuild
Contributor

cmsbuild commented Sep 8, 2023

Pull request #8685 was updated.

@iarspider
Contributor Author

please test for el8_aarch64_gcc11

@cmsbuild
Contributor

cmsbuild commented Sep 8, 2023

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4443f2/34666/summary.html
COMMIT: 9470170
CMSSW: CMSSW_13_3_X_2023-08-31-2300/el8_aarch64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/8685/34666/install.sh to create a dev area with all the needed externals and cmssw changes.

External Build

I found compilation error when building:

Requested to quit.
Requested to quit.
Requested to quit.
Requested to quit.
* The action "build-external+tensorflow-sources+2.12.0-d6fa584e76259faf9b5b73c14464a668" was not completed successfully because Failed to build tensorflow-sources. Log file in /data/cmsbld/jenkins_a/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_aarch64_gcc11/external/tensorflow-sources/2.12.0-d6fa584e76259faf9b5b73c14464a668/log. Final lines of the log file:
/data/cmsbld/jenkins_a/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/bin/../lib/gcc/aarch64-redhat-linux-gnu/11.4.1/include/arm_neon.h(1292): error: identifier "__builtin_aarch64_addhn2v2di" is undefined

/data/cmsbld/jenkins_a/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/bin/../lib/gcc/aarch64-redhat-linux-gnu/11.4.1/include/arm_neon.h(1299): error: identifier "__builtin_aarch64_addhn2v8hi" is undefined

/data/cmsbld/jenkins_a/workspace/ib-run-pr-tests/testBuildDir/el8_aarch64_gcc11/external/gcc/11.4.1-30ebdc301ebd200f2ae0e3d880258e65/bin/../lib/gcc/aarch64-redhat-linux-gnu/11.4.1/include/arm_neon.h(1308): error: identifier "__builtin_aarch64_addhn2v4si" is undefined



@smuzaffar
Contributor

closing in favor of #8692

@smuzaffar smuzaffar closed this Sep 11, 2023
@smuzaffar smuzaffar deleted the revert-8675-smuzaffar-patch-8 branch September 11, 2023 14:44