Enable Alpaka backends based on the architecture #94

Closed
wants to merge 1 commit into from


fwyzard
Contributor

@fwyzard fwyzard commented Feb 26, 2023

Enable the Alpaka backends for CUDA and ROCm only on the architectures that support them:

  • CUDA is enabled on all architectures, as long as gcc is < 12
  • ROCm is enabled only on x86
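
As a rough illustration of the rule described above, the selection could be sketched in Python as follows. This is not the code from this PR: the function names are hypothetical, the architecture string is assumed to follow the `<os>_<cpu>_gcc<major>` pattern seen in the test reports (e.g. `el8_amd64_gcc11`), and "x86" is assumed to correspond to the `amd64` CPU field.

```python
import re


def parse_scram_arch(scram_arch):
    """Split a string such as 'el8_ppc64le_gcc11' into (os, cpu, gcc_major).

    Assumption: the compiler field is 'gcc' followed by the major version
    only (gcc11, gcc12); other naming schemes are not handled here.
    """
    match = re.fullmatch(r"([a-z0-9]+)_([a-z0-9]+)_gcc(\d+)", scram_arch)
    if match is None:
        raise ValueError(f"unrecognised architecture string: {scram_arch!r}")
    return match.group(1), match.group(2), int(match.group(3))


def alpaka_backends_for_arch(scram_arch):
    """Return the Alpaka backends to enable for the given architecture."""
    _os_name, cpu, gcc_major = parse_scram_arch(scram_arch)
    backends = ["serial"]        # the serial backend is always enabled
    if gcc_major < 12:           # CUDA: any CPU, as long as gcc is < 12
        backends.append("cuda")
    if cpu == "amd64":           # ROCm: x86 only (assumed to mean amd64)
        backends.append("rocm")
    return backends


if __name__ == "__main__":
    for arch in ("el8_amd64_gcc11", "el8_ppc64le_gcc11", "el8_amd64_gcc12"):
        print(arch, alpaka_backends_for_arch(arch))
```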

@cmsbuild
Contributor

A new Pull Request was created by @fwyzard (Andrea Bocci) for branch scramv3.

@cmsbuild, @smuzaffar, @aandvalenzuela, @iarspider can you please review it and eventually sign? Thanks.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.
cms-bot commands are listed here

@fwyzard
Contributor Author

fwyzard commented Feb 26, 2023

What we should do is something like

  <flags ALPAKA_BACKENDS="serial"/>
  <iftool name="cuda-gcc-support">
    <flags ALPAKA_BACKENDS="cuda"/>
  </iftool>
  <iftool name="rocm">
    <flags ALPAKA_BACKENDS="rocm"/>
  </iftool>

But that doesn't seem to work, so the changes in this PR currently achieve the same effect by selecting the backends based on the architecture.

@fwyzard
Contributor Author

fwyzard commented Feb 26, 2023

please test with cms-sw/cmssw#40832

@fwyzard
Contributor Author

fwyzard commented Feb 26, 2023

please test with cms-sw/cmssw#40832 for el8_ppc64le_gcc11

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9ab3d7/30906/summary.html
COMMIT: a803314
CMSSW: CMSSW_13_1_X_2023-02-24-2300/el8_ppc64le_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw-config/94/30906/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9ab3d7/30906/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9ab3d7/30906/git-merge-result

@cmsbuild
Contributor

-1

Failed Tests: RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9ab3d7/30909/summary.html
COMMIT: a803314
CMSSW: CMSSW_13_1_X_2023-02-26-0000/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw-config/94/30909/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals-INPUT

  • 20834.0 20834.0_TTbar_14TeV+2026D88/step2_TTbar_14TeV+2026D88.log
  • 20834.103 20834.103_TTbar_14TeV+2026D88Aging3000/step2_TTbar_14TeV+2026D88Aging3000.log
  • 20834.21 20834.21_TTbar_14TeV+2026D88_ProdLike/step2_TTbar_14TeV+2026D88_ProdLike.log

Comparison Summary

Summary:

  • You potentially added 4 lines to the logs
  • Reco comparison results: 8 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3528955
  • DQMHistoTests: Total failures: 6
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3528927
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 213 log files, 164 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@smuzaffar
Contributor

@fwyzard shouldn't https://github.com/cms-sw/cmssw-config/blob/scramv3/SCRAM/Plugins/BuildRules.py#L1017-L1020 be enough? I think Self.xml can declare all the backends with <flags ALPAKA_BACKENDS="serial rocm cuda"/>, and then scram should drop the cuda backend if cuda-gcc-support is not available and drop the rocm backend if rocm is not available.
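
The filtering described in this comment could look roughly like the sketch below. It is not the code at the linked lines of BuildRules.py: the function name, the backend-to-tool mapping, and the set of available tools are all assumptions made for illustration.

```python
# Hypothetical mapping from an Alpaka backend to the scram tool it needs.
# Backends not listed here (e.g. "serial") have no tool requirement.
REQUIRED_TOOL = {
    "cuda": "cuda-gcc-support",
    "rocm": "rocm",
}


def filter_backends(declared, available_tools):
    """Keep only the declared backends whose required tool is available.

    declared:        space-separated string, as in ALPAKA_BACKENDS="serial rocm cuda"
    available_tools: set of tool names present in the current configuration
    """
    kept = []
    for backend in declared.split():
        tool = REQUIRED_TOOL.get(backend)
        if tool is None or tool in available_tools:
            kept.append(backend)
    return kept


if __name__ == "__main__":
    # Example: CUDA support is available, ROCm is not.
    print(filter_backends("serial rocm cuda", {"cuda-gcc-support"}))
```

With this approach Self.xml always declares the full list, and the per-architecture decision is left to whichever tools are actually installed.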

@fwyzard
Contributor Author

fwyzard commented Feb 27, 2023

Ah, I see... sure, that's enough.

@fwyzard fwyzard closed this Feb 27, 2023