add GPU-capable workflow 11634.502 to the short matrix #29315

Closed
wants to merge 1 commit into from


slava77
Contributor

@slava77 slava77 commented Mar 26, 2020

In view of the Patatrack integration, I propose adding a GPU-capable workflow to the short matrix so that we have it in the baseline tests as well.

A node without a GPU will run this workflow with a CPU setup via a switch producer.

The extra matrix test takes about 10 minutes.

@smuzaffar @makortel
Ideally, for the Jenkins tests not enabled for GPU, I'd expect this workflow not to load the GPU version. Is this expected to be the case if the node incidentally has a GPU available?

@cmsbuild
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-29315/14391

  • This PR adds an extra 16KB to repository

@cmsbuild
Contributor

A new Pull Request was created by @slava77 (Slava Krutelyov) for master.

It involves the following packages:

Configuration/PyReleaseValidation

@chayanit, @cmsbuild, @wajidalikhan, @pgunnell, @kpedro88 can you please review it and eventually sign? Thanks.
@makortel, @Martin-Grunewald this is something you requested to watch as well.
@davidlange6, @silviodonato, @fabiocos you are the release manager for this.

cms-bot commands are listed here

@slava77
Contributor Author

slava77 commented Mar 26, 2020

@cmsbuild please test

@cmsbuild
Contributor

cmsbuild commented Mar 26, 2020

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/5402/console Started: 2020/03/26 15:59

@makortel
Contributor

> Ideally, for the Jenkins tests not enabled for GPU, I'd expect this workflow not to load the GPU version. Is this expected to be the case if the node incidentally has a GPU available?

Currently, if the node has a GPU we can run on, we run the GPU version. (Not sure whether this answer is sufficient or you want to jump into the rabbit hole of "loading".)

@slava77
Contributor Author

slava77 commented Mar 26, 2020

> Currently, if the node has a GPU we can run on, we run the GPU version. (Not sure whether this answer is sufficient or you want to jump into the rabbit hole of "loading".)

apparently I do; but perhaps this is a case for setting up comparisons between different workflow outputs, in order to compare the GPU and non-GPU setups.

@makortel
Contributor

> apparently I do;

The details are: both the CPU and GPU modules are constructed, and all non-Event transitions are run for both; Event transitions are run only for the "chosen case" (to be improved, #26438).
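The behaviour described here can be pictured with a small toy model. This is only an illustrative sketch in plain Python, not CMSSW code: the names `ToyModule` and `ToySwitchProducer`, the `gpu_available` flag, and the `begin_run`/`produce` methods are all hypothetical stand-ins for the real modules and framework transitions.

```python
class ToyModule:
    """Hypothetical stand-in for a producer module; records the transitions it runs."""

    def __init__(self, name):
        self.name = name
        self.calls = []

    def begin_run(self):
        # Stand-in for any non-Event transition.
        self.calls.append("beginRun")

    def produce(self, event):
        # Stand-in for the Event transition.
        self.calls.append(f"produce({event})")
        return f"{self.name}:{event}"


class ToySwitchProducer:
    """Toy model: all cases exist, but only the chosen case sees events."""

    def __init__(self, gpu_available, **cases):
        # Every case is already constructed by the caller and kept here.
        self.cases = cases
        # Assumption: prefer "cuda" when a GPU is usable, otherwise fall back to "cpu".
        self.chosen = "cuda" if gpu_available and "cuda" in cases else "cpu"

    def begin_run(self):
        # Non-Event transitions are run for every case.
        for module in self.cases.values():
            module.begin_run()

    def produce(self, event):
        # Event transitions are run only for the chosen case.
        return self.cases[self.chosen].produce(event)


if __name__ == "__main__":
    switch = ToySwitchProducer(False, cpu=ToyModule("cpu"), cuda=ToyModule("cuda"))
    switch.begin_run()
    print(switch.produce(1))
    print(switch.cases["cuda"].calls)
```

In this sketch the unused "cuda" module still records a `beginRun` call on a node without a GPU, which is the overhead the comment refers to.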

> but perhaps this is a case for setting up comparisons between different workflow outputs, in order to compare the GPU and non-GPU setups.

Forcing a GPU-capable workflow (with SwitchProducer) to run on CPU only is easy: just set $CUDA_VISIBLE_DEVICES to an empty value, e.g. CUDA_VISIBLE_DEVICES= cmsRun .... On the other hand, I'd think the "GPU vs. CPU" comparison would need a bit more thought than that (and I believe @fwyzard has some plans for that).
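For a test driver that launches jobs from a script, the same trick can be applied per process. The sketch below is a minimal illustration, assuming a Linux-like environment; the helper names `cpu_only_env` and `run_cpu_only` are hypothetical, and a small Python child process stands in for the real cmsRun command, whose arguments are not given here.

```python
import os
import subprocess
import sys


def cpu_only_env(base_env=None):
    """Return a copy of the environment with CUDA_VISIBLE_DEVICES set to empty."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ""  # an empty value hides all CUDA devices
    return env


def run_cpu_only(cmd):
    """Run cmd (a list of arguments) with no CUDA devices visible to it."""
    return subprocess.run(
        cmd, env=cpu_only_env(), capture_output=True, text=True, check=True
    )


if __name__ == "__main__":
    # Stand-in child process that reports what it sees; the real job command would go here.
    probe = [
        sys.executable,
        "-c",
        "import os; print(repr(os.environ['CUDA_VISIBLE_DEVICES']))",
    ]
    print(run_cpu_only(probe).stdout.strip())
```

Only the child process is affected, so the parent environment stays untouched and GPU and CPU runs can be launched side by side from the same script.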

@slava77
Contributor Author

slava77 commented Mar 26, 2020

> Currently, if the node has a GPU we can run on, we run the GPU version. (Not sure whether this answer is sufficient or you want to jump into the rabbit hole of "loading".)

If this begins to flip between the GPU and CPU versions, we end up in the same situation as with the AMD vs. Intel comparisons used for the baseline, which the Jenkins tools handled when we had those options.

@cmsbuild
Contributor

+1
Tested at: ebc219e
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-812d00/5402/summary.html
CMSSW: CMSSW_11_1_X_2020-03-26-1100
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Contributor

Comparison job queued.

@cmsbuild
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-812d00/5402/summary.html

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-812d00/11634.502_TTbar_14TeV+TTbar_14TeV_TuneCP5_2021_GenSimFull+DigiFull_2021+RecoFullPatatrack_PixelOnlyGPU_2021+HARVESTFullPatatrack_PixelOnlyGPU_2021

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 2 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 2692110
  • DQMHistoTests: Total failures: 62
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2691729
  • DQMHistoTests: Total skipped: 319
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
  • Checked 147 log files, 16 edm output root files, 34 DQM output files
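As a quick sanity check on the DQMHistoTests numbers in this summary, the counts can be added up. The sketch below assumes that every compared histogram falls into exactly one of the listed categories; that is an assumption about how the bot counts, not something the summary states.

```python
# Counts copied from the DQMHistoTests lines of the comparison summary.
total = 2692110
counts = {
    "failures": 62,
    "nulls": 0,
    "successes": 2691729,
    "skipped": 319,
    "missing": 0,
}

# Assumption: each compared histogram lands in exactly one category.
accounted = sum(counts.values())
failure_fraction = counts["failures"] / total

print(f"accounted for: {accounted} of {total}")
print(f"failure fraction: {failure_fraction:.2e}")
```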

@slava77
Contributor Author

slava77 commented Mar 26, 2020

It looks like this PR went in too early.
The workflow definition already exists in the main release, but in the Patatrack release .502 points to a GPU-only setup, not an on-demand one. I mistakenly thought it was the on-demand variant.

The on-demand variant is apparently .503, but it does not exist (yet) in the main release.

@slava77 slava77 closed this Mar 26, 2020