Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed finecalo performance #34597

Merged
merged 2 commits into from
Aug 7, 2021

Conversation

tklijnsma
Copy link

PR description:

This patch streamlines the finecalo patch (PR 32673).

  • The lookups of the boundary-crossing parent of a track are no longer done in the createNewHit method of CaloSD, but only once per track.
  • Boundary-crossing parent lookups are also cached in a map in CaloSD, which leads to a lot less lookups. (A 'lookup' consists of traversing the track history via the SimTrackManager, which is expensive.)
  • Some needless attributes from the CaloHitID classes are removed again.
  • Fixed a bug where in very rare cases hits would still be assigned to the primary rather than the boundary crossing parent.
  • Line 309 in CaloTrkProcessing.cc (introduced in PR 33056) is disabled for finecalo, as it changes the behavior of finecalo.

The CPU time needed per event is about 20% more than default CMSSW (hard to quantify the exact time difference due to slightly varying usage of the compute nodes I am using). This is a significant improvement w.r.t. to the finecalo patch 1, which was anywhere between 100% and 500% slower depending on the simulated event topology (it used to scale with the depth of the decay tree - the cache reduces the complexity of the algorithm immensely).

Plot of the RSS as a function of time (orange = this patch, green = the previous patch, blue = default CMSSW):

rss_over_time_ttbar

PR validation:

I checked event displays of ttbar and of a tau gun, and it looks in order.

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-34597/24154

ERROR: Build errors found during clang-tidy run.

SimG4CMS/Calo/src/CaloSD.cc:590:1: error: version control conflict marker in file [clang-diagnostic-error]
<<<<<<< HEAD
^
SimG4CMS/Calo/src/CaloSD.cc:602:1: error: version control conflict marker in file [clang-diagnostic-error]
<<<<<<< HEAD
^
SimG4CMS/Calo/src/CaloSD.cc:606:5: error: use of undeclared identifier 'hitBookkeepingFineCalo' [clang-diagnostic-error]
    hitBookkeepingFineCalo(aStep, theTrack, aHit);
    ^
Suppressed 1448 warnings (1447 in non-user code, 1 with check filters).
--
gmake: *** [config/SCRAM/GMake/Makefile.coderules:129: code-checks] Error 2
gmake: *** [There are compilation/build errors. Please see the detail log above.] Error 2

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-34597/24155

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-34597/24156

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-34597/24158

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @tklijnsma for master.

It involves the following packages:

  • SimG4CMS/Calo (simulation)

@cmsbuild, @civanch, @mdhildreth can you please review it and eventually sign? Thanks.
@makortel, @cvuosalo, @rovere, @bsunanda, @thomreis, @felicepantaleo, @simonepigazzini, @fabiocos, @slomeo this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy, @perrotta you are the release manager for this.

cms-bot commands are listed here

@lgray
Copy link
Contributor

lgray commented Jul 22, 2021

@cmsbuild please test

@perrotta
Copy link
Contributor

perrotta commented Aug 2, 2021

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Aug 2, 2021

-1

Failed Tests: RelVals
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-36ea04/17407/summary.html
COMMIT: 7ccd430
CMSSW: CMSSW_12_1_X_2021-08-01-2300/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/34597/17407/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals

----- Begin Fatal Exception 02-Aug-2021 10:09:28 CEST-----------------------
An exception of category 'ProductNotFound' occurred while
   [0] Processing  Event run: 320822 lumi: 17 event: 26886180 stream: 0
   [1] Running path 'dqmofflineOnPAT_step'
   [2] Prefetching for module SingleTopTChannelLeptonDQM_miniAOD/'singleTopElectronMediumDQM_miniAOD'
   [3] Prefetching for module PATMuonSlimmer/'slimmedMuons'
   [4] Prefetching for module PATMuonSelector/'selectedPatMuons'
   [5] Prefetching for module PATMuonProducer/'patMuons'
   [6] Prefetching for module CITKPFIsolationSumProducerForPUPPI/'muonPUPPIIsolation'
   [7] Calling method for module PATPackedCandidateProducer/'packedPFCandidates'
Exception Message:
Principal::getByToken: Found zero products matching all criteria
Looking for type: edm::ValueMap<float>
Looking for module label: puppi
Looking for productInstanceName: 

   Additional Info:
      [a] If you wish to continue processing events after a ProductNotFound exception,
add "SkipEvent = cms.untracked.vstring('ProductNotFound')" to the "options" PSet in the configuration.

----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 02-Aug-2021 10:20:46 CEST-----------------------
An exception of category 'NoProxyException' occurred while
   [0] Running EventSetup component TrackerGeometricDetESModule/'trackerNumberingGeometry
Exception Message:
No data of type "DDCompactView" with label "" in record "IdealGeometryRecord"
 Please add an ESSource or ESProducer to your job which can deliver this data.
----- End Fatal Exception -------------------------------------------------
  • 11634.91211634.912_TTbar_14TeV+2021_DD4hepDB+TTbar_14TeV_TuneCP5_GenSim+Digi+Reco+HARVEST+ALCA/step1_TTbar_14TeV+2021_DD4hepDB+TTbar_14TeV_TuneCP5_GenSim+Digi+Reco+HARVEST+ALCA.log
Expand to see more relval errors ...

@srimanob
Copy link
Contributor

srimanob commented Aug 2, 2021

+Upgrade

As mentioned in #34599 (comment)
finecarlo should be used for selected workflow for now.

@cmsbuild
Copy link
Contributor

cmsbuild commented Aug 2, 2021

This pull request is fully signed and it will be integrated in one of the next master IBs (but tests are reportedly failing). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy, @perrotta (and backports should be raised in the release meeting by the corresponding L2)

@tklijnsma
Copy link
Author

@silviodonato, @dpiparo, @qliphy, @perrotta Could one of you sign this?

@perrotta
Copy link
Contributor

perrotta commented Aug 6, 2021

test parameters:

@perrotta
Copy link
Contributor

perrotta commented Aug 6, 2021

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Aug 6, 2021

-1

Failed Tests: RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-36ea04/17605/summary.html
COMMIT: 7ccd430
CMSSW: CMSSW_12_1_X_2021-08-05-2300/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/34597/17605/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals-INPUT

  • 11601.011601.0_SingleElectronPt10+2021+SingleElectronPt10_pythia8_GenSimINPUT+Digi+Reco+HARVEST+ALCA/step2_SingleElectronPt10+2021+SingleElectronPt10_pythia8_GenSimINPUT+Digi+Reco+HARVEST+ALCA.log
  • 11602.011602.0_SingleElectronPt35+2021+SingleElectronPt35_pythia8_GenSimINPUT+Digi+Reco+HARVEST+ALCA/step2_SingleElectronPt35+2021+SingleElectronPt35_pythia8_GenSimINPUT+Digi+Reco+HARVEST+ALCA.log
  • 11603.011603.0_SingleElectronPt1000+2021+SingleElectronPt1000_pythia8_GenSimINPUT+Digi+Reco+HARVEST+ALCA/step2_SingleElectronPt1000+2021+SingleElectronPt1000_pythia8_GenSimINPUT+Digi+Reco+HARVEST+ALCA.log
Expand to see more relval errors ...

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 39
  • DQMHistoTests: Total histograms compared: 2999410
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2999387
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 38 files compared)
  • Checked 165 log files, 37 edm output root files, 39 DQM output files
  • TriggerResults: no differences found

@tklijnsma
Copy link
Author

tklijnsma commented Aug 6, 2021

@perrotta Many thanks - Looking through these errors it seems like this is unrelated to this PR, right? Could you run the tests again, or possible merge without another test? (tests succeeded 5 days ago).

@perrotta
Copy link
Contributor

perrotta commented Aug 7, 2021

+1

  • Computing performance of the 'fine calorimetry' are improved since the original version originally implemented with finecalo patches v1 (rebased) #32673
  • Physics performance is not shown, i.e. there is no mention here whether these updates affect the output in the 'fine calorimetry' mode. However, that mode must be enabled by a dedicated modifier, in a few selected workflows only, and standard workflows are not affected: we assume developers and @cms-sw/simulation-l2 will take care of the physics results.

@perrotta
Copy link
Contributor

perrotta commented Aug 7, 2021

merge

@cmsbuild cmsbuild merged commit 68591f5 into cms-sw:master Aug 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants