MTD geometry: speed up MTD reco geometry construction #43124

Merged
merged 2 commits into from
Nov 6, 2023

Conversation

fabiocos
Contributor

@fabiocos fabiocos commented Oct 26, 2023

PR description:

The DDD implementation of the MTD reconstruction geometry construction, in the DDCmsMTDConstruction class of Geometry/MTDNumberingBuilder, severely degrades CMSSW startup performance. The veto filter on non-sensitive volumes turns out to be the main culprit, likely because of an excessive number of string searches in the logical names of the processed volumes. Removing it has no impact on the reconstructed geometry and effectively solves the problem.
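
To illustrate why such a veto can dominate the startup time, here is a minimal standalone sketch. It is not the DDCmsMTDConstruction code: the names `VetoStats` and `applyVeto`, as well as any volume names or veto substrings passed to it, are hypothetical. It only reproduces the cost pattern described above, where each visited logical name is scanned once per veto substring.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical bookkeeping for the sketch: how many names pass the veto and
// how many substring searches were needed to decide.
struct VetoStats {
  std::size_t accepted = 0;
  std::size_t searches = 0;
};

// Hypothetical veto filter: a name is rejected if it contains any of the veto
// substrings. In the worst case (no match) every name costs one
// std::string::find per veto entry, so the total work grows as
// (number of volumes) x (number of veto entries).
VetoStats applyVeto(const std::vector<std::string>& names, const std::vector<std::string>& veto) {
  VetoStats stats;
  for (const auto& name : names) {
    bool vetoed = false;
    for (const auto& pattern : veto) {
      ++stats.searches;
      if (name.find(pattern) != std::string::npos) {
        vetoed = true;
        break;  // first match is enough to reject the volume
      }
    }
    if (!vetoed) {
      ++stats.accepted;
    }
  }
  return stats;
}
```

In this toy model, calling `applyVeto` with an empty veto list accepts every name without a single substring search, which is the analogue of simply dropping the filter as done in this PR.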

As a side note, the DD4hep version, implemented in parallel since 2020, does not suffer from this issue, since it uses a different filtering approach. Unfortunately this is of little help, since we are still in the middle of the transition between the geometry back-ends.

Addressing #43062 .

PR validation:

The MTD reconstruction geometry unit tests pass, showing that the same geometry as before is reconstructed. Both Tracer timing and igprof profiling show a sizeable improvement, from almost 50 seconds down to 3 seconds on the machine used for the tests.

Before:

Begin processing the 1st record. Run 1, Event 1, LumiSection 1 on stream 0 at 26-Oct-2023 17:03:28.265 CEST
26-Oct-2023 17:03:28.26 CEST  ++++ starting: processing event : stream = 0 run = 1 lumi = 1 event = 1 time = 1
26-Oct-2023 17:03:28.26 CEST  ++++++ starting: processing path 'p1' : stream = 0
26-Oct-2023 17:03:28.26 CEST  ++++++++ starting: prefetching before processing event for module: stream = 0 label = 'prod' id = 3
26-Oct-2023 17:03:28.26 CEST  ++++++++ starting: prefetching for esmodule: label = 'mtdNumberingGeometry' type = MTDGeometricTimingDetESModule in record = IdealGeometryRecord
26-Oct-2023 17:03:28.26 CEST  ++++++++++ starting: prefetching for esmodule: label = '' type = XMLIdealGeometryESSource in record = IdealGeometryRecord
26-Oct-2023 17:03:28.26 CEST  ++++++++++ finished: prefetching for esmodule: label = '' type = XMLIdealGeometryESSource in record = IdealGeometryRecord
26-Oct-2023 17:03:28.26 CEST  ++++++++++ starting: processing esmodule: label = '' type = XMLIdealGeometryESSource in record = IdealGeometryRecord
26-Oct-2023 17:03:31.47 CEST  ++++++++++ finished: processing esmodule: label = '' type = XMLIdealGeometryESSource in record = IdealGeometryRecord
26-Oct-2023 17:03:31.47 CEST  ++++++++ finished: prefetching for esmodule: label = 'mtdNumberingGeometry' type = MTDGeometricTimingDetESModule in record = IdealGeometryRecord
26-Oct-2023 17:03:31.47 CEST  ++++++++ starting: processing esmodule: label = 'mtdNumberingGeometry' type = MTDGeometricTimingDetESModule in record = IdealGeometryRecord
26-Oct-2023 17:04:08.03 CEST  ++++++++ finished: processing esmodule: label = 'mtdNumberingGeometry' type = MTDGeometricTimingDetESModule in record = IdealGeometryRecord
26-Oct-2023 17:04:08.03 CEST  ++++++++ finished: prefetching before processing event for module: stream = 0 label = 'prod' id = 3
26-Oct-2023 17:04:08.03 CEST  ++++++++ starting: processing event for module: stream = 0 label = 'prod' id = 3
%MSG-i GeometricTimingDetAnalyzer:  GeometricTimingDetAnalyzer:prod  26-Oct-2023 17:04:08 CEST Run: 1 Event: 1
Beginning MTD GeometricTimingDet container dump 
%MSG

With this PR:

Begin processing the 1st record. Run 1, Event 1, LumiSection 1 on stream 0 at 26-Oct-2023 17:22:07.701 CEST
26-Oct-2023 17:22:07.70 CEST  ++++ starting: processing event : stream = 0 run = 1 lumi = 1 event = 1 time = 1
26-Oct-2023 17:22:07.70 CEST  ++++++ starting: processing path 'p1' : stream = 0
26-Oct-2023 17:22:07.70 CEST  ++++++++ starting: prefetching before processing event for module: stream = 0 label = 'prod' id = 3
26-Oct-2023 17:22:07.70 CEST  ++++++++ starting: prefetching for esmodule: label = 'mtdNumberingGeometry' type = MTDGeometricTimingDetESModule in record = IdealGeometryRecord
26-Oct-2023 17:22:07.70 CEST  ++++++++++ starting: prefetching for esmodule: label = '' type = XMLIdealGeometryESSource in record = IdealGeometryRecord
26-Oct-2023 17:22:07.70 CEST  ++++++++++ finished: prefetching for esmodule: label = '' type = XMLIdealGeometryESSource in record = IdealGeometryRecord
26-Oct-2023 17:22:07.70 CEST  ++++++++++ starting: processing esmodule: label = '' type = XMLIdealGeometryESSource in record = IdealGeometryRecord
26-Oct-2023 17:22:10.70 CEST  ++++++++++ finished: processing esmodule: label = '' type = XMLIdealGeometryESSource in record = IdealGeometryRecord
26-Oct-2023 17:22:10.70 CEST  ++++++++ finished: prefetching for esmodule: label = 'mtdNumberingGeometry' type = MTDGeometricTimingDetESModule in record = IdealGeometryRecord
26-Oct-2023 17:22:10.70 CEST  ++++++++ starting: processing esmodule: label = 'mtdNumberingGeometry' type = MTDGeometricTimingDetESModule in record = IdealGeometryRecord
%MSG-i MTDNumbering:   MTDGeometricTimingDetESModule:mtdNumberingGeometry@callESModule  26-Oct-2023 17:22:10 CEST Run: 1 Event: 1
Top level node = OCMS
%MSG

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43124/37386

  • This PR adds an extra 12KB to repository

@cmsbuild
Contributor

A new Pull Request was created by @fabiocos (Fabio Cossutti) for master.

It involves the following packages:

  • Geometry/MTDNumberingBuilder (geometry, upgrade)

@Dr15Jones, @mdhildreth, @cmsbuild, @srimanob, @makortel, @civanch, @AdrianoDee, @bsunanda can you please review it and eventually sign? Thanks.
@bsunanda this is something you requested to watch as well.
@antoniovilela, @sextonkennedy, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@fabiocos
Contributor Author

please test

@@ -1,5 +1,3 @@
//#define EDM_ML_DEBUG
Contributor

Do you also want to remove this commented-out debug define? I still see #ifdef EDM_ML_DEBUG in the code.

Contributor

I would guess there is nothing left to test; it can be added back the next time it is needed.

Contributor

Not sure I understand. I see, e.g.,

#ifdef EDM_ML_DEBUG
    edm::LogVerbatim("MTDNumbering") << "Module = " << fv.name() << " fullNode = " << fullNode
                                     << " thisNode = " << thisNode;
#endif

in the code. It is not a big deal for me. I just want to confirm that this is intentional.

Contributor

Just to note that the preferred way is to define the EDM_ML_DEBUG macro on the compilation command line:

USER_CXXFLAGS="-DEDM_ML_DEBUG" scram b

https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideMessageLogger#LogDebug

(that doesn't provide file-level control, but the message categories are the intended ones for finer-grained control)
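
The two ways of enabling the guarded debug output discussed in this thread can be sketched in a standalone form as follows. This is only an illustration: `reportNode` is a hypothetical helper, and `std::cout` stands in for `edm::LogVerbatim` so that the snippet compiles outside CMSSW.

```cpp
#include <iostream>
#include <string>

// File-level control: uncommenting the next line enables the debug output for
// this translation unit only. Build-level control: leave it commented and pass
// -DEDM_ML_DEBUG to the compiler instead, which enables it everywhere.
//#define EDM_ML_DEBUG

// Hypothetical helper: prints the node name only when EDM_ML_DEBUG is defined
// and returns whether the debug branch was compiled in.
bool reportNode(const std::string& name) {
#ifdef EDM_ML_DEBUG
  std::cout << "Module = " << name << '\n';
  return true;
#else
  (void)name;  // silence the unused-parameter warning in non-debug builds
  return false;
#endif
}
```

Compiled as is, `reportNode` prints nothing and returns `false`; with the macro defined by either method, the same call prints the message and returns `true`.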

Contributor Author

I very often use file-level control, and in the past I used to add commented-out #define EDM_ML_DEBUG statements in many places. But since the file needs to be edited anyway, adding the statement directly when needed does not really make any difference, and the code looks cleaner.

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-cba403/35440/summary.html
COMMIT: a427727
CMSSW: CMSSW_13_3_X_2023-10-26-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/43124/35440/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 2 lines from the logs
  • Reco comparison results: 102 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3357400
  • DQMHistoTests: Total failures: 1310
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3356068
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 214 log files, 167 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

@srimanob
Contributor

@bsunanda
Contributor

+geometry

@fabiocos
Contributor Author

@srimanob no, I would not expect a change when processing identical input samples, especially since the unit tests are OK (they compare the dump of DetIds and the corresponding reference positions). I need to have a closer look to understand.

@fabiocos
Contributor Author

hold

@cmsbuild
Contributor

Pull request has been put on hold by @fabiocos
They need to issue an unhold command to remove the hold state or L1 can unhold it for all

@cmsbuild cmsbuild added the hold label Oct 27, 2023
@srimanob
Contributor

assign mtd

@felicepantaleo
Contributor

Thanks @fabiocos for addressing this problem!

@fabiocos
Contributor Author

unhold

@cmsbuild cmsbuild removed the hold label Oct 27, 2023
@cmsbuild
Contributor

-1

Failed Tests: RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-cba403/35472/summary.html
COMMIT: 3a7667e
CMSSW: CMSSW_13_3_X_2023-10-27-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/43124/35472/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals-INPUT

  • 4.76: 4.76_ZMuSkim2012D/step2_ZMuSkim2012D.log

Comparison Summary

Summary:

  • You potentially removed 1 lines from the logs
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3357400
  • DQMHistoTests: Total failures: 3
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3357375
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 214 log files, 167 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

@civanch
Contributor

civanch commented Oct 28, 2023

+1

All PRs have been failing on workflow 4.76 since yesterday. I would merge this one, because the failure is completely unrelated and there is no point in re-requesting tests.

@civanch
Contributor

civanch commented Oct 29, 2023

please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-cba403/35485/summary.html
COMMIT: 3a7667e
CMSSW: CMSSW_13_3_X_2023-10-29-0000/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/43124/35485/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 9 lines from the logs
  • Reco comparison results: 14 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3362691
  • DQMHistoTests: Total failures: 12
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3362657
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 214 log files, 167 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

@fabiocos
Contributor Author

fabiocos commented Nov 6, 2023

@srimanob is this PR ok for you, or do you see residual issues?

@srimanob
Contributor

srimanob commented Nov 6, 2023

+Upgrade

@cmsbuild
Contributor

cmsbuild commented Nov 6, 2023

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @antoniovilela, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@antoniovilela
Contributor

+1

9 participants