[Draft] Update HGCAL pset for Phase-2 HLT menu #39450

Closed
wants to merge 5 commits into from

Conversation

srimanob
Contributor

@srimanob srimanob commented Sep 19, 2022

PR description:

This PR updates the pset used in Phase-2 HGCAL HLT. The update follows the pset defined in
https://github.com/cms-sw/cmssw/blob/master/SimCalorimetry/HGCalSimProducers/python/hgcalDigitizer_cfi.py

To complete the PR test and to illustrate the benefit of the new workflow, this PR also introduces

  • a new workflow .76, in which the DIGI and HLT:@relval2026 steps run together instead of HLT:@fake. The new workflow will allow us to develop the Phase-2 HLT DQM and the sequence to be used in future Phase-2 production. This PR is an effort to help with validation, as mentioned in Add Validation and DQM to Phase2 HLT WF #39362.
  • a minimal change to include the HLT Validation in the sequence. There are still some warning messages; these will be cleaned up.

PR validation:

Run test with updated 39434.76.

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

None.

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-39450/32160

@cmsbuild
Contributor

A new Pull Request was created by @srimanob (Phat Srimanobhas) for master.

It involves the following packages:

  • Configuration/PyReleaseValidation (pdmv, upgrade)
  • HLTrigger/Configuration (hlt)

@Martin-Grunewald, @AdrianoDee, @bbilin, @cmsbuild, @missirol, @srimanob, @kskovpen, @sunilUIET can you please review it and eventually sign? Thanks.
@makortel, @kpedro88, @fabiocos, @Martin-Grunewald, @missirol, @silviodonato, @trtomei, @beaucero, @slomeo this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@srimanob
Contributor Author

@cmsbuild please test

@cmsbuild
Contributor

-1

Failed Tests: RelVals RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9f5e09/27657/summary.html
COMMIT: cb60cde
CMSSW: CMSSW_12_6_X_2022-09-18-2300/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/39450/27657/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals

----- Begin Fatal Exception 19-Sep-2022 21:45:59 CEST-----------------------
An exception of category 'ProductNotFound' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 1 stream: 0
   [1] Running path 'HLTriggerFinalPath'
   [2] Prefetching for module TriggerSummaryProducerAOD/'hltTriggerSummaryAOD'
   [3] Calling method for module HPSPFTauProducer/'l1tHPSPFTauProducerPF'
Exception Message:
Principal::getByToken: Found zero products matching all criteria
Looking for type: std::vector<l1t::VertexWord>
Looking for module label: L1VertexFinderEmulator
Looking for productInstanceName: l1verticesEmulation

   Additional Info:
      [a] If you wish to continue processing events after a ProductNotFound exception,
add "SkipEvent = cms.untracked.vstring('ProductNotFound')" to the "options" PSet in the configuration.

----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 19-Sep-2022 21:46:00 CEST-----------------------
An exception of category 'ProductNotFound' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 1 stream: 0
   [1] Running path 'HLTriggerFinalPath'
   [2] Prefetching for module TriggerSummaryProducerAOD/'hltTriggerSummaryAOD'
   [3] Calling method for module HPSPFTauProducer/'l1tHPSPFTauProducerPF'
Exception Message:
Principal::getByToken: Found zero products matching all criteria
Looking for type: std::vector<l1t::VertexWord>
Looking for module label: L1VertexFinderEmulator
Looking for productInstanceName: l1verticesEmulation

   Additional Info:
      [a] If you wish to continue processing events after a ProductNotFound exception,
add "SkipEvent = cms.untracked.vstring('ProductNotFound')" to the "options" PSet in the configuration.

----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 19-Sep-2022 21:49:45 CEST-----------------------
An exception of category 'ProductNotFound' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 1 stream: 0
   [1] Running path 'HLTriggerFinalPath'
   [2] Prefetching for module TriggerSummaryProducerAOD/'hltTriggerSummaryAOD'
   [3] Calling method for module HPSPFTauProducer/'l1tHPSPFTauProducerPF'
Exception Message:
Principal::getByToken: Found zero products matching all criteria
Looking for type: std::vector<l1t::VertexWord>
Looking for module label: L1VertexFinderEmulator
Looking for productInstanceName: l1verticesEmulation

   Additional Info:
      [a] If you wish to continue processing events after a ProductNotFound exception,
add "SkipEvent = cms.untracked.vstring('ProductNotFound')" to the "options" PSet in the configuration.

----- End Fatal Exception -------------------------------------------------
Expand to see more relval errors ...
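
The three exception blocks above report the same missing product. As a minimal sketch (not part of the PR or of cms-bot), the distinct missing products can be pulled out of such a log with a few regular expressions. The function name `missing_products` and the `SAMPLE` constant are hypothetical helpers introduced here for illustration; the sample text is copied from the first block above.

```python
import re

# One exception block copied from the log above (the "Additional Info" part is omitted).
SAMPLE = """\
----- Begin Fatal Exception 19-Sep-2022 21:45:59 CEST-----------------------
An exception of category 'ProductNotFound' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 1 stream: 0
   [1] Running path 'HLTriggerFinalPath'
   [2] Prefetching for module TriggerSummaryProducerAOD/'hltTriggerSummaryAOD'
   [3] Calling method for module HPSPFTauProducer/'l1tHPSPFTauProducerPF'
Exception Message:
Principal::getByToken: Found zero products matching all criteria
Looking for type: std::vector<l1t::VertexWord>
Looking for module label: L1VertexFinderEmulator
Looking for productInstanceName: l1verticesEmulation
----- End Fatal Exception -------------------------------------------------
"""

# Matches the three "Looking for ..." lines of a ProductNotFound message.
FIELD_RE = re.compile(
    r"^Looking for (type|module label|productInstanceName): (.+)$", re.M
)
# Matches the module that requested the product: ClassName/'label'.
MODULE_RE = re.compile(r"Calling method for module (\S+)/'([^']+)'")


def missing_products(log_text):
    """Return the set of distinct (consumer, type, label, instance) tuples."""
    found = set()
    blocks = re.findall(
        r"Begin Fatal Exception.*?End Fatal Exception", log_text, re.S
    )
    for block in blocks:
        fields = dict(FIELD_RE.findall(block))
        consumer = MODULE_RE.search(block)
        found.add((
            consumer.group(2) if consumer else None,
            fields.get("type"),
            fields.get("module label"),
            fields.get("productInstanceName"),
        ))
    return found


if __name__ == "__main__":
    # Repeating the block mimics the duplicated exceptions in the log.
    for entry in missing_products(SAMPLE * 3):
        print(entry)
```

Collapsing the blocks into a set makes it obvious that all three failures come from one consumer (`l1tHPSPFTauProducerPF`) asking for one product.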

RelVals-INPUT

  • 39434.0 39434.0_TTbar_14TeV+2026D88+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14INPUT+DigiTrigger+RecoGlobal+HARVESTGlobal/step2_TTbar_14TeV+2026D88+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14INPUT+DigiTrigger+RecoGlobal+HARVESTGlobal.log
  • 39434.103 39434.103_TTbar_14TeV+2026D88Aging3000+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14INPUT+DigiTrigger+RecoGlobal+HARVESTGlobal/step2_TTbar_14TeV+2026D88Aging3000+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14INPUT+DigiTrigger+RecoGlobal+HARVESTGlobal.log
  • 39434.5 39434.5_TTbar_14TeV+2026D88_pixelTrackingOnly+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14INPUT+DigiTrigger+RecoGlobal+HARVESTGlobal/step2_TTbar_14TeV+2026D88_pixelTrackingOnly+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14INPUT+DigiTrigger+RecoGlobal+HARVESTGlobal.log
Expand to see more relval errors ...

@srimanob
Contributor Author

The local test (based on CMSSW_12_6_0_pre2) seems to work fine (I used 39434.75 and then modified the DIGI-HLT step). It shows some error messages, but it did not crash.

These are the error messages I got; none of them mentions l1verticesEmulation, unlike in the PR test.

%MSG-e TriggerSummaryProducerAOD:   TriggerSummaryProducerAOD:hltTriggerSummaryAOD 19-Sep-2022 20:39:56 CEST  Run: 1 Event: 9
Uunknown pid: 2:813 FilterTag / Key: L1TkEleDouble12Filter::HLT / 0of1 CollectionTag / Key: l1ctLayer1EG:L1TkEleEE:HLT / 0 CollectionType: St6vectorIN3l1t10TkElectronESaIS1_EE
%MSG
%MSG-e TriggerSummaryProducerAOD:   TriggerSummaryProducerAOD:hltTriggerSummaryAOD 19-Sep-2022 20:39:56 CEST  Run: 1 Event: 9
Uunknown pid: 2:813 FilterTag / Key: L1TkEleSingle25Filter::HLT / 0of1 CollectionTag / Key: l1ctLayer1EG:L1TkEleEE:HLT / 0 CollectionType: St6vectorIN3l1t10TkElectronESaIS1_EE
%MSG
%MSG-e TriggerSummaryProducerAOD:   TriggerSummaryProducerAOD:hltTriggerSummaryAOD 19-Sep-2022 20:39:56 CEST  Run: 1 Event: 9
Uunknown pid: 2:813 FilterTag / Key: L1TkEleSingle36Filter::HLT / 0of1 CollectionTag / Key: l1ctLayer1EG:L1TkEleEE:HLT / 0 CollectionType: St6vectorIN3l1t10TkElectronESaIS1_EE
%MSG
%MSG-e TriggerSummaryProducerAOD:   TriggerSummaryProducerAOD:hltTriggerSummaryAOD 19-Sep-2022 20:39:56 CEST  Run: 1 Event: 9
Uunknown pid: 2:823 FilterTag / Key: L1TkEmDouble24Filter::HLT / 0of2 CollectionTag / Key: l1ctLayer1EG:L1TkEmEB:HLT / 0 CollectionType: St6vectorIN3l1t4TkEmESaIS1_EE
%MSG
%MSG-e TriggerSummaryProducerAOD:   TriggerSummaryProducerAOD:hltTriggerSummaryAOD 19-Sep-2022 20:39:56 CEST  Run: 1 Event: 9
Uunknown pid: 2:824 FilterTag / Key: L1TkEmDouble24Filter::HLT / 1of2 CollectionTag / Key: l1ctLayer1EG:L1TkEmEE:HLT / 0 CollectionType: St6vectorIN3l1t4TkEmESaIS1_EE
%MSG
%MSG-e TriggerSummaryProducerAOD:   TriggerSummaryProducerAOD:hltTriggerSummaryAOD 19-Sep-2022 20:39:56 CEST  Run: 1 Event: 9
Uunknown pid: 2:823 FilterTag / Key: L1TkEmSingle37Filter::HLT / 0of2 CollectionTag / Key: l1ctLayer1EG:L1TkEmEB:HLT / 0 CollectionType: St6vectorIN3l1t4TkEmESaIS1_EE
%MSG
%MSG-e TriggerSummaryProducerAOD:   TriggerSummaryProducerAOD:hltTriggerSummaryAOD 19-Sep-2022 20:39:56 CEST  Run: 1 Event: 9
Uunknown pid: 2:824 FilterTag / Key: L1TkEmSingle37Filter::HLT / 1of2 CollectionTag / Key: l1ctLayer1EG:L1TkEmEE:HLT / 0 CollectionType: St6vectorIN3l1t4TkEmESaIS1_EE
%MSG
%MSG-e TriggerSummaryProducerAOD:   TriggerSummaryProducerAOD:hltTriggerSummaryAOD 19-Sep-2022 20:39:56 CEST  Run: 1 Event: 9
Uunknown pid: 2:823 FilterTag / Key: L1TkEmSingle51Filter::HLT / 0of1 CollectionTag / Key: l1ctLayer1EG:L1TkEmEB:HLT / 0 CollectionType: St6vectorIN3l1t4TkEmESaIS1_EE
%MSG
%MSG-e TriggerSummaryProducerAOD:   TriggerSummaryProducerAOD:hltTriggerSummaryAOD 19-Sep-2022 20:39:56 CEST  Run: 1 Event: 9
Uunknown pid: 2:823 FilterTag / Key: L1TkIsoEmDouble12Filter::HLT / 0of1 CollectionTag / Key: l1ctLayer1EG:L1TkEmEB:HLT / 0 CollectionType: St6vectorIN3l1t4TkEmESaIS1_EE
%MSG
%MSG-e TriggerSummaryProducerAOD:   TriggerSummaryProducerAOD:hltTriggerSummaryAOD 19-Sep-2022 20:39:56 CEST  Run: 1 Event: 9
Uunknown pid: 2:823 FilterTag / Key: L1TkIsoEmSingle22Filter::HLT / 0of1 CollectionTag / Key: l1ctLayer1EG:L1TkEmEB:HLT / 0 CollectionType: St6vectorIN3l1t4TkEmESaIS1_EE
%MSG
%MSG-e TriggerSummaryProducerAOD:   TriggerSummaryProducerAOD:hltTriggerSummaryAOD 19-Sep-2022 20:39:56 CEST  Run: 1 Event: 9
Uunknown pid: 2:823 FilterTag / Key: L1TkIsoEmSingle36Filter::HLT / 0of1 CollectionTag / Key: l1ctLayer1EG:L1TkEmEB:HLT / 0 CollectionType: St6vectorIN3l1t4TkEmESaIS1_EE
%MSG
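
To see at a glance which collections these messages concern, the lines can be grouped by collection tag. The following is a rough sketch only: the helper name `filters_by_collection` is hypothetical, and the regular expression simply assumes the line layout shown in the messages above (three of those lines are copied into `SAMPLE`).

```python
import re
from collections import defaultdict

# Three message lines copied from the log above; the full log has more.
SAMPLE = """\
Uunknown pid: 2:813 FilterTag / Key: L1TkEleDouble12Filter::HLT / 0of1 CollectionTag / Key: l1ctLayer1EG:L1TkEleEE:HLT / 0 CollectionType: St6vectorIN3l1t10TkElectronESaIS1_EE
Uunknown pid: 2:823 FilterTag / Key: L1TkEmDouble24Filter::HLT / 0of2 CollectionTag / Key: l1ctLayer1EG:L1TkEmEB:HLT / 0 CollectionType: St6vectorIN3l1t4TkEmESaIS1_EE
Uunknown pid: 2:824 FilterTag / Key: L1TkEmDouble24Filter::HLT / 1of2 CollectionTag / Key: l1ctLayer1EG:L1TkEmEE:HLT / 0 CollectionType: St6vectorIN3l1t4TkEmESaIS1_EE
"""

# Assumed layout: pid, filter tag / key, collection tag / key, collection type.
LINE_RE = re.compile(
    r"Uunknown pid: \S+ FilterTag / Key: (?P<filter>\S+) / \S+ "
    r"CollectionTag / Key: (?P<collection>\S+) / \S+ "
    r"CollectionType: (?P<ctype>\S+)"
)


def filters_by_collection(log_text):
    """Map each collection tag to the sorted list of filters that reference it."""
    grouped = defaultdict(set)
    for match in LINE_RE.finditer(log_text):
        grouped[match.group("collection")].add(match.group("filter"))
    return {tag: sorted(filters) for tag, filters in grouped.items()}


if __name__ == "__main__":
    for tag, filters in filters_by_collection(SAMPLE).items():
        print(tag, filters)
```

In the messages shown here, every collection tag belongs to the `l1ctLayer1EG` module, which narrows down where to look.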

@Martin-Grunewald
Contributor

Can we disentangle the update of the menu itself from the update of @relval2026? I believe the latter needs much more work as currently the Phase-2 75e33 menu runs in a very special workflow only.
These errors: %MSG-e TriggerSummaryProducerAOD also point to a misconfiguration...

@srimanob
Contributor Author

hold

This PR can be put on hold until the L1T transition is done. Things should be clearer then.

@cmsbuild
Contributor

Pull request has been put on hold by @srimanob
They need to issue an unhold command to remove the hold state or L1 can unhold it for all

@cmsbuild cmsbuild added the hold label Sep 20, 2022
@srimanob
Contributor Author

By the way, the issue seems to have been introduced between 12_6_0_pre2 and the latest IB (CMSSW_12_6_X_2022-09-19-2300). This code works when tested with 12_6_0_pre2. It is strange that we did not spot it with wf 39434.75 in the IB tests.

Any idea, @cms-sw/l1-l2? What was merged recently that could cause this?

@cecilecaillol
Contributor

@srimanob Between the two releases you mention, our module renaming PR (#39244) was merged. The l1ct* objects are now called l1t*; maybe this has not been propagated everywhere in the HLT code, and was not covered by the automatic PR tests.
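
One way to check whether such a rename has been propagated is to scan configuration text for labels that still carry the old prefix. This is only a sketch under stated assumptions: the helper `stale_labels` is hypothetical, the configuration snippet is made up for illustration (only `l1ctLayer1EG` appears in the logs of this thread), and no claim is made about what the new names actually are.

```python
import re

# Hypothetical configuration snippet; labels other than "l1ctLayer1EG"
# are invented purely for illustration.
SAMPLE_CONFIG = '''
inputTag = cms.InputTag("l1ctLayer1EG", "L1TkEleEE")
other = cms.InputTag("l1tSomeRenamedProducer")
taus = cms.InputTag("l1ctHypotheticalTaus")
'''

# Quoted strings that start with the old "l1ct" prefix.
OLD_PREFIX_RE = re.compile(r'''["'](l1ct\w*)["']''')


def stale_labels(config_text, pattern=OLD_PREFIX_RE):
    """Return (line_number, label) pairs for labels still using the old prefix."""
    hits = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


if __name__ == "__main__":
    for lineno, label in stale_labels(SAMPLE_CONFIG):
        print(f"line {lineno}: {label}")
```

A text scan like this only finds string literals; labels built at run time or inherited through cloned modules would need a check on the fully expanded configuration instead.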

@srimanob
Contributor Author

Hi @Martin-Grunewald @cecilecaillol

The issue in the PR test seems to happen only when running step 2 (DIGI-L1Track-L1-HLT@relval2026). If HLT@relval2026 is rerun, i.e. in the 39434.75 wf, the issue does not happen. This explains why we did not spot any issue in the IB test. It makes me worry about what exactly we are running in the .75 wf, and why the behavior of the HLT is different. I would expect the same result, independent of when the HLT runs.

@srimanob srimanob changed the title from "Update HGCAL pset for Phase-2 HLT menu" to "[Draft] Update HGCAL pset for Phase-2 HLT menu" Sep 22, 2022
@srimanob srimanob marked this pull request as draft September 22, 2022 11:46
@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-39450/32213

@srimanob
Contributor Author

test parameters:

  • workflow = 39434.76
  • relvals_opt = --what cleanedupgrade,standard,highstats,pileup,generator,extendedgen,production,ged,machine,premix

@srimanob
Contributor Author

@cmsbuild please test

@srimanob
Contributor Author

Example of HLT Validation from this PR: https://tinyurl.com/2kj6bdub
Screen Shot 2565-09-22 at 13 36 55

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9f5e09/27735/summary.html
COMMIT: 96908ae
CMSSW: CMSSW_12_6_X_2022-09-21-2300/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/39450/27735/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-9f5e09/39434.76_TTbar_14TeV+2026D88_HLTwDIGI75e33+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTrigger+RecoGlobal+HARVESTGlobal

Summary:

  • You potentially added 487 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 23 differences found in the comparisons
  • DQMHistoTests: Total files compared: 51
  • DQMHistoTests: Total histograms compared: 3624368
  • DQMHistoTests: Total failures: 25
  • DQMHistoTests: Total nulls: 4
  • DQMHistoTests: Total successes: 3624317
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 338738.658 KiB( 50 files compared)
  • DQMHistoSizes: changed ( 23234.0,... ): 9916.404 KiB HLT/Vertexing
  • DQMHistoSizes: changed ( 23234.0,... ): 9270.895 KiB HLT/Muon
  • DQMHistoSizes: changed ( 23234.0,... ): 7461.538 KiB HLT/BTV
  • DQMHistoSizes: changed ( 23234.0,... ): 5317.474 KiB HLT/SUSYBSM
  • DQMHistoSizes: changed ( 23234.0,... ): 3668.599 KiB HLT/BPH
  • DQMHistoSizes: changed ( 23234.0,... ): 3487.345 KiB HLT/Tracking
  • DQMHistoSizes: changed ( 23234.0,... ): 1747.479 KiB HLT/EGM
  • DQMHistoSizes: changed ( 23234.0,... ): 1280.052 KiB HLT/HCAL
  • DQMHistoSizes: changed ( 23234.0,... ): 156.054 KiB HLT/Higgs
  • DQMHistoSizes: changed ( 23234.0,... ): 55.202 KiB HLT/Exotica
  • DQMHistoSizes: changed ( 23234.0 ): ...
  • Checked 212 log files, 49 edm output root files, 51 DQM output files
  • TriggerResults: found differences in 8 / 50 workflows
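
As a quick sanity check on the DQMHistoTests numbers above, the per-category counts can be added up and compared with the total. This sketch assumes the four categories (successes, failures, nulls, skipped) are mutually exclusive, which the summary does not state explicitly; the function names are hypothetical.

```python
# Counts copied from the comparison summary above.
COUNTS = {"successes": 3624317, "failures": 25, "nulls": 4, "skipped": 22}
TOTAL_COMPARED = 3624368


def check_totals(counts, total):
    """True if the per-category counts add up to the reported total."""
    return sum(counts.values()) == total


def failure_fraction(counts, total):
    """Fraction of compared histograms that were reported as failures."""
    return counts["failures"] / total


if __name__ == "__main__":
    print("counts consistent:", check_totals(COUNTS, TOTAL_COMPARED))
    print(f"failure fraction: {failure_fraction(COUNTS, TOTAL_COMPARED):.2e}")
```

Under that assumption the counts are consistent with the reported total, and the failures are a very small fraction of the histograms compared; the larger effect in this summary is the added histogram memory.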

@missirol
Contributor

@srimanob

Looks like we are adding thousands of empty DQM histograms in the wfs in question. The reason, at least in part, is that most of the HLT offline DQM hard-codes the names of Run-3 HLT collections and/or Paths. It's up to DQM to decide, but it looks like a more careful configuration would be needed.

@srimanob
Contributor Author

Hi @missirol

Thanks. I noticed that too. Since the change affects the Phase-2 validation sequence, it adds more empty histograms.
Once we agree on what should be done, I will remove the DQM part; it can go into a follow-up PR if I can find a way to limit these histograms in the general workflows.

@missirol
Contributor

I think when we agree on what to be done, I will remove DQM part and it can be a follow PR if I can find the way to limit them in general workflows.

Regarding the scope of this PR, I thought the plan in #39450 (comment) was sensible (without the DQM updates that add many empty histograms).

Of course, the real development will be to (1) update the HLT Phase-2 to avoid any naming clashes with RECO or other steps, and any duplication of modules after this renaming, and (2) add monitoring for the collections in the HLT Phase-2 menu to spot issues like the one in #39323.

@srimanob
Contributor Author

Replaced by the new PR #39733.

@srimanob srimanob closed this Oct 14, 2022
@srimanob srimanob mentioned this pull request Jun 7, 2023