[14_0_X SIM] ZDC problem in hlt_mc_HIon test #43582

Open
civanch opened this issue Dec 16, 2023 · 44 comments

@civanch
Contributor

civanch commented Dec 16, 2023

PR #43576 enables the production of ZDC hits, which causes a problem in the hlt_mc_HIon addOn test. Temporary ZDC hits are masked until the problem will be solved.

@cmsbuild
Contributor

cmsbuild commented Dec 16, 2023

cms-bot internal usage

@cmsbuild
Contributor

A new Issue was created by @civanch Vladimir Ivantchenko.

@Dr15Jones, @smuzaffar, @sextonkennedy, @antoniovilela, @makortel, @rappoccio can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@civanch
Contributor Author

civanch commented Dec 16, 2023

assign simulation

@cmsbuild
Contributor

New categories assigned: simulation

@civanch,@mdhildreth you have been requested to review this Pull request/Issue and eventually sign? Thanks

@mmusich
Contributor

mmusich commented Dec 16, 2023

@cms-sw/hlt-l2 FYI

@stahlleiton
Contributor

I am currently seeing a similar problem when running the HI workflow in the latest CMSSW_14_1_X IB:

An exception of category 'Conditions not found' occurred while
[0] Processing Event run: 1 lumi: 1 event: 33 stream: 0
[1] Running path 'HLTAnalyzerEndpath'
[2] Prefetching for module L1TRawToDigi/'hltGtStage2Digis'
[3] Prefetching for module RawDataCollectorByLabel/'rawDataCollector'
[4] Prefetching for module SiStripDigiToRawModule/'SiStripDigiToRaw'
[5] Calling method for module MixingModule/'mix'
Exception Message:
Unavailable Conditions of type HcalMCParams for cell (0x54000140) (Det 5:5 subdet 2:2 ZDC+ UNKNOWN 0,0)

To reproduce the problem I did:

cmsrel CMSSW_14_1_X_2024-03-29-1100
cd CMSSW_14_1_X_2024-03-29-1100/src/
cmsenv

cmsDriver.py Configuration/Generator/python/Starlight_DoubleDiffraction_5p36TeV_cfi.py -s LHE,GEN,SIM -n 40 --conditions auto:phase1_2023_realistic_hi --beamspot Realistic2022PbPbCollision --datatier GEN-SIM --eventcontent RAWSIM --era Run3_pp_on_PbPb_2023 --geometry DB:Extended --relval 9000,150 --fileout file:step1.root

cmsDriver.py step2 -s DIGI:pdigi_hi_nogen,L1,DIGI2RAW,HLT:@Fake2 --conditions auto:phase1_2023_realistic_hi --datatier GEN-SIM-DIGI-RAW-HLTDEBUG --eventcontent FEVTDEBUGHLT --era Run3_pp_on_PbPb_2023 -n -1 --pileup HiMixNoPU --filein file:step1.root --fileout file:step2.root

@mmusich
Contributor

mmusich commented Mar 29, 2024

@cms-sw/hcal-dpg-l2 FYI

@abdoulline

abdoulline commented Mar 30, 2024

This is to explicitly include Sunanda ( @bsunanda ).
Since Nov. 2023 there is a new HcalZDCDetId definition for Run 3, which is not yet used (due to some issues) either for the Geometry initialization or for the DB conditions (to add the new Run 3 ZDC channels). This should be fixed soon, as was discussed elsewhere (email, HCAL DPG meeting).

@makortel
Contributor

Starting from CMSSW_14_1_X_2024-04-24-2300, the exception

----- Begin Fatal Exception 26-Apr-2024 10:58:14 CEST-----------------------
An exception of category 'Conditions not found' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 8 stream: 1
   [1] Running path 'HLTAnalyzerEndpath'
   [2] Prefetching for module L1TRawToDigi/'hltGtStage2Digis'
   [3] Prefetching for module RawDataCollectorByLabel/'rawDataCollector'
   [4] Prefetching for module SiStripDigiToRawModule/'SiStripDigiToRaw'
   [5] Calling method for module MixingModule/'mix'
Exception Message:
Unavailable Conditions of type HcalMCParams for cell (0x54000140) (Det 5:5 subdet 2:2 ZDC+ UNKNOWN 0,0)
----- End Fatal Exception -------------------------------------------------

occurs frequently (but not always) in step2 of workflows 180.1 and 181.1 (which were added/enabled in that IB).
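Since the failure is intermittent, it can help to collect the failing event and cell id from many job logs. The following is only a minimal sketch, assuming the log text has exactly the shape shown above; the function name `find_missing_conditions` and the sample string are illustrative and not part of CMSSW.

```python
import re

# Matches the message line shown in the log above, e.g.
# "Unavailable Conditions of type HcalMCParams for cell (0x54000140) (Det ..."
MESSAGE_RE = re.compile(
    r"Unavailable Conditions of type (?P<type>\w+) for cell \((?P<cell>0x[0-9a-fA-F]+)\)"
)
# Matches the "[0] Processing  Event run: 1 lumi: 1 event: 8 stream: 1" context line.
# The amount of whitespace after "Processing" differs between the logs in this thread.
EVENT_RE = re.compile(
    r"Processing\s+Event run: (?P<run>\d+) lumi: (?P<lumi>\d+) event: (?P<event>\d+)"
)


def find_missing_conditions(log_text):
    """Return one (run, lumi, event, condition type, raw cell id) tuple per failure."""
    failures = []
    last_event = None  # most recent event context seen before a message line
    for line in log_text.splitlines():
        event_match = EVENT_RE.search(line)
        if event_match:
            last_event = tuple(int(event_match.group(k)) for k in ("run", "lumi", "event"))
            continue
        message_match = MESSAGE_RE.search(line)
        if message_match:
            run, lumi, event = last_event if last_event else (None, None, None)
            failures.append(
                (run, lumi, event, message_match.group("type"),
                 int(message_match.group("cell"), 16))
            )
    return failures


# Hypothetical sample, shortened from the exception quoted above.
sample = """\
An exception of category 'Conditions not found' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 8 stream: 1
   [5] Calling method for module MixingModule/'mix'
Exception Message:
Unavailable Conditions of type HcalMCParams for cell (0x54000140) (Det 5:5 subdet 2:2 ZDC+ UNKNOWN 0,0)
"""

for run, lumi, event, cond_type, cell in find_missing_conditions(sample):
    print(f"run={run} lumi={lumi} event={event} type={cond_type} cell=0x{cell:08x}")
```

Running this over the logs of several step2 jobs would show whether the same cell id appears every time and at which events.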

@hjbossi
Contributor

hjbossi commented May 1, 2024

Commenting here to flag that the experts on the HCAL and ZDC side are aware (as flagged above by @abdoulline, @bsunanda) and are still working towards a solution.

@perrotta
Contributor

@civanch @abdoulline all.
The issue is reappearing in the production of the premix samples for the 2024 MC campaign, see gitlab
I read in the initial post of this issue that "Temporary ZDC hits are masked until the problem will be solved": what does it mean? I.e., was a PR merged that actually masked those "temporary ZDC hits"? And, if so, why is that protection apparently not effective now?

@bsunanda
Contributor

bsunanda commented Sep 20, 2024 via email

@abdoulline

abdoulline commented Sep 20, 2024

I must admit I don't have a clear understanding of what's happening...
Initialization of HcalMCParams requires all the cells specified as valid to be involved, including those specified as valid ZDC cells, whether or not ZDC SimHits or Digis are present...
This cell (0x54000140) is illegal...
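
For reference, the "Det 5:5 subdet 2:2" part of the message can be recovered from the raw id with a few bit operations. This is only a sketch, assuming the generic DetId packing (detector in bits 28-31, subdetector in bits 25-27); the function name `decode_det_subdet` is illustrative, and the remaining low bits are ZDC-specific and are deliberately left undecoded here.

```python
# Assumed generic DetId packing: 4-bit detector field at bit 28,
# 3-bit subdetector field at bit 25, subdetector-specific payload below.
DET_OFFSET, DET_MASK = 28, 0xF
SUBDET_OFFSET, SUBDET_MASK = 25, 0x7


def decode_det_subdet(raw_id):
    """Split a raw 32-bit id into (detector, subdetector, low-bit payload)."""
    det = (raw_id >> DET_OFFSET) & DET_MASK
    subdet = (raw_id >> SUBDET_OFFSET) & SUBDET_MASK
    payload = raw_id & ((1 << SUBDET_OFFSET) - 1)  # everything below the subdet field
    return det, subdet, payload


det, subdet, payload = decode_det_subdet(0x54000140)
print(f"det={det} subdet={subdet} payload=0x{payload:x}")
```

The decoded detector and subdetector values agree with what the exception prints, so the suspicious part of the id is confined to the subdetector-specific payload.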

I wonder if it may (or may not) be related to the absence in 14_0_X of a small fix of HcalZDCDetId #45033 which was submitted (as new ZDC-only related) to 14_1_X...

@perrotta
Contributor

> I must admit I don't have a clear understanding of what's happening... This cell (0x54000140) is illegal...
>
> I wonder if it may (or may not) be related to the absence in 14_0_X of a small fix of HcalZDCDetId #45033 which was submitted (as new ZDC-only related) to 14_1_X...

Indeed, that fix was never backported to 14_0_X...
We can try running with it.
Your understanding is that, without such an illegal cell, the protection would have worked in 14_0_X: do I understand correctly?

@bsunanda
Contributor

bsunanda commented Sep 20, 2024 via email

@abdoulline

@perrotta
which kind of protection? 🤔

@civanch
Contributor Author

civanch commented Sep 20, 2024

Let us do it. It should not affect other detectors.

@perrotta
Contributor

I've prepared the backport in #46070

@perrotta
Contributor

> @perrotta which kind of protection? 🤔

I refer to what @civanch wrote in the issue description: "Temporary ZDC hits are masked until the problem will be solved."

@abdoulline

> > @perrotta which kind of protection? 🤔
>
> I refer to what @civanch wrote in the issue description: "Temporary ZDC hits are masked until the problem will be solved."

@civanch could you remind me how it was done?

@civanch
Contributor Author

civanch commented Sep 20, 2024

@abdoulline , there is an era-dependent option, CMStoZDCtransport = True/False.

@bsunanda
Contributor

bsunanda commented Sep 20, 2024 via email

@abdoulline

abdoulline commented Sep 20, 2024

> @abdoulline , there is an era-dependent option, CMStoZDCtransport = True/False.

OK, I see. But the absence of ZDC SimHits does not prevent HcalMCParams from being initialized for all the ZDC cells that are valid according to the Geometry initialization...
The problem is why the cell in question is illegal...

@civanch
Contributor Author

civanch commented Sep 20, 2024

If it is "False", no particle can go through the CMStoZDC volume: all are killed.

@abdoulline

abdoulline commented Sep 20, 2024

> I believe there is something that is percolating not right. The masking in HcalZDCDetId need not be changed. But I do not understand the issue since I did not follow it from the beginning.

@bsunanda

Maybe the ZDCDetId (not fixed in 14_0_X) is not the culprit, as the fix is related to RPD channels, which should not exist in 14_0_X anyway.

I've just dumped, parsed, and re-dumped the txt input of the existing DB conditions (HcalMCParams) with the legacy ZDC (22 ch. EM+HAD+LUM) in 14_0_15, and there is no problem: no illegal numbers.
Both the old (in DB) and the new (after txt re-dumping) ZDCDetIds are legal and there are no "extra" (Run 3) RPD channels...

In the illegal ZDCDetId no ZDC section is defined...
https://cmssdt.cern.ch/lxr/source/DataFormats/HcalDetId/interface/HcalZDCDetId.h#0033
[screenshot]

Wild guess: maybe some old (12_X/13_X) Digis are involved somehow? No, at first glance:
https://cmssdt.cern.ch/lxr/source/Configuration/PyReleaseValidation/python/relval_standard.py#0878

@abdoulline

abdoulline commented Sep 20, 2024

@bsunanda
I've reproduced the issue even with a non-standard setup:
on lxplus9 in CMSSW_14_2_0_pre1, in the 33rd event of step2 of wf 180.1 [1] (the jobs run fast):

Begin processing the 33rd record. Run 1, Event 33, LumiSection 1 on stream 0 at 20-Sep-2024 15:19:40.916 CEST
----- Begin Fatal Exception 20-Sep-2024 15:19:40 CEST-----------------------
An exception of category 'Conditions not found' occurred while
[0] Processing Event run: 1 lumi: 1 event: 33 stream: 0
[1] Running path 'FEVTDEBUGHLToutput_step'
[2] Prefetching for module PoolOutputModule/'FEVTDEBUGHLToutput'
[3] Prefetching for module CSCTriggerPrimitivesProducer/'simCscTriggerPrimitiveDigis'
[4] Prefetching for module CSCDigiProducer/'simMuonCSCDigis'
[5] Calling method for module MixingModule/'mix'
Exception Message:
Unavailable Conditions of type HcalMCParams for cell (0x54000140) (Det 5:5 subdet 2:2 ZDC+ UNKNOWN 0,0)


[1]

cmsDriver.py Configuration/Generator/python/Starlight_DoubleDiffraction_5p36TeV_cfi.py -s LHE,GEN,SIM -n 40 --conditions auto:phase1_2023_realistic_hi --beamspot Realistic2022PbPbCollision --datatier GEN-SIM --eventcontent RAWSIM --era Run3_pp_on_PbPb_2023 --geometry DB:Extended --relval 9000,150 --fileout file:step1.root

cmsDriver.py step2 -s DIGI:pdigi_hi_nogen,L1,DIGI2RAW --conditions auto:phase1_2023_realistic_hi --datatier GEN-SIM-DIGI-RAW-HLTDEBUG --eventcontent FEVTDEBUGHLT --era Run3_pp_on_PbPb_2023 -n -1 --pileup HiMixNoPU --filein file:step1.root --fileout file:step2.root

@abdoulline

abdoulline commented Sep 20, 2024

@bsunanda
And with the ZDC excluded from Digitization by commenting out line [1] (in addition to the checks that we added to this module in your recent ZDC-related PRs), the above step2 runs to the end.

Caveat:
this "happy end" may not be real, as [1] may change the Digi random-number sequence (?).
And [1] may just "sweep the dust under the carpet"...


[1] line 208
https://github.com/cms-sw/cmssw/compare/master...abdoulline:cmssw:ZDCDigitizer_inactivation?expand=1#diff-36c407c1c93d9f2b68d7469a3a54bf684935b30b22a2f454b9b4f9505a4d4871R208

@abdoulline

@bsunanda we need your insight, I'm afraid, about it...
#43582 (comment)

@vlimant
Contributor

vlimant commented Oct 4, 2024

this issue is hitting us again big time for launching the Summer24-24 premix library, and we ought to find a solution to this. I understand that reproducibility is an issue, and we might have to just roll back anything related to ZDC to get out of this, if no solution can be found

@civanch
Contributor Author

civanch commented Oct 4, 2024

I may be wrong, but ZDC is not enabled for pp runs. Do I understand this correctly? Or was this only in the past?

If my understanding is correct, then ZDC is not needed for pp simulations in 2024, nor for the production of the premix library. ZDC should be enabled only for HI simulation. When we enable ZDC for pp, the simulation slows down by a significant factor (about 3-5, if my memory is correct). This happens for MinBias simulation because high-energy hadrons hit the ZDC and a full, very energetic shower is simulated (without Russian roulette or any other shortcut).

So, I am not sure if we should backport ZDC software to 14_0.

@hjbossi
Contributor

hjbossi commented Oct 4, 2024

In 2023, the ZDC was enabled only in the case of the pp reference run occurring just before the heavy-ion run (which will also be the case in 2024). I am not sure if the reference run is handled differently in MC, but we have no need for ZDC pp simulations at 13.6 TeV as the ZDC was not included in these runs.

Hopefully this helps, especially if this is slowing down the simulation.

@abdoulline

abdoulline commented Oct 5, 2024

@bsunanda some additional recent info/observations :

(1) CMSSW_14_0_X_2024-10-05-1100 (the most recent 14_0_X IB) on lxplus8:
wf 180.1 step2 crashes at the already known ev. 33 in exactly the same way as reported earlier in 14_0_X, 14_1_X and 14_2_0_pre1, yet without the most recent ZDC Geometry-related updates, exposing the illegal ZDCDetId:
Unavailable Conditions of type HcalMCParams for cell (0x54000140) (Det 5:5 subdet 2:2 ZDC+ UNKNOWN 0,0)

(2) CMSSW_14_0_X_2024-10-05-1100 + pending PR #46246 (backport of what was recently merged into 14_1_X & 14_2_X):
wf 180.1 step2 goes smoothly through 1000 ev (increased from the default 40 ev)
Looks like an indication of improved ZDCDetId handling (?)

(3) CMSSW_14_0_X_2024-10-05-1100 + commented ZDCDigitizer:
OK, as in case (2) above

@vlimant
Contributor

vlimant commented Oct 6, 2024

just to be a bit clearer : 14.0.17 pilot for the premix library is failing with

cmsRun1
Fatal Exception (Exit Code: 8001)

    An exception of category 'Conditions not found' occurred while
       [0] Processing  Event run: 1 lumi: 78 event: 77465 stream: 2
       [1] Running path 'PREMIXoutput_step'
       [2] Prefetching for module PoolOutputModule/'PREMIXoutput'
       [3] Calling method for module MixingModule/'mix'
    Exception Message:
    Unavailable Conditions of type HcalMCParams for cell (0x54000140) (Det 5:5 subdet 2:2 ZDC+ UNKNOWN 0,0)

cmsDriver.py Configuration/GenProduction/python/PPD-RunIIISummer24PrePremix-00002-fragment.py --fileout file:PPD-RunIIISummer24PrePremix-00002.root --pileup_input "dbs:/MinBias_TuneCP5_13p6TeV-pythia8/RunIII2024Summer24GS-140X_mcRun3_2024_realistic_v20-v1/GEN-SIM" --mc --eventcontent PREMIX --pileup 2024_25ns_RunIII2024Summer24_PoissonOOTPU --datatier PREMIX --conditions 140X_mcRun3_2024_realistic_v21 --step GEN,SIM,DIGI --procModifiers premix_stage1 --nThreads 2 --geometry DB:Extended --era Run3_2024

with Configuration/GenProduction/python/PPD-RunIIISummer24PrePremix-00002-fragment.py

with /MinBias_TuneCP5_13p6TeV-pythia8/RunIII2024Summer24GS-140X_mcRun3_2024_realistic_v20-v1/GEN-SIM produced in 14.0.13 using

cmsDriver.py Configuration/GenProduction/python/PPD-RunIII2024Summer24GS-00002-fragment.py --fileout file:PPD-RunIII2024Summer24GS-00002.root --mc --eventcontent RAWSIM --datatier GEN-SIM --conditions 140X_mcRun3_2024_realistic_v20 --beamspot DBrealistic --step GEN,SIM --nThreads 4 --geometry DB:Extended --era Run3_2024

with Configuration/GenProduction/python/PPD-RunIII2024Summer24GS-00002-fragment.py

which is somehow the topic of the issue here, and hence I am saying that whichever ZDC code was added in 14.0 is interfering and preventing 14.0 from being used for pp simulation.

whichever solution you, the experts, come up with to get this solved is good for us, as long as 14.0 is usable for pp simulation, which we need to launch urgently.

@abdoulline

abdoulline commented Oct 6, 2024

I guess it's unrealistic to reproduce the issue in a private setup 🤔
[0] Processing Event run: 1 lumi: 78 event: 77465 stream: 2

I'm running it in 14_0_17, it takes ~3-4 s/ev on lxplus8...

@perrotta
Contributor

perrotta commented Oct 7, 2024

@abdoulline would you like to pursue your suggestion
#43582 (comment) to exclude ZDC from Digitization? Can a PR be made with it in 14_0_X?
I would avoid adding 1867 lines of code to a closed release just to add some ZDC stuff which is not supposed to be used in pp productions with CMSSW_14_0_X

@abdoulline

@perrotta
Should be able to submit the PR in question around noon...

@bsunanda
Contributor

bsunanda commented Oct 7, 2024 via email

@vlimant
Contributor

vlimant commented Oct 7, 2024

what one can do, and what will likely exhibit the same issue deterministically, is to run the DIGI step on the MB sample, since anything that fails during the DIGI step of the premix sample is due to the MB content.

@abdoulline

...In the meantime (just in case) ZDC Digitizer removal submitted to 14_0_X #46282

@hjbossi
Contributor

hjbossi commented Oct 8, 2024

Just to mention this also here: #46286 w/ backport to 14_0_X #46282

@perrotta
Contributor

perrotta commented Oct 8, 2024

> Just to mention this also here: #46286 w/ backport to 14_0_X #46282

@hjbossi backports are not yet there. #46282 is the original proposal for 14_0_X by @abdoulline , which is now closed because it will be superseded by the future backports of Sunanda's #46286

@malbouis
Contributor

malbouis commented Oct 8, 2024

Hello, I have some very basic questions, sorry if they were already addressed and I missed it. I will list them below. I think it could also help us understand the overall picture of what is going on.

  1. Why is the ZDC geometry included in the SIM of a release that does not use it?
  2. In principle, the ZDC geometry should only be included in the simulation of a release with which we collected Heavy Ion data. Wouldn't it make more sense to have it under a heavy ion Era and really only use it in the heavy ion releases?
  3. If the geometry could be implemented under a heavy ion era in a timely manner, could we proceed like this? Otherwise, wouldn't it be better to just kill any ZDC simhits in 14_0_X, like the PR proposed by Salavat?

@vlimant
Contributor

vlimant commented Oct 8, 2024

I'd rather have #46282 to be on the safe side for 14.0 usability in a short time.

@hjbossi
Contributor

hjbossi commented Oct 8, 2024

> > Just to mention this also here: #46286 w/ backport to 14_0_X #46282
>
> @hjbossi backports are not yet there. #46282 is the original proposal for 14_0_X by @abdoulline , which is now closed because it will be superseded by the future backports of Sunanda's #46286

Ah sorry, you are correct. The correct link is now available here: #46300
