-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adapt to new range for BDT output for Run3 Saturated regression #37100
Adapt to new range for BDT output for Run3 Saturated regression #37100
Conversation
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37100/28595
|
A new Pull Request was created by @swagata87 (Swagata Mukherjee) for master. It involves the following packages:
@jpata, @cmsbuild, @clacaputo, @slava77 can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
urgent |
Per my understanding, it's better if the default is changed directly for Run3 and onwards, rather than applying it with a modifier. |
I was thinking about the same, but from this comment[1] inside the config it seemed like there was some reason to do it like this, so I used a modifier for now. Hope its fine for this urgent purpose, and we can improve in later release.. I wanted to figure out / better understand why it's done this way, but might need some time.. [1] cmssw/RecoEgamma/EgammaTools/python/regressionModifier_cfi.py Lines 238 to 240 in 6558221
|
the release is next week, so you have some time still to address it. I'm not able to understand what the comment you referenced above means - can you unpack it? |
nevermind my comment above, it was clarified in the ORP mattermost. sorry for the confusion.
|
@cmsbuild please test |
@tvami for the "urgent", can you clarify the reasoning, and if it differs from the 12_3_0_pre6 timescale of next week? |
Hi @jpata
this all should actually come with a GT change as well, I'm working on it and will submit the PR soon (condition request came in during ORP). Further urgency comes into the picture that the JEC is to be updated as well (but it was not yet requested from the JetMet group), i.e. we need another GT change after the one that I'm working on. If this PR merges only on next Tuesday, we'll be late with the follow-up PR to this. I think testing as it is might not have the outcome we are looking for, since all the private testing was done with the new conditions. |
thanks for clarifying, Joosep.
This way it would be cleaner, less confusing, but more number of lines to be added compared to what it is now. I guess that's okay, for the sake of clarity.. |
Here is the PR |
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-033012/22743/summary.html Comparison SummaryThe workflows 1001.0, 1000.0, 136.88811, 136.874, 136.8311, 136.793, 136.7611, 136.731, 4.22 have different files in step1_dasquery.log than the ones found in the baseline. You may want to check and retrigger the tests if necessary. You can check it in the "files" directory in the results of the comparisons Summary:
|
(run2_egamma_2016 | run2_egamma_2017 | run2_egamma_2018).toReplaceWith(regressionModifier,regressionModifier106XUL) | ||
|
||
from Configuration.Eras.Modifier_run3_common_cff import run3_common | ||
run3_common.toModify(regressionModifier106XUL.eleRegs.ecalOnlyMean, | ||
rangeMinHighEt = 0.2, | ||
rangeMaxHighEt = 2.0 | ||
) | ||
|
||
run3_common.toModify(regressionModifier106XUL.phoRegs.ecalOnlyMean, | ||
rangeMinHighEt = 0.2, | ||
rangeMaxHighEt = 2.0 | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could something similar also work? (Please check before implementing, if you decide so)
(run2_egamma_2016 | run2_egamma_2017 | run2_egamma_2018).toReplaceWith(regressionModifier,regressionModifier106XUL) | |
from Configuration.Eras.Modifier_run3_common_cff import run3_common | |
run3_common.toModify(regressionModifier106XUL.eleRegs.ecalOnlyMean, | |
rangeMinHighEt = 0.2, | |
rangeMaxHighEt = 2.0 | |
) | |
run3_common.toModify(regressionModifier106XUL.phoRegs.ecalOnlyMean, | |
rangeMinHighEt = 0.2, | |
rangeMaxHighEt = 2.0 | |
) | |
regressionModifierRun2 = regressionModifier106XUL.clone() | |
from Configuration.Eras.Modifier_run2_egamma_2016_cff import run2_egamma_2016 | |
from Configuration.Eras.Modifier_run2_egamma_2017_cff import run2_egamma_2017 | |
from Configuration.Eras.Modifier_run2_egamma_2018_cff import run2_egamma_2018 | |
(run2_egamma_2016 | run2_egamma_2017 | run2_egamma_2018).toReplaceWith(regressionModifier,regressionModifierRun2) | |
regressionModifierRun3 = regressionModifierRun2.clone( | |
eleRegs = dict( | |
ecalOnlyMean = dict( | |
rangeMinHighEt = 0.2, | |
rangeMaxHighEt = 2.0 | |
) | |
), | |
phoRegs = dict( | |
ecalOnlyMean = dict( | |
rangeMinHighEt = 0.2, | |
rangeMaxHighEt = 2.0 | |
) | |
) | |
) | |
from Configuration.Eras.Modifier_run3_common_cff import run3_common | |
run3_common.toReplaceWith(regressionModifier,regressionModifierRun3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's wrong with the original? except for the fact that there we should just have run3_common.toModify(regressionModifier.phoRegs.ecalOnlyMean
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the original one, ie regressionModifier
clones regressionModifier94X
,
regressionModifier = regressionModifier94X.clone() |
and
regressionModifier94X
is incompatible with Run3 as it uses EGRegressionModifierV2
, while we need EGRegressionModifierV3
.cmssw/RecoEgamma/EgammaTools/python/regressionModifier_cfi.py
Lines 183 to 185 in 6558221
regressionModifier94X = \ | |
cms.PSet( modifierName = cms.string('EGRegressionModifierV2'), | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but it's already replaced at that point in the line above
(run2_egamma_2016 | run2_egamma_2017 | run2_egamma_2018).toReplaceWith(regressionModifier,regressionModifier106XUL)
(run2_egamma_2018
is active in the Run3
era)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but it's already replaced at that point in the line above
(run2_egamma_2016 | run2_egamma_2017 | run2_egamma_2018).toReplaceWith(regressionModifier,regressionModifier106XUL)
to be more obvious that line could be updated to be
(run2_egamma_2016 | run2_egamma_2017 | run2_egamma_2018 | run3_common).toReplaceWith ...
(my other comment about introducing a run3_egamma
instead still applies; although that one is probably more for convenience of finding egamma-related changes)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for explaining, Slava. I gave it a try[1].
so the way I originally implemented it, and the way you suggested, both have an unintended effect (discussed here also #37102 (comment)), that new parameter values gets propagated to low pT electron regression too. (I checked it by doing edmConfigDump
after implementing [1]).
So now I will try the way Andrea suggested. And I will also try to add run3_egamma
as per your suggestion.
[1]
(run2_egamma_2016 | run2_egamma_2017 | run2_egamma_2018 | run3_common).toReplaceWith(regressionModifier,regressionModifier106XUL)
run3_common.toModify(regressionModifier.eleRegs.ecalOnlyMean,
rangeMinHighEt = 0.2,
rangeMaxHighEt = 2.0
)
run3_common.toModify(regressionModifier.phoRegs.ecalOnlyMean,
rangeMinHighEt = 0.2,
rangeMaxHighEt = 2.0
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both have an unintended effect (discussed here also #37102 (comment)), that new parameter values gets propagated to low pT electron regression too.
uhm, what about
(run2_egamma_2016 | run2_egamma_2017 | run2_egamma_2018 | run3_common).toReplaceWith(regressionModifier,regressionModifier106XUL.clone())
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@slava77 I don't think it can work.
For Run3 the regressionModifier has to be further modified: it cannot be the same as for Run2.
(run2_egamma_2016 | run2_egamma_2017 | run2_egamma_2018).toReplaceWith(regressionModifier,regressionModifier106XUL) | ||
|
||
from Configuration.Eras.Modifier_run3_common_cff import run3_common | ||
run3_common.toModify(regressionModifier106XUL.eleRegs.ecalOnlyMean, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should there be a run3_egamma
instead, just to follow the old patterns?
Added |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37100/28630
|
Pull request #37100 was updated. @perrotta, @clacaputo, @cmsbuild, @slava77, @jpata, @qliphy, @fabiocos, @davidlange6 can you please check and sign again. |
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-033012/22771/summary.html Comparison SummarySummary:
|
+reconstruction |
+1 |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged. |
PR description:
This PR changes the range of BDT output of saturated e/gamma regression.
Last time when saturated e/gamma regression was trained (several years ago during Run2), the BDT output range for correction factors was -1 to 3. We have now retrained the regression for Run3, and we have used range 0.2 to 2.0 while training. This is because,
Since the range was changed in training, a change in CMSSW is also needed. It is done using era modifier, so that Run2 UL still have the old range [-1,3] compatible with old training, and Run3 gets the new range [0.2,2] compatible with new training. The GT update for regression tags should go into the same release as well. That is progressing in parallel.
Related github issue: cms-sw/cmssw#36886
Talk describing the issues that were fixed + the outstanding issues: https://indico.cern.ch/event/1127851/contributions/4733937/attachments/2390919/4087216/EGM_RegressionTags_Changes.pdf
PR validation:
From
CMSSW_12_3_0_pre5
, merged this branch and ran the AOD step[1] on this dataset[2] using updated conditions[3] and then ran the miniAOD step[4] on the output AOD. Then made response plots using genmatchedslimmedElectrons
, and compared that with Run2 regression available in old releases and old GTs. Response looks reasonable and as expected.[1]
cmsDriver.py --filein file:/eos/cms/blah.root --fileout file:AOD.root --mc --eventcontent AODSIM --runUnscheduled --customise Configuration/DataProcessing/Utils.addMonitoring --datatier AODSIM --conditions 123X_mcRun3_2021_realistic_Candidate_2022_03_01_08_47_07 --step RAW2DIGI,RECO --geometry DB:Extended --era Run3 --python_filename aod_cfg.py --beamspot Run3RoundOptics25ns13TeVLowSigmaZ --no_exec -n -1
[2]
/RelValZpToEE_m6000_14TeV/CMSSW_12_3_0_pre5-123X_mcRun3_2021_realistic_v6-v1/GEN-SIM-DIGI-RAW
[3] https://cms-conddb.cern.ch/cmsDbBrowser/diff/Prod/gts/123X_mcRun3_2021_realistic_Candidate_2022_03_01_08_47_07/123X_mcRun3_2021_realistic_v9
[4]
cmsDriver.py --filein file:/eos/cms/blah.root --fileout file:Mini.root --mc --eventcontent MINIAODSIM --runUnscheduled --customise Configuration/DataProcessing/Utils.addMonitoring --datatier MINIAODSIM --conditions 123X_mcRun3_2021_realistic_Candidate_2022_03_01_08_47_07 --step PAT --geometry DB:Extended --era Run3 --python_filename miniaod_cfg.py --beamspot Run3RoundOptics25ns13TeVLowSigmaZ --no_exec -n -1
This PR is not a backport.
A backport to 12_2 is not meaningful anymore, as 12_2 MC production for 0.5B POG samples has already started. So most probably we won't do any backport of this PR.