Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adapt to new range for BDT output for Run3 Saturated regression #37100

Merged
merged 2 commits into from
Mar 3, 2022

Conversation

swagata87
Copy link
Contributor

PR description:

This PR changes the range of BDT output of saturated e/gamma regression.

Last time when saturated e/gamma regression was trained (several years ago during Run2), the BDT output range for correction factors was -1 to 3. We have now retrained the regression for Run3, and we have used range 0.2 to 2.0 while training. This is because,

  • the normal(as in, non-saturated) e/gamma regression currently use 0.2-2.0 range, so this way everything becomes aligned.
  • Also not sure why in the past the range started from negative values of correction factors. The current range makes more sense.

Since the range was changed in training, a change in CMSSW is also needed. It is done using era modifier, so that Run2 UL still have the old range [-1,3] compatible with old training, and Run3 gets the new range [0.2,2] compatible with new training. The GT update for regression tags should go into the same release as well. That is progressing in parallel.

Related github issue: cms-sw/cmssw#36886
Talk describing the issues that were fixed + the outstanding issues: https://indico.cern.ch/event/1127851/contributions/4733937/attachments/2390919/4087216/EGM_RegressionTags_Changes.pdf

PR validation:

From CMSSW_12_3_0_pre5, merged this branch and ran the AOD step[1] on this dataset[2] using updated conditions[3] and then ran the miniAOD step[4] on the output AOD. Then made response plots using genmatched slimmedElectrons, and compared that with Run2 regression available in old releases and old GTs. Response looks reasonable and as expected.

compare_Saturated_ZPrimeEE_Response_EcalOnly_Barrel

[1] cmsDriver.py --filein file:/eos/cms/blah.root --fileout file:AOD.root --mc --eventcontent AODSIM --runUnscheduled --customise Configuration/DataProcessing/Utils.addMonitoring --datatier AODSIM --conditions 123X_mcRun3_2021_realistic_Candidate_2022_03_01_08_47_07 --step RAW2DIGI,RECO --geometry DB:Extended --era Run3 --python_filename aod_cfg.py --beamspot Run3RoundOptics25ns13TeVLowSigmaZ --no_exec -n -1

[2] /RelValZpToEE_m6000_14TeV/CMSSW_12_3_0_pre5-123X_mcRun3_2021_realistic_v6-v1/GEN-SIM-DIGI-RAW

[3] https://cms-conddb.cern.ch/cmsDbBrowser/diff/Prod/gts/123X_mcRun3_2021_realistic_Candidate_2022_03_01_08_47_07/123X_mcRun3_2021_realistic_v9

[4] cmsDriver.py --filein file:/eos/cms/blah.root --fileout file:Mini.root --mc --eventcontent MINIAODSIM --runUnscheduled --customise Configuration/DataProcessing/Utils.addMonitoring --datatier MINIAODSIM --conditions 123X_mcRun3_2021_realistic_Candidate_2022_03_01_08_47_07 --step PAT --geometry DB:Extended --era Run3 --python_filename miniaod_cfg.py --beamspot Run3RoundOptics25ns13TeVLowSigmaZ --no_exec -n -1

This PR is not a backport.
A backport to 12_2 is not meaningful anymore, as 12_2 MC production for 0.5B POG samples has already started. So most probably we won't do any backport of this PR.

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 1, 2022

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37100/28595

  • This PR adds an extra 16KB to repository

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 1, 2022

A new Pull Request was created by @swagata87 (Swagata Mukherjee) for master.

It involves the following packages:

  • RecoEgamma/EgammaTools (reconstruction)

@jpata, @cmsbuild, @clacaputo, @slava77 can you please review it and eventually sign? Thanks.
@Sam-Harper, @jainshilpi, @valsdav, @lgray, @sobhatta, @afiqaize, @wrtabb, @varuns23, @ram1123 this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@tvami
Copy link
Contributor

tvami commented Mar 1, 2022

urgent

@cmsbuild cmsbuild added the urgent label Mar 1, 2022
@jpata
Copy link
Contributor

jpata commented Mar 1, 2022

Per my understanding, it's better if the default is changed directly for Run3 and onwards, rather than applying it with a modifier.

@swagata87
Copy link
Contributor Author

swagata87 commented Mar 1, 2022

I was thinking about the same, but from this comment[1] inside the config it seemed like there was some reason to do it like this, so I used a modifier for now. Hope its fine for this urgent purpose, and we can improve in later release.. I wanted to figure out / better understand why it's done this way, but might need some time..

[1]

#by default we use the regression inappropriate to the main purpose of this release
#life is simplier that way
regressionModifier = regressionModifier94X.clone()

@jpata
Copy link
Contributor

jpata commented Mar 1, 2022

the release is next week, so you have some time still to address it. I'm not able to understand what the comment you referenced above means - can you unpack it?

@jpata
Copy link
Contributor

jpata commented Mar 1, 2022

nevermind my comment above, it was clarified in the ORP mattermost. sorry for the confusion.

  • generally in reco, we try to keep the configurations for different eras as consistent as possible. see e.g. [MiniAOD] Lower AK4 Puppi jets pT cut for Run 3 #36890 (comment). Explicitly, there is no commitment that a new release will reproduce the old behaviour / cuts of e.g. Run2 by default.
  • in this particular case, since this new egamma model was never trained for Run2, clearly one cannot apply the new ranges by default also in run2
  • as was noted, and I concur, that the following logic run3_common.toModify(regressionModifier106XUL seems rather confusing, and the code comment by default we use the regression inappropriate to the main purpose of this release, life is simplier that way does not help.

@jpata
Copy link
Contributor

jpata commented Mar 1, 2022

@cmsbuild please test

@jpata
Copy link
Contributor

jpata commented Mar 1, 2022

@tvami for the "urgent", can you clarify the reasoning, and if it differs from the 12_3_0_pre6 timescale of next week?

@tvami
Copy link
Contributor

tvami commented Mar 1, 2022

Hi @jpata

can you clarify the reasoning, and if it differs from the 12_3_0_pre6 timescale of next week?

this all should actually come with a GT change as well, I'm working on it and will submit the PR soon (condition request came in during ORP).

Further urgency comes into the picture that the JEC is to be updated as well (but it was not yet requested from the JetMet group), i.e. we need another GT change after the one that I'm working on. If this PR merges only on next Tuesday, we'll be late with the follow-up PR to this.

I think testing as it is might not have the outcome we are looking for, since all the private testing was done with the new conditions.

@swagata87
Copy link
Contributor Author

thanks for clarifying, Joosep.
After the jenkins test finish, I will try to improve this part -> run3_common.toModify(regressionModifier106XUL -> which is confusing.
One option is to add another block for Run3, like this:

regressionModifierRun3 = cms.PSet(
    modifierName = cms.string('EGRegressionModifierV3'),  
    .....

This way it would be cleaner, less confusing, but more number of lines to be added compared to what it is now. I guess that's okay, for the sake of clarity..

@tvami
Copy link
Contributor

tvami commented Mar 1, 2022

I'm working on it and will submit the PR soon (condition request came in during ORP).

Here is the PR
#37102

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 1, 2022

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-033012/22743/summary.html
COMMIT: 8f685bd
CMSSW: CMSSW_12_3_X_2022-03-01-1100/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/37100/22743/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

The workflows 1001.0, 1000.0, 136.88811, 136.874, 136.8311, 136.793, 136.7611, 136.731, 4.22 have different files in step1_dasquery.log than the ones found in the baseline. You may want to check and retrigger the tests if necessary. You can check it in the "files" directory in the results of the comparisons

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 518 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 4000857
  • DQMHistoTests: Total failures: 95
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 4000739
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -0.004 KiB( 48 files compared)
  • DQMHistoSizes: changed ( 312.0 ): -0.004 KiB MessageLogger/Warnings
  • Checked 204 log files, 45 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

Comment on lines 247 to 259
(run2_egamma_2016 | run2_egamma_2017 | run2_egamma_2018).toReplaceWith(regressionModifier,regressionModifier106XUL)

from Configuration.Eras.Modifier_run3_common_cff import run3_common
run3_common.toModify(regressionModifier106XUL.eleRegs.ecalOnlyMean,
rangeMinHighEt = 0.2,
rangeMaxHighEt = 2.0
)

run3_common.toModify(regressionModifier106XUL.phoRegs.ecalOnlyMean,
rangeMinHighEt = 0.2,
rangeMaxHighEt = 2.0
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could something similar also work? (Please check before implementing, if you decide so)

Suggested change
(run2_egamma_2016 | run2_egamma_2017 | run2_egamma_2018).toReplaceWith(regressionModifier,regressionModifier106XUL)
from Configuration.Eras.Modifier_run3_common_cff import run3_common
run3_common.toModify(regressionModifier106XUL.eleRegs.ecalOnlyMean,
rangeMinHighEt = 0.2,
rangeMaxHighEt = 2.0
)
run3_common.toModify(regressionModifier106XUL.phoRegs.ecalOnlyMean,
rangeMinHighEt = 0.2,
rangeMaxHighEt = 2.0
)
regressionModifierRun2 = regressionModifier106XUL.clone()
from Configuration.Eras.Modifier_run2_egamma_2016_cff import run2_egamma_2016
from Configuration.Eras.Modifier_run2_egamma_2017_cff import run2_egamma_2017
from Configuration.Eras.Modifier_run2_egamma_2018_cff import run2_egamma_2018
(run2_egamma_2016 | run2_egamma_2017 | run2_egamma_2018).toReplaceWith(regressionModifier,regressionModifierRun2)
regressionModifierRun3 = regressionModifierRun2.clone(
eleRegs = dict(
ecalOnlyMean = dict(
rangeMinHighEt = 0.2,
rangeMaxHighEt = 2.0
)
),
phoRegs = dict(
ecalOnlyMean = dict(
rangeMinHighEt = 0.2,
rangeMaxHighEt = 2.0
)
)
)
from Configuration.Eras.Modifier_run3_common_cff import run3_common
run3_common.toReplaceWith(regressionModifier,regressionModifierRun3

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's wrong with the original? except for the fact that there we should just have run3_common.toModify(regressionModifier.phoRegs.ecalOnlyMean

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the original one, ie regressionModifier clones regressionModifier94X,

regressionModifier = regressionModifier94X.clone()

and regressionModifier94X is incompatible with Run3 as it uses EGRegressionModifierV2, while we need EGRegressionModifierV3.
regressionModifier94X = \
cms.PSet( modifierName = cms.string('EGRegressionModifierV2'),

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but it's already replaced at that point in the line above
(run2_egamma_2016 | run2_egamma_2017 | run2_egamma_2018).toReplaceWith(regressionModifier,regressionModifier106XUL)

(run2_egamma_2018 is active in the Run3 era)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but it's already replaced at that point in the line above
(run2_egamma_2016 | run2_egamma_2017 | run2_egamma_2018).toReplaceWith(regressionModifier,regressionModifier106XUL)

to be more obvious that line could be updated to be
(run2_egamma_2016 | run2_egamma_2017 | run2_egamma_2018 | run3_common).toReplaceWith ...

(my other comment about introducing a run3_egamma instead still applies; although that one is probably more for convenience of finding egamma-related changes)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for explaining, Slava. I gave it a try[1].
so the way I originally implemented it, and the way you suggested, both have an unintended effect (discussed here also #37102 (comment)), that new parameter values gets propagated to low pT electron regression too. (I checked it by doing edmConfigDump after implementing [1]).

So now I will try the way Andrea suggested. And I will also try to add run3_egamma as per your suggestion.

[1]

(run2_egamma_2016 | run2_egamma_2017 | run2_egamma_2018 | run3_common).toReplaceWith(regressionModifier,regressionModifier106XUL)

run3_common.toModify(regressionModifier.eleRegs.ecalOnlyMean,
      rangeMinHighEt = 0.2,
      rangeMaxHighEt = 2.0
)

run3_common.toModify(regressionModifier.phoRegs.ecalOnlyMean,
      rangeMinHighEt = 0.2,
      rangeMaxHighEt = 2.0
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

both have an unintended effect (discussed here also #37102 (comment)), that new parameter values gets propagated to low pT electron regression too.

uhm, what about
(run2_egamma_2016 | run2_egamma_2017 | run2_egamma_2018 | run3_common).toReplaceWith(regressionModifier,regressionModifier106XUL.clone()) ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@slava77 I don't think it can work.
For Run3 the regressionModifier has to be further modified: it cannot be the same as for Run2.

Comment on lines 247 to 250
(run2_egamma_2016 | run2_egamma_2017 | run2_egamma_2018).toReplaceWith(regressionModifier,regressionModifier106XUL)

from Configuration.Eras.Modifier_run3_common_cff import run3_common
run3_common.toModify(regressionModifier106XUL.eleRegs.ecalOnlyMean,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should there be a run3_egamma instead, just to follow the old patterns?

@swagata87
Copy link
Contributor Author

swagata87 commented Mar 2, 2022

Added run3_egamma era modifier, fixed the issue discussed in #37102 (comment).
Tested on 2000 Z'->EE events from AOD using gedGsfElectrons collection.
Response look as expected.
PR_Saturated_ZPrimeEE_Response_EcalOnly_Barrel

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 2, 2022

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37100/28630

  • This PR adds an extra 20KB to repository

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 2, 2022

Pull request #37100 was updated. @perrotta, @clacaputo, @cmsbuild, @slava77, @jpata, @qliphy, @fabiocos, @davidlange6 can you please check and sign again.

@tvami
Copy link
Contributor

tvami commented Mar 2, 2022

@cmsbuild , please test with #37102

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 2, 2022

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-033012/22771/summary.html
COMMIT: 10660de
CMSSW: CMSSW_12_3_X_2022-03-01-2300/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/37100/22771/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 8 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 4000857
  • DQMHistoTests: Total failures: 13
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 4000821
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -0.004 KiB( 48 files compared)
  • DQMHistoSizes: changed ( 312.0 ): -0.004 KiB MessageLogger/Warnings
  • Checked 204 log files, 45 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@jpata
Copy link
Contributor

jpata commented Mar 3, 2022

+reconstruction

@francescobrivio
Copy link
Contributor

@cms-sw/orp-l2 do you agree with the latest proposed solution? Both tests here and in #37102 are clean now.

(Sorry for this ping, I just wanted to point out that this should be merged together with #37102).

@perrotta
Copy link
Contributor

perrotta commented Mar 3, 2022

+1

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 3, 2022

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants