Code to process phase2 version of deepTauID v2p5 [12_6_X] #40724

Merged

Conversation

mbluj
Contributor

@mbluj mbluj commented Feb 8, 2023

PR description:

This PR adds code for a phase-2 version of DeepTau discriminator; it is a backport of #40622.
The neural network structure of the new phase-2 version is identical to that of Run-2/3 DeepTau v2p5, therefore code modifications are small. The changes include:

  • Adding scaling parameters and working points for the new training;
  • A new "year" identifier introduced to separate this new phase-2 training and current Run-2/3 v2p5 (2026 vs 2018);
  • The electron input collection used for the new phase-2 deepTauID is the sum of two collections: slimmedElectrons (gsfElectrons in EB) and slimmedElectronsHGC (HGCal electrons in the endcaps). Notes: 1. The latter collection is called slimmedElectronsFromMultiCl in older CMSSW release series (and in samples produced with them), which can cause problems when running on old samples; 2. The same merged electron collection is used by the old-style anti-electron tau discriminant for phase-2.
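
As a rough illustration of the electron-input point above, the sketch below merges a barrel and an endcap collection and falls back to the older endcap name when the newer one is absent. It is plain Python, not CMSSW configuration code, and the fallback is only one possible workaround for old samples, not something this PR is stated to implement. The helpers `pick_endcap_label` and `merged_electrons`, and the dict standing in for an event, are hypothetical; only the three collection names come from the description.

```python
# Illustrative only: a dict stands in for an event, mapping collection
# labels to lists of electrons. None of this is CMSSW API.

BARREL_LABEL = "slimmedElectrons"
# Newer name first, older name second (both names taken from the PR text).
ENDCAP_LABELS = ("slimmedElectronsHGC", "slimmedElectronsFromMultiCl")


def pick_endcap_label(event):
    """Return the first endcap label present in the event (hypothetical helper)."""
    for label in ENDCAP_LABELS:
        if label in event:
            return label
    raise KeyError("no HGCal electron collection found in this event")


def merged_electrons(event):
    """Concatenate barrel electrons and HGCal endcap electrons."""
    return list(event[BARREL_LABEL]) + list(event[pick_endcap_label(event)])


# Toy events: one with the newer endcap label, one with the older label.
new_sample = {"slimmedElectrons": ["e1"], "slimmedElectronsHGC": ["e2", "e3"]}
old_sample = {"slimmedElectrons": ["e1"], "slimmedElectronsFromMultiCl": ["e4"]}

print(merged_electrons(new_sample))
print(merged_electrons(old_sample))
```

Listing the labels in priority order keeps the sample-dependent naming in one place instead of scattering it through the code.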

The performance of this new phase-2 DeepTau discriminator is documented in AN-22-090.

The new data-model was merged to cms-data in the following PR: cms-data/RecoTauTag-TrainingFiles#11

This PR makes it possible to run the new version of the DeepTau discriminator in private workflows (to analyse phase-2 samples with 12_5_X), but it does not enable the new version of DeepTau in official workflows. Therefore, differences in outputs are not expected.

To run the standalone test, one can run the configuration file RecoTauTag/RecoTau/test/runDeepTauIDsOnMiniAOD.py after setting the flag phase2 = True in that file.

PR validation:

Validated with the standalone test mentioned above and with miniAOD workflows.

Backport PR

This is a backport of #40622 to 12_6_X to facilitate tau studies for phase-2 with 12_5_X samples.

@cmsbuild
Contributor

cmsbuild commented Feb 8, 2023

A new Pull Request was created by @mbluj for CMSSW_12_6_X.

It involves the following packages:

  • PhysicsTools/PatAlgos (xpog, reconstruction)
  • RecoTauTag/RecoTau (reconstruction)

@cmsbuild, @mandrenguyen, @clacaputo, @swertz, @vlimant can you please review it and eventually sign? Thanks.
@rappoccio, @gouskos, @hatakeyamak, @emilbols, @mbluj, @demuller, @seemasharmafnal, @mmarionncern, @missirol, @ahinzmann, @jdolen, @azotz, @jdamgov, @nhanvtran, @gkasieczka, @schoef, @andrzejnovak, @AlexDeMoor, @AnnikaStein, @JyothsnaKomaragiri, @gpetruc, @mariadalfonso this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@swertz
Contributor

swertz commented Feb 8, 2023

please test

@cmsbuild
Contributor

cmsbuild commented Feb 8, 2023

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-529c7d/30512/summary.html
COMMIT: 46cd11a
CMSSW: CMSSW_12_6_X_2023-02-08-1100/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/40724/30512/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 6 lines from the logs
  • Reco comparison results: 8 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3460357
  • DQMHistoTests: Total failures: 3
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3460332
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 206 log files, 158 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@swertz
Contributor

swertz commented Feb 9, 2023

I'm surprised to not see any of the differences in TauID products for Phase2 MiniAOD workflow comparisons, that were visible (and expected) in the original PR #40622

Do you have an explanation for that @mbluj ?

@mbluj
Contributor Author

mbluj commented Feb 9, 2023

Yes, it is expected. In the backported versions of #40622, all code changes (including the C++ code and the Python tool that generates configurations for tauIDs) are in place, but the new tauID is not enabled in official workflows. This is to avoid breaking the "non-changing" policy for production releases. Therefore, the backports allow users to run the new tauID in their private workflows, e.g. private re-miniAOD or nanoAOD/ntuple production on top of miniAOD, but official workflows are not affected.

@swertz
Copy link
Contributor

swertz commented Feb 9, 2023

Thanks for clarifying! Out of curiosity, where do I see that the tauID is not enabled? They are clearly enabled in master, since we saw the changes in the tests of #40622. And since the code of the backport is the same and enables the new ID for the phase2_common modifier, I don't understand how it is disabled...

@mbluj
Contributor Author

mbluj commented Feb 9, 2023

In both the original PR and this backport, the way phase-2 tauIDs are enabled in PhysicsTools/PatAlgos/python/slimming/miniAOD_tools.py is slightly changed: previously each of them was added separately; now a list of phase-2 tauIDs is defined and then passed to phase2_common.toModify, see here: https://github.com/cms-sw/cmssw/pull/40724/files#diff-c1409af99a0dfee10534eb0c4e985d3b4dd669b5c81cd302efdf760ee0ce5c41L391-L392
The difference between the backport and the original PR comes from the definition of that list: in the original it contains the new phase-2 deepTauID, while in the backport it contains only the older tauIDs, hence no change (i.e. the new ID is disabled). This disabling is introduced in commit 46cd11a.
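
To make the list-based mechanism concrete, here is a minimal plain-Python sketch of the idea, under stated assumptions. `ToyModifier` is a stand-in written for this example and is not the real CMSSW modifier class; the tauID names, the `tauIdsToRun` key and the `configure` helper are placeholders, not the identifiers used in miniAOD_tools.py.

```python
class ToyModifier:
    """Minimal stand-in for an era modifier (NOT the real CMSSW class)."""

    def __init__(self, active):
        self.active = active

    def toModify(self, settings, **changes):
        # Apply the changes only when the era is active.
        if self.active:
            settings.update(changes)


# Placeholder names; the real tauID identifiers are not reproduced here.
OLD_IDS = ["olderTauID_A", "olderTauID_B"]
NEW_ID = "newPhase2DeepTau"


def configure(phase2_ids, phase2_active=True):
    """Return the tauIDs that would run for a given phase-2 list."""
    settings = {"tauIdsToRun": ["defaultTauID"]}
    ToyModifier(phase2_active).toModify(settings, tauIdsToRun=list(phase2_ids))
    return settings["tauIdsToRun"]


# Original PR: the list includes the new ID, so phase-2 output changes.
original_pr = configure(OLD_IDS + [NEW_ID])
# Backport: same code path, but the list holds only the older IDs.
backport = configure(OLD_IDS)

print(original_pr)
print(backport)
```

Because the enabling logic is identical in both cases, the only thing that differs between the two branches is the content of the list, which matches the explanation given in the comment.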

@mandrenguyen
Copy link
Contributor

+1

@swertz
Contributor

swertz commented Feb 9, 2023

+1

Thanks @mbluj I'd overlooked that bit!

@cmsbuild
Contributor

cmsbuild commented Feb 9, 2023

This pull request is fully signed and it will be integrated in one of the next CMSSW_12_6_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_13_0_X is complete. This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@mbluj
Contributor Author

mbluj commented Feb 10, 2023

Just a comment: to make this PR useful, the cms-dist of 12_6_X should also be updated to include the cms-data state in which the data model of the new phase-2 DeepTau is integrated, i.e. with PR cms-data/RecoTauTag-TrainingFiles#11

@perrotta
Contributor

please test with cms-sw/cmsdist#8309

@cmsbuild
Contributor

-1

Failed Tests: RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-529c7d/30577/summary.html
COMMIT: 46cd11a
CMSSW: CMSSW_12_6_X_2023-02-10-1100/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/40724/30577/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals-INPUT

The relvals timed out after 4 hours.

  • 1325.5: 1325.5_ProdZEE_13_reminiaodINPUT+ProdZEE_13_reminiaodINPUT+REMINIAOD_mc2016+HARVESTDR2_REMINIAOD_mc2016/step2_ProdZEE_13_reminiaodINPUT+ProdZEE_13_reminiaodINPUT+REMINIAOD_mc2016+HARVESTDR2_REMINIAOD_mc2016.log

Comparison Summary

Summary:

  • You potentially added 11 lines to the logs
  • Reco comparison results: 13 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3460477
  • DQMHistoTests: Total failures: 12
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3460443
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 206 log files, 158 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@perrotta
Contributor

+1

@perrotta
Contributor

merge

@cmsbuild cmsbuild merged commit 21c5ec4 into cms-sw:CMSSW_12_6_X Feb 10, 2023
@mbluj mbluj deleted the CMSSW_12_6_X_tau-pog_deepTauPh2 branch October 10, 2023 10:01