Code to process phase2 version of deepTauID v2p5 [12_5_X] #40723

Merged

@mbluj mbluj commented Feb 8, 2023

PR description:

This PR adds code for a phase-2 version of DeepTau discriminator; it is a backport of #40622.
The neural network structure of the new phase-2 version is identical to that of Run-2/3 DeepTau v2p5, therefore code modifications are small. The changes include:

  • Adding scaling parameters and working points for the new training;
  • A new "year" identifier is introduced to distinguish the new phase-2 training from the current Run-2/3 v2p5 training (2026 vs 2018);
  • The electron input collection used for the new phase-2 deepTauID is the sum of two collections: slimmedElectrons (gsfElectrons in EB) and slimmedElectronsHGC (HGCal electrons in the endcaps). Notes: 1. The latter collection is called slimmedElectronsFromMultiCl in older CMSSW release series (and in samples produced with them), which can cause problems when running on old samples; 2. The same merged electron collection is used by the old-style anti-electron tau discriminant for phase-2.
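
As a reading aid for the "year" identifier, here is a minimal sketch of how a label such as 'deepTau2026v2p5' splits into a year and a version. The helper names are hypothetical and this is not the CMSSW code; the only facts taken from this PR are the labels themselves and that 2026 marks the phase-2 training while 2018 marks the Run-2/3 v2p5 training.

```python
import re

# Hypothetical helper, not part of CMSSW: splits a DeepTau ID label
# into its "year" identifier and its version string.
_LABEL = re.compile(r"^deepTau(?P<year>\d{4})(?P<version>v\d+p\d+)$")


def split_deeptau_label(label):
    """Return (year, version) for a label like 'deepTau2026v2p5'."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a DeepTau label: {label!r}")
    return int(match.group("year")), match.group("version")


def is_phase2_training(label):
    """Assumption for illustration: year 2026 marks the phase-2 training."""
    year, _ = split_deeptau_label(label)
    return year == 2026


for name in ("deepTau2017v2p1", "deepTau2018v2p5", "deepTau2026v2p5"):
    print(name, split_deeptau_label(name), is_phase2_training(name))
```

The two v2p5 labels share the version string and differ only in the year, which is consistent with the statement that the network structure is unchanged.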

The performance of this new phase-2 DeepTau discriminator is documented in AN-22-090.

The new data-model was merged to cms-data in the following PR: cms-data/RecoTauTag-TrainingFiles#11

This PR allows running the new version of the DeepTau discriminator in private workflows (to analyse phase-2 samples with 12_5_X), but it does not enable the new version in official workflows. Therefore, no differences in outputs are expected.

To run the standalone test, run the configuration file RecoTauTag/RecoTau/test/runDeepTauIDsOnMiniAOD.py after setting the flag phase2 = True in that file.
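
A rough sketch of what such a flag could select is given below. Only the flag name phase2 and the ID labels come from this PR; the function name and the selection logic are assumptions for illustration and are not copied from the test configuration.

```python
def select_tau_ids(phase2):
    """Return the tau ID labels to evaluate for the given flag value.

    Illustrative only: the real test configuration may organise this
    differently.
    """
    if phase2:
        # Phase-2 training added by this PR
        return ["deepTau2026v2p5"]
    # Run-2/3 trainings listed in the PatAlgos hunk of this PR
    return ["deepTau2017v2p1", "deepTau2018v2p5"]


phase2 = True  # the flag named in the PR description
print(select_tau_ids(phase2))
```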

PR validation:

Validated with the standalone test mentioned above and with miniAOD workflows.

Backport PR

This is a backport of #40622 to 12_5_X to facilitate phase-2 tau studies with 12_5_X samples.

@cmsbuild

cmsbuild commented Feb 8, 2023

A new Pull Request was created by @mbluj for CMSSW_12_5_X.

It involves the following packages:

  • PhysicsTools/PatAlgos (xpog, reconstruction)
  • RecoTauTag/RecoTau (reconstruction)

@cmsbuild, @mandrenguyen, @clacaputo, @swertz, @vlimant can you please review it and eventually sign? Thanks.
@rappoccio, @gouskos, @hatakeyamak, @emilbols, @mbluj, @demuller, @seemasharmafnal, @mmarionncern, @missirol, @ahinzmann, @jdolen, @azotz, @jdamgov, @nhanvtran, @gkasieczka, @schoef, @andrzejnovak, @AlexDeMoor, @AnnikaStein, @JyothsnaKomaragiri, @gpetruc, @mariadalfonso this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@mandrenguyen

type tau

@cmsbuild cmsbuild added the tau label Feb 8, 2023
@mandrenguyen

please test

@cmsbuild

cmsbuild commented Feb 8, 2023

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-033c57/30503/summary.html
COMMIT: b588e8c
CMSSW: CMSSW_12_5_X_2023-02-05-0000/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/40723/30503/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 2 lines from the logs
  • Reco comparison results: 20 differences found in the comparisons
  • DQMHistoTests: Total files compared: 51
  • DQMHistoTests: Total histograms compared: 3724047
  • DQMHistoTests: Total failures: 25
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 3723999
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -0.004 KiB( 50 files compared)
  • DQMHistoSizes: changed ( 312.0 ): -0.004 KiB MessageLogger/Warnings
  • Checked 216 log files, 167 edm output root files, 51 DQM output files
  • TriggerResults: no differences found

@swertz

swertz commented Feb 8, 2023

+1

@mandrenguyen

+1

@cmsbuild

cmsbuild commented Feb 9, 2023

This pull request is fully signed and it will be integrated in one of the next CMSSW_12_5_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_13_0_X is complete. This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@mbluj

mbluj commented Feb 10, 2023

Just a comment: for this PR to be useful, cms-dist for 12_5_X should also be updated to pick up a cms-data state that includes the data model of the new phase-2 training, i.e. one with PR cms-data/RecoTauTag-TrainingFiles#11 integrated.

@@ -382,8 +382,8 @@ def _add_deepFlavour(process):
toKeep = ['deepTau2017v2p1','deepTau2018v2p5']
)
from Configuration.Eras.Modifier_phase2_common_cff import phase2_common #Phase2 Tau MVA
phase2_common.toModify(tauIdEmbedder.toKeep, func=lambda t:t.append('newDMPhase2v1')) #Phase2 Tau isolation MVA
phase2_common.toModify(tauIdEmbedder.toKeep, func=lambda t:t.append('againstElePhase2v1')) #Phase2 Tau anti-e MVA
_tauIds_phase2 = ['newDMPhase2v1','againstElePhase2v1']
@mbluj newDMPhase2v1 was not kept in the master version. Just to try to understand: why?

It has quite likely something to do with #40724 (comment), but I fail to find the connection (sorry...)

@mbluj

The idea is to keep only the phase-2 deepTauID ('deepTau2026v2p5') in master and to remove/deprecate the older, less powerful phase-2 tauIDs ('newDMPhase2v1' and 'againstElePhase2v1').
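
For readers unfamiliar with the pattern in the hunk under discussion, the sketch below imitates it with a toy stand-in for an era modifier. ToyModifier and tau_ids_to_keep are hypothetical names and not the FWCore implementation; the assumption is only that toModify applies the given function to the object when the era is active and does nothing otherwise.

```python
class ToyModifier:
    """Toy stand-in for an era modifier; NOT the FWCore implementation."""

    def __init__(self, active):
        self.active = active

    def toModify(self, obj, func):
        # Apply the change only when the era is active
        if self.active:
            func(obj)


def tau_ids_to_keep(phase2_active):
    """Build the list of kept tau IDs the way the 12_5_X hunk does."""
    to_keep = ["deepTau2017v2p1", "deepTau2018v2p5"]
    phase2_common = ToyModifier(phase2_active)
    # Older phase-2 IDs are appended only when the phase-2 era is active
    phase2_common.toModify(to_keep, func=lambda t: t.append("newDMPhase2v1"))
    phase2_common.toModify(to_keep, func=lambda t: t.append("againstElePhase2v1"))
    return to_keep


print(tau_ids_to_keep(False))
print(tau_ids_to_keep(True))
```

In this picture, the 12_5_X backport retains the two appended IDs for phase-2 configurations, whereas the reply above says that master is meant to carry only 'deepTau2026v2p5' for phase-2.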

@perrotta

please test with cms-sw/cmsdist#8308

@cmsbuild

-1

Failed Tests: RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-033c57/30578/summary.html
COMMIT: b588e8c
CMSSW: CMSSW_12_5_X_2023-02-05-0000/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/40723/30578/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals-INPUT

The relvals timed out after 4 hours.

Comparison Summary

Summary:

  • You potentially added 5 lines to the logs
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 51
  • DQMHistoTests: Total histograms compared: 3724047
  • DQMHistoTests: Total failures: 8
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3724017
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 50 files compared)
  • Checked 216 log files, 167 edm output root files, 51 DQM output files
  • TriggerResults: no differences found

@perrotta

+1

@perrotta

merge

@cmsbuild cmsbuild merged commit 1c0198a into cms-sw:CMSSW_12_5_X Feb 10, 2023
@mbluj mbluj deleted the CMSSW_12_5_X_tau-pog_deepTauPh2 branch October 10, 2023 10:01