[PileupJetId, Puppi] Pileup ID input variable fix, puppi weight ValueMap access, optional photon protection for existing puppi weights #40762

Merged: 9 commits into cms-sw:master, Feb 21, 2023

jshin96
Contributor

jshin96 commented Feb 14, 2023

PR description:

This PR has three main modifications:

  1. Pileup Jet Id variable bug fix.

PileupJetIdProducer and PileupJetIdAlgo compute the input variables and the final discriminant of the
pileup jet ID BDT. One of the variable calculations had a bug: when a jet contained no charged constituent,
the very last constituent in the constituent list was assigned as the leading charged constituent.
This PR corrects the error: when there is no charged constituent, the leading charged constituent is
assigned zero pt and large phi and eta.
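
The corrected behaviour can be sketched roughly as follows. This is only an illustration, not the code in PileupJetIdAlgo: the `Constituent` and `LeadingCharged` types, the function name, and the sentinel value 999 for "large phi and eta" are all assumptions made for the example.

```cpp
#include <vector>

// Hypothetical, simplified stand-in for a jet constituent (not a CMSSW type).
struct Constituent {
  float pt;
  float eta;
  float phi;
  int charge;
};

// Result of the search. The defaults encode "no charged constituent":
// zero pt and large eta/phi (999 is an illustrative choice).
struct LeadingCharged {
  bool found = false;
  float pt = 0.f;
  float eta = 999.f;
  float phi = 999.f;
};

// Pick the highest-pt charged constituent. Neutral constituents are skipped,
// so a jet without charged constituents keeps the sentinel defaults instead
// of silently falling back to the last constituent in the list.
LeadingCharged findLeadingCharged(const std::vector<Constituent>& constituents) {
  LeadingCharged lead;
  for (const auto& c : constituents) {
    if (c.charge == 0)
      continue;
    if (!lead.found || c.pt > lead.pt) {
      lead.found = true;
      lead.pt = c.pt;
      lead.eta = c.eta;
      lead.phi = c.phi;
    }
  }
  return lead;
}
```

Calling `findLeadingCharged` on a jet made only of neutral constituents returns `found == false` with the sentinel values, which is the case the bug used to get wrong.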

  2. Correct puppi weight access implementation by ValueMap.

Previously, the puppi weight of each constituent was read from the packedCandidate object in the code. For
compatibility with other code in CMSSW, this weight must instead be accessed through a ValueMap. This PR
implements puppi weight retrieval from a ValueMap, as in PR #40667. As in that PR, the name "puppi weight" is
generalized to "constituent weight". In addition, some parts of PileupJetIdAlgo compared constituent pt
incorrectly (for example, to find the leading charged constituent): a weighted pt was compared with an
unweighted pt. This error is also fixed.

The ValueMap implementation does not affect CHS jets. The fix to the leading charged constituent selection could,
however, affect CHS jets. In more detail: CHS jets generally have more constituents per jet, and a CHS jet with
no charged constituent at all is very rare, so the effect of the fix was insignificant.
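
A minimal sketch of the consistent comparison, under stated assumptions: in CMSSW the weight would be looked up in a ValueMap keyed by the candidate, whereas here a plain `std::vector<float>` parallel to the constituent list plays that role. The `Constituent` type and the function name are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified constituent (not a CMSSW type).
struct Constituent {
  float pt;
  int charge;
};

// Return the index of the leading constituent by weighted pt, or -1 for an
// empty jet. `weights` is assumed to have one entry per constituent and
// stands in for the constituent-weight lookup.
int leadingByWeightedPt(const std::vector<Constituent>& constituents, const std::vector<float>& weights) {
  int best = -1;
  float bestWeightedPt = 0.f;
  for (std::size_t i = 0; i < constituents.size(); ++i) {
    // Both sides of the comparison use weighted pt. Comparing a weighted
    // candidate pt against an unweighted running maximum (or the reverse)
    // is the kind of mismatch described above.
    const float weightedPt = constituents[i].pt * weights[i];
    if (best < 0 || weightedPt > bestWeightedPt) {
      best = static_cast<int>(i);
      bestWeightedPt = weightedPt;
    }
  }
  return best;
}
```

With all weights equal to 1, as one would expect for CHS jets, the result is the same as an unweighted comparison.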

  3. Add an option in PuppiProducer to apply photon protection for existing weights.

Previously, photon protection was also applied to existing puppi weights. This can confuse users who want
PuppiProducer to provide a ValueMap of weights identical to the weights stored in packedPFCandidates in MiniAOD.
This PR adds a flag (default False) that enables photon protection on existing puppi weights.
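
The effect of the flag can be sketched as below. This is an assumption-laden illustration rather than the PuppiProducer code: the struct, its field names, and the function name are hypothetical, and "photon protection" is taken here to mean forcing the weight of a photon above a pt threshold to 1.

```cpp
// Hypothetical configuration; the field names are illustrative only.
struct PhotonProtectionConfig {
  bool useExistingWeights;      // weights are read back from the input candidates
  bool protectExistingWeights;  // stand-in for the new flag, False by default
  float ptMaxPhotons;           // assumed threshold; <= 0 disables protection
};

// Return the weight to store for one candidate.
float finalWeight(float weight, int pdgId, float pt, const PhotonProtectionConfig& cfg) {
  const bool protectedPhoton = pdgId == 22 && cfg.ptMaxPhotons > 0.f && pt > cfg.ptMaxPhotons;
  if (!protectedPhoton)
    return weight;
  // With existing weights and the flag off, the stored weight passes through
  // unchanged, so the output matches the weight in the input candidate.
  if (cfg.useExistingWeights && !cfg.protectExistingWeights)
    return weight;
  return 1.f;
}
```

In this sketch a photon with an existing weight of 0.3 keeps 0.3 when the flag is off and gets 1 only when the flag is switched on.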

PR validation:

  • Validation plots comparing the input variables for Pileup Jet Id training before and after the fix can be found in this JMAR meeting contribution.
  • Passes the usual runTheMatrix test: runTheMatrix.py -l limited -i all --ibeos. Test done by @nurfikri89.
  • Passes reMiniAOD and reNanoAOD workflows: runTheMatrix.py -i all --ibeos -l 1325.518,2500.312. Test done by @nurfikri89.
  • Passes the JMENano workflows: runTheMatrix.py -i all --ibeos -l 10224.15,11024.15,25202.15,11634.15. Test done by @nurfikri89.

@cmsbuild
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40762/34181

  • This PR adds an extra 48KB to repository

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40762/34190

ERROR: Build errors found during clang-tidy run.

RecoJets/JetProducers/plugins/PileupJetIdProducer.cc:72:1: error: version control conflict marker in file [clang-diagnostic-error]
<<<<<<< HEAD
^
Suppressed 2023 warnings (2018 in non-user code, 4 NOLINT, 1 with check filters).
--
gmake: *** [config/SCRAM/GMake/Makefile.coderules:129: code-checks] Error 2
gmake: *** [There are compilation/build errors. Please see the detail log above.] Error 2

@cmsbuild
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40762/34192

  • This PR adds an extra 48KB to repository

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@swertz
Contributor

swertz commented Feb 15, 2023

type jme

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a8a73f/30640/summary.html
COMMIT: 283d569
CMSSW: CMSSW_13_1_X_2023-02-14-2300/el8_amd64_gcc11
Additional Tests: NANO
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/40762/30640/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 16 lines to the logs
  • Reco comparison results: 32 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3556272
  • DQMHistoTests: Total failures: 37
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3556213
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 213 log files, 164 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

NANO Comparison Summary

Summary:

  • You potentially added 4 lines to the logs
  • Reco comparison results: 12 differences found in the comparisons
  • DQMHistoTests: Total files compared: 11
  • DQMHistoTests: Total histograms compared: 10829
  • DQMHistoTests: Total failures: 39
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 10790
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 10 files compared)
  • Checked 23 log files, 10 edm output root files, 11 DQM output files

Nano size comparison Summary:

Sample    kb/ev  ref kb/ev  diff kb/ev        ev/s/thd  ref ev/s/thd  diff rate  mem/thd  ref mem/thd
2500.31   2.233  2.232      0.000 ( +0.0% )   9.59      9.44          +1.5%      1.479    1.521
2500.311  2.323  2.323      0.000 ( +0.0% )   9.23      9.35          -1.2%      1.848    1.887
2500.312  2.277  2.277      -0.000 ( -0.0% )  9.28      9.34          -0.6%      1.839    1.876
2500.33   1.099  1.100      -0.000 ( -0.0% )  21.92     22.35         -1.9%      1.652    1.651
2500.331  1.394  1.394      0.000 ( +0.0% )   16.09     16.24         -0.9%      1.799    1.797
2500.332  1.326  1.326      0.000 ( +0.0% )   17.70     18.01         -1.7%      1.863    1.861
2500.401  2.139  2.139      0.000 ( +0.0% )   10.35     10.59         -2.2%      1.167    1.212
2500.501  1.711  1.711      0.000 ( +0.0% )   16.47     16.58         -0.7%      1.086    1.121
2500.511  1.124  1.124      0.000 ( +0.0% )   30.44     30.76         -1.0%      1.363    1.370
2500.601  2.050  2.050      0.000 ( +0.0% )   12.55     12.68         -1.0%      1.145    1.183

@swertz
Contributor

swertz commented Feb 16, 2023

type jetmet

@swertz
Contributor

swertz commented Feb 16, 2023

Thanks @jshin96 ,

About the bugfix you say:

In further detail, CHS jets have more constituents in each jet in general and it is very rare for CHS jets to have no charged constituent at all, so the effect of such fix was insignificant.

But looking at the comparison plots for Run2 samples (CHS), there are rather significant differences in the puID for forward jets, which would make sense, since even for CHS these jets wouldn't have many charged constituents. See e.g.:

image

Isn't this something we should worry about for current Run2 UL samples? Or would the effect on analysis be negligible because it would be the same in both data and MC?

@nurfikri89
Contributor

nurfikri89 commented Feb 16, 2023

Hi @swertz,

(Let me answer on behalf of @jshin96 )

Isn't this something we should worry about for current Run2 UL samples? Or would the effect on analysis be negligible because it would be the same in both data and MC?

I would not worry about it, because the trainings were derived with the buggy calculation in place, and the same calculation is used when the discriminant is evaluated. Also, it's the same for both data and MC, as you mentioned. In any case, if there is a re-nano or even a re-mini of the Run-2 UL samples, we should definitely re-train (in the event we are still sticking to CHS jets). This is the plan anyway for when we have to switch to Puppi jets for AK4 jets.

@swertz
Contributor

swertz commented Feb 16, 2023

+1

Thanks for the clarification @nurfikri89 . @jshin96, can you please prepare a backport to 13_0_X?

@clacaputo
Contributor

+reconstruction

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Contributor

+1

  • Changes in Jet outputs have been approved by the reviewers

@cmsbuild cmsbuild merged commit 918a7fc into cms-sw:master Feb 21, 2023
cmsbuild added a commit that referenced this pull request Feb 21, 2023
[PileupJetId, Puppi] Backport of #40762 (Pileup ID input variable fix, puppi weight ValueMap access, optional photon protection for existing puppi weights) to CMSSW_13_0_X
@@ -491,7 +522,7 @@ PileupJetIdentifier PileupJetIdAlgo::computeIdVariables(const reco::Jet* jet,
       double dZ0 = 9999.;
       double dZ_tmp = 9999.;
       for (unsigned vtx_i = 0; vtx_i < allvtx.size(); vtx_i++) {
-        const auto& iv = allvtx[vtx_i];
+        auto iv = allvtx[vtx_i];
Contributor

Comparing the 12_6 backport with this PR, I notice that this update was not backported.
While I agree about not backporting it, I wonder why it was even included in the master PR: should it get reverted at some point?
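
For context on the two spellings in the hunk: `auto iv = ...` copies the element on every loop iteration, while `const auto& iv = ...` binds a reference to it. The sketch below only illustrates that language-level difference. `CountingVertex` and both functions are hypothetical and unrelated to the vertex type used in CMSSW.

```cpp
#include <vector>

// Hypothetical vertex-like type that counts how often it is copied.
struct CountingVertex {
  static int copies;
  double z = 0.;
  CountingVertex() = default;
  explicit CountingVertex(double zz) : z(zz) {}
  CountingVertex(const CountingVertex& other) : z(other.z) { ++copies; }
};
int CountingVertex::copies = 0;

// `auto iv` makes a copy of the element on every iteration.
double sumZByValue(const std::vector<CountingVertex>& vertices) {
  double sum = 0.;
  for (unsigned i = 0; i < vertices.size(); i++) {
    auto iv = vertices[i];
    sum += iv.z;
  }
  return sum;
}

// `const auto& iv` refers to the element in place, with no copy.
double sumZByReference(const std::vector<CountingVertex>& vertices) {
  double sum = 0.;
  for (unsigned i = 0; i < vertices.size(); i++) {
    const auto& iv = vertices[i];
    sum += iv.z;
  }
  return sum;
}
```

Both functions return the same sum. Only the by-value loop increments `CountingVertex::copies`, once per element.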

cmsbuild added a commit that referenced this pull request Jun 6, 2023
[PileupJetId, Puppi] Backport of #40762 (Pileup ID input variable fix, puppi weight ValueMap access, optional photon protection for existing puppi weights) to CMSSW_12_6_X
6 participants