Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disabled PCA inputs for deepTau v2 #27878

Merged

Conversation

swozniewski
Copy link
Contributor

PR description:

Problems have been detected with dxy_PCA input variables to DeepTauID which are inconsistent between data and MC. This PR sets input values to 0 and disables the variables this way.
Issue and fix were discussed in https://indico.cern.ch/event/842796/contributions/3541592/attachments/1897386/3130770/2019-08-26_DeepTau_ID_2017v2_DY_MC_vs_Embedded_and_bug_in_definition_of_PCA_variables.pdf (see slide 7 in particular)

Backport to 10_2_X for legacy analyses intended.

PR validation:

compiles

@cmsbuild
Copy link
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-27878/11650

  • This PR adds an extra 28KB to repository

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Copy link
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-27878/11652

  • This PR adds an extra 32KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @swozniewski for master.

It involves the following packages:

RecoTauTag/RecoTau

@perrotta, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks.
@davidlange6, @slava77, @fabiocos you are the release manager for this.

cms-bot commands are listed here

@perrotta
Copy link
Contributor

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Aug 27, 2019

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/2208/console Started: 2019/08/27 11:46

@perrotta
Copy link
Contributor

perrotta commented Sep 2, 2019

@swozniewski @mbluj, in view of the possible discussion at the ORP, could you please let us know which are the plans for this PR and its backports? I.e. are they planned for a given re-miniAOD campaign, or which is the urgency for having them ready for analyses?

Moreover, do you have updated plots as in page 2 of https://indico.cern.ch/event/842796/contributions/3541592/attachments/1897386/3130770/2019-08-26_DeepTau_ID_2017v2_DY_MC_vs_Embedded_and_bug_in_definition_of_PCA_variables.pdf made with the updated id as in this PR?

@peruzzim
Copy link
Contributor

peruzzim commented Sep 2, 2019

Let me comment here, yes, we will want this backported to 10_2 and 10_6 because it's the version that should replace v2 for Run2 data analysis. And yes, it will be included in a future MINIAOD run. We can discuss at the ORP how to handle this (I guess with the usual run2_miniAOD_devel), probably we will want to switch it on before UL18 starts.

@cmsbuild
Copy link
Contributor

cmsbuild commented Sep 2, 2019

@cmsbuild
Copy link
Contributor

cmsbuild commented Sep 2, 2019

Comparison job queued.

@mbluj
Copy link
Contributor

mbluj commented Sep 2, 2019

@swozniewski @mbluj, in view of the possible discussion at the ORP, could you please let us know which are the plans for this PR and its backports? I.e. are they planned for a given re-miniAOD campaign, or which is the urgency for having them ready for analyses?

Backports of this PR are already prepared:

Moreover, do you have updated plots as in page 2 of https://indico.cern.ch/event/842796/contributions/3541592/attachments/1897386/3130770/2019-08-26_DeepTau_ID_2017v2_DY_MC_vs_Embedded_and_bug_in_definition_of_PCA_variables.pdf made with the updated id as in this PR?

To redo those plots a number of samples has to be processed. Such processing with 102X is well advanced, plots can be expected at the end of this week or beginning of next one.

@cmsbuild
Copy link
Contributor

cmsbuild commented Sep 2, 2019

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-87bf54/2304/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 110 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 2955688
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2955346
  • DQMHistoTests: Total skipped: 341
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.047 KiB( 33 files compared)
  • DQMHistoSizes: changed ( 1325.7 ): 0.047 KiB Physics/NanoAODDQM
  • Checked 145 log files, 15 edm output root files, 34 DQM output files

@perrotta
Copy link
Contributor

perrotta commented Sep 2, 2019

+1

  • From the point of view of the code this PR implements what announced in the PR description: the new deep tau id version v2p1 disables (i.e. sets to 0) the PCA inputs.
  • v2p1 replaces the previous v2 everywhere, and the change is propagated wherever needed for mini and nano AOD productions and DQM in cmssw
  • Jenkins tests show differences in the tau id in miniAOD, as expected
  • Some global comparison data/MC is announced to arrive in short, but the results shown in the presentation linked from the PR description suggest that by disabling the PCA inputs the ids move in the right direction

@peruzzim
Copy link
Contributor

peruzzim commented Sep 3, 2019

+1

@santocch
Copy link

santocch commented Sep 3, 2019

+1

@cmsbuild
Copy link
Contributor

cmsbuild commented Sep 3, 2019

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)

@fabiocos
Copy link
Contributor

fabiocos commented Sep 3, 2019

+1

@cmsbuild cmsbuild merged commit 97029ee into cms-sw:master Sep 3, 2019
@mbluj
Copy link
Contributor

mbluj commented Sep 16, 2019

For records: a set of control plots obtained with the fixed DeepTau 2017v2p1, i.e. with PCA inputs disabled, can be found in https://indico.cern.ch/event/847915/contributions/3565600/attachments/1908653/3153065/TauPOG_DeepTauIDv2p1.pdf (for instance see sl. 4-5 and 12-14). The fix improves data/MC agreement as expected for whole Run-2, in particular ~30% disagreement for 2017 is solved.

@swozniewski swozniewski deleted the CMSSW_11_0_X_tau-pog_deepTauVetoPCA branch June 19, 2020 13:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants