Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First version of DeepMET #29764

Merged
merged 6 commits into from
May 19, 2020
Merged

First version of DeepMET #29764

merged 6 commits into from
May 19, 2020

Conversation

steggema
Copy link
Contributor

@steggema steggema commented May 7, 2020

PR description:

This PR introduces a producer for DeepMET, a deep-learning-based missing pT estimator. The producer creates a new MET collection, and a test configuration is included. The plan is to create a separate PR for possible inclusion in central sequences, e.g. for the upcoming ReMiniAOD campaign.

The tensorflow models for this PR are proposed in a separate PR to cms-data/RecoMET-METPUSubtraction cms-data/RecoMET-METPUSubtraction#5

There are different trainings for different years/conditions (2016, 2018, phase 2), and for 2018 and phase 2 also non-response-corrected trainings.

Presentations:
https://indico.cern.ch/event/912067/contributions/3835851/ (most recent update)
https://indico.cern.ch/event/883809/contributions/3733818/ (CMS week JetMET meeting)
https://indico.cern.ch/event/854654/contributions/3594579/ (first presentation in MET meeting)

Note that an alternative implementation would be to store additional weights for each PFCandidate and calculate MET and possibly jets in a subsequent step. However, we prefer to leave this option to future studies given that we have not checked the performance on jets (and assume some non-trivial effects) and that this would lead to an increase in complexity of the integration and the additional storage required, in particular if we want to have different METs (e.g. response and non-response-corrected) for a single campaigns.

PR validation:

The code has been validated by running it in large-scale checks with simulated and data events to evaluate the performance of the algorithm. A test configuration is included. We have not run any memory or timing tests but we suspect that it runs fast and consumes little memory given the models are small compared to most other tensorflow models.

CPU and memory reports can be found under the following links, obtained on 1000 events from the 136.8311_RunJetHT2017F workflow:
https://steggema.web.cern.ch/steggema/cgi-bin/igprof-navigator/deepmet_cpu_reminiaod
https://steggema.web.cern.ch/steggema/cgi-bin/igprof-navigator/deepmet_mem_reminiaod (note that DeepMET does not seem to appear here, supposedly because it's not in the top 1000, see the text file linked below)

Text dumps are also available here: /afs/cern.ch/user/s/steggema/public/DeepMETIntegration/

if this PR is a backport please specify the original PR and why you need to backport that PR:

@intrepid42 @yongbinfeng

@cmsbuild
Copy link
Contributor

cmsbuild commented May 7, 2020

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

cmsbuild commented May 7, 2020

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-29764/15211

  • This PR adds an extra 12KB to repository

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@slava77
Copy link
Contributor

slava77 commented May 7, 2020

please provide some estimates of CPU time and memory use.

Other than just enabling the code in mini/nanoAOD, what else needs to happen (more developments, retraining)?
If nothing else, I think it would be better to add this now at least to the miniAOD step.

@steggema
Copy link
Contributor Author

steggema commented May 8, 2020

please provide some estimates of CPU time and memory use.

Will be added.

Other than just enabling the code in mini/nanoAOD, what else needs to happen (more developments, retraining)?
If nothing else, I think it would be better to add this now at least to the miniAOD step.

We don't need and foresee any additional developments or training at the moment (except for uncertainty estimates, which are however a completely different matter both content- and code-wise), so we'll add this to the MiniAOD step if you prefer to have this included in this PR.

Given other commitments, I expect an update by sometime next week.

@slava77
Copy link
Contributor

slava77 commented May 8, 2020

Given other commitments, I expect an update by sometime next week.

OK
Please note

Code check has found code style and quality issues which could be resolved by applying following patch(s)

the PR tests can not proceed without this addressed

@cmsbuild
Copy link
Contributor

cmsbuild commented May 11, 2020

The tests are being triggered in jenkins.
Tested with other pull request(s) cms-data/RecoMET-METPUSubtraction#5
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/6220/console Started: 2020/05/11 15:31

@cmsbuild
Copy link
Contributor

+1
Tested at: fc76ca5
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-d3fcda/6220/summary.html
CMSSW: CMSSW_11_1_X_2020-05-11-1100
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

+1
Tested at: 74e15da
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-f1bf30/6391/summary.html
CMSSW: CMSSW_11_1_X_2020-05-18-1100
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-f1bf30/6391/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 50 differences found in the comparisons
  • DQMHistoTests: Total files compared: 35
  • DQMHistoTests: Total histograms compared: 2694466
  • DQMHistoTests: Total failures: 3
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2694414
  • DQMHistoTests: Total skipped: 49
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 34 files compared)
  • Checked 150 log files, 16 edm output root files, 35 DQM output files

@slava77
Copy link
Contributor

slava77 commented May 18, 2020

Still, considering that the implementation really does not return anything more than px,py, would it be more practical to insert the DeepMETs in slimmedMETsPuppi or even slimmedMETs in the same way how Calo and CHS METs are done?

I had in mind this

maybeReadShifts(iConfig, "caloMET", pat::MET::Calo);
maybeReadShifts(iConfig, "chsMET", pat::MET::Chs);
maybeReadShifts(iConfig, "trkMET", pat::MET::Trk);

I've updated using this solution. Note that the PATMETSlimmer is apparently also always re-run in NanoAOD, so I switched adding the DeepMET variants off by default and only add them now in regular MiniAOD processing after a quick exchange with @peruzzim . Adding the DeepMET px and py values to NanoAOD will require a bit of fiddling with the according sequence in Nano.

@ahinzmann @lathomas please clarify if this works OK for JME.
Thank you.

@ahinzmann
Copy link
Contributor

Still, considering that the implementation really does not return anything more than px,py, would it be more practical to insert the DeepMETs in slimmedMETsPuppi or even slimmedMETs in the same way how Calo and CHS METs are done?

I had in mind this

maybeReadShifts(iConfig, "caloMET", pat::MET::Calo);
maybeReadShifts(iConfig, "chsMET", pat::MET::Chs);
maybeReadShifts(iConfig, "trkMET", pat::MET::Trk);

I've updated using this solution. Note that the PATMETSlimmer is apparently also always re-run in NanoAOD, so I switched adding the DeepMET variants off by default and only add them now in regular MiniAOD processing after a quick exchange with @peruzzim . Adding the DeepMET px and py values to NanoAOD will require a bit of fiddling with the according sequence in Nano.

@ahinzmann @lathomas please clarify if this works OK for JME.
Thank you.

Yes, that's fine. As long as the fiddling with the sequence for JME-extended-Nano is doable based on this PR for MiniAOD that's fine.

@slava77
Copy link
Contributor

slava77 commented May 18, 2020

+1

for #29764 74e15da

  • code changes are in line with the PR description and the follow up review
  • jenkins tests pass and comparisons with the baseline show differences only in the miniAOD workflows in the "corrections" of the slimmedMETs and slimmedMETsPuppi (similar to the way of storing trk, chs, and calo METs)

@silviodonato
Copy link
Contributor

merge
@santocch

@santocch
Copy link

+1

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants