Add function to refine FastSim DeepJet discriminators #40553
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40553/33785
A new Pull Request was created by @wolfmor for master. It involves the following packages:
@cmsbuild, @mandrenguyen, @clacaputo, @swertz, @vlimant can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
type btv |
please test |
+1 Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-19f146/30061/summary.html |
@wolfmor Would it be worth adding the driver commands you list in the PR description as a RelVal workflow? In particular, as you indicate that further developments are coming. |
@mandrenguyen we're working on exactly that |
…rom-CMSSW_13_0_X_2023-01-17-1100 add test workflow
@cms-sw/pdmv-l2 @cms-sw/upgrade-l2 please check and sign? workflow changes are hopefully straightforward, let me know if you have any concerns. |
+Upgrade |
+pdmv |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2) |
+1 |
Hi @srimanob, yes, ideally this should be backported so that the refinement can be used in Run 3. My plan is to do the backport to 12_4 once the backport to UL is merged (#40828 (comment)). This will probably take another week. |
Thanks @sbein |
Since the refinement runs at Nano level, for Run3 samples why can't you simply use the imminent NanoV12 campaigns in 13_0 that will take 12_4 MINI samples as input? FYI @simonepigazzini |
Hi @swertz https://cms-pdmv.cern.ch/mcm/campaigns?prepid=Run3*22*Nano*&page=-1&shown=16447 |
Hi @srimanob, so much is clear, I was also referring to Summer22 MC. Don't forget that the current Summer22 and Nano v10/11 are not "complete" in the sense that many jet-related ingredients (PUPPI tune, taggers) have not been updated yet. NanoV12 will be run on Summer22 MC and will contain all the recommended ingredients for analysis of Run3 data. I can see why you'd want some FastSim MC in 12_6/NanoV11 to be able to quickly implement Fast/Full comparisons with the existing samples, but just keep in mind that for physics results, in most cases you'll need to use NanoV12 anyway. Another point: this PR only implements refinement for taggers in CHS jets (which makes sense for the Run2 UL backport), but Run3 samples contain PUPPI jets... |
PR description:
Requires: cms-data/PhysicsTools-NanoAOD#14
This PR adds a function that uses a regression neural network to refine the DeepJet discriminators of CHS jets in NanoAOD for FastSim to better match FullSim. The function can be called by including the option
--customise PhysicsTools/NanoAOD/jetsAK4_CHS_cff.nanoAOD_refineFastSim_bTagDeepFlav
in the cmsDriver command and requires the ONNX model added in the above-mentioned PR to cms-data. The original values are copied to new variables named with the suffix "unrefined". Due to a bug in ONNX runtime 1.10.0 (see here), graph optimization has to be disabled to evaluate the model. The corresponding option is implemented in BaseMVAValueMapProducer for the ONNX backend.
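The bookkeeping described above (keep the original value under a name with the suffix "unrefined", then overwrite the discriminator with the network output) can be sketched in plain Python. This is only an illustration, not the code in this PR: the branch names in `DISCRIMINATORS`, the `refine_jet` helper, and the stand-in `toy_refine` callable are assumptions made for the example, and the real network inputs and outputs are defined by the ONNX model in cms-data.

```python
from typing import Callable, Dict, List, Sequence

# Assumed (illustrative) discriminator names; the real list lives in the PR.
DISCRIMINATORS: List[str] = [
    "btagDeepFlavB",
    "btagDeepFlavCvB",
    "btagDeepFlavCvL",
    "btagDeepFlavQG",
]


def refine_jet(
    jet: Dict[str, float],
    refine: Callable[[Sequence[float]], Sequence[float]],
    names: Sequence[str] = DISCRIMINATORS,
    suffix: str = "unrefined",
) -> Dict[str, float]:
    """Return a copy of `jet` with refined discriminators.

    The original value of each discriminator is kept under `name + suffix`;
    `refine` is a stand-in for the regression network (hypothetical interface).
    """
    original = [jet[name] for name in names]
    refined = list(refine(original))
    if len(refined) != len(original):
        raise ValueError("refine() must return one value per discriminator")

    out = dict(jet)  # do not modify the input jet in place
    for name, old, new in zip(names, original, refined):
        out[name + suffix] = old  # copy of the FastSim value
        out[name] = new           # refined value replaces the original
    return out


def toy_refine(values: Sequence[float]) -> List[float]:
    """Toy replacement for the network: halves every input value."""
    return [0.5 * v for v in values]


if __name__ == "__main__":
    jet = {"pt": 50.0, "btagDeepFlavB": 0.8, "btagDeepFlavCvB": 0.2,
           "btagDeepFlavCvL": 0.4, "btagDeepFlavQG": 0.6}
    print(refine_jet(jet, toy_refine))
```

Keeping the unrefined copies alongside the refined values makes it possible to compare the two directly in the same NanoAOD file.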
The technique has been presented at the FastSim Days 2022 Workshop. There are plans to make this the default for FastSim in the future and possibly to extend to further collections/variables.
A complete set of commands to produce NanoAOD files with refined DeepJet discriminators is:
cmsDriver.py TTbar_13TeV_TuneCUETP8M1_cfi --relval 100000,1000 -s GEN,SIM,RECOBEFMIX,DIGI:pdigi_valid,L1,DIGI2RAW,L1Reco,RECO,VALIDATION:@standardValidation,DQM:@standardDQMFS -n 10 --conditions auto:run2_mc --beamspot Realistic25ns13TeV2016Collision --datatier GEN-SIM-DIGI-RECO,DQMIO --eventcontent FEVTDEBUGHLT,DQM --fast --era Run2_2016
cmsDriver.py step3 -s PAT --era Run2_2016 -n -1 --conditions auto:run2_mc --mc --datatier MINIAODSIM --eventcontent MINIAODSIM --filein file:TTbar_13TeV_TuneCUETP8M1_cfi_GEN_SIM_RECOBEFMIX_DIGI_L1_DIGI2RAW_L1Reco_RECO_VALIDATION_DQM.root --fast
cmsDriver.py --python_filename NanoAODrefined_cfg.py --eventcontent NANOAODSIM --fast --customise Configuration/DataProcessing/Utils.addMonitoring,PhysicsTools/NanoAOD/jetsAK4_CHS_cff.nanoAOD_refineFastSim_bTagDeepFlav --datatier NANOAODSIM --fileout file:step3_NANO.root --conditions auto:run2_mc --step NANO --filein "file:step3_PAT.root" --era run2_nanoAOD_106Xv2 --mc -n -1
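As the NANO command above shows, several customise functions are passed to `--customise` as one comma-separated value. A small sketch of how the refinement function could be appended to an existing cmsDriver argument list follows; the `add_customise` helper is hypothetical and not part of cmsDriver, and it assumes `--customise` is given as a separate token followed by its value.

```python
import shlex
from typing import List

# Customise function introduced by this PR (taken from the PR description).
REFINE = "PhysicsTools/NanoAOD/jetsAK4_CHS_cff.nanoAOD_refineFastSim_bTagDeepFlav"


def add_customise(args: List[str], function: str) -> List[str]:
    """Return a copy of `args` with `function` added to the --customise list."""
    out = list(args)
    if "--customise" in out:
        i = out.index("--customise")
        if i + 1 >= len(out):
            raise ValueError("--customise is missing its value")
        existing = out[i + 1].split(",")
        if function not in existing:  # avoid adding the same function twice
            existing.append(function)
        out[i + 1] = ",".join(existing)
    else:
        out += ["--customise", function]
    return out


if __name__ == "__main__":
    command = (
        "cmsDriver.py --step NANO --fast --mc "
        "--customise Configuration/DataProcessing/Utils.addMonitoring"
    )
    new_args = add_customise(shlex.split(command), REFINE)
    print(" ".join(shlex.quote(a) for a in new_args))
```

The helper only edits the argument list; running the resulting command still requires a CMSSW environment that contains this PR and the ONNX model from cms-data.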
PR validation:
The neural network has been trained on GEN-synchronized FastSim/FullSim jet pairs from SUSY simplified model T1tttt events and has also been validated in TTbar events. In both cases, considerably improved agreement with the FullSim output and improved correlations among output observables and external parameters are seen.
If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:
Needs to be backported to 12_6.
@sbein @kpedro88