cms-cat · ttedeschi · Jan 11, 2024 · Nov 21, 2023 · Nov 22, 2023 · Nov 22, 2023
diff --git a/README.md b/README.md
@@ -1 +1,117 @@
-# nanoAOD-tools-modules
+# Modules for `nanoAOD-tools`
+
+This repo provides some modules that can be used with the
+[`nanoAOD-tools`](https://github.com/cms-nanoAOD/nanoAOD-tools)
+framework for post-processing CMS's nanoAOD files.
+
+The modules add new branches like lepton scale factors, using the
+[`correctionlib`](https://github.com/cms-nanoAOD/correctionlib)
+tool, and official
+[JSON correction files](https://gitlab.cern.ch/cms-nanoAOD/jsonpog-integration)
+provided by the CMS POGs.
+
+
+## Installation
+
+### Install CMSSW
+First, install your favorite CMSSW release.
+If you rely on `correctionlib` and `python3`,
+it is strongly recommended to use CMSSW 11.3 or newer.
+For example,
+<table>
+<tr>
+<td>
+CMSSW 11.3.4 (for <a href="http://cms-analysis.github.io/HiggsAnalysis-CombinedLimit/">combine v9</a>)
+</td>
+<td>
+CMSSW 12.4.8
+</td>
+</tr>
+<tr>
+<td>
+
+```bash
+export SCRAM_ARCH=slc7_amd64_gcc900
+cmsrel CMSSW_11_3_4
+cd CMSSW_11_3_4/src
+cmsenv
+```
+</td>
+<td>
+
+```bash
+export SCRAM_ARCH=slc7_amd64_gcc10
+cmsrel CMSSW_12_4_8
+cd CMSSW_12_4_8/src/
+cmsenv
+```
+</td>
+</tr>
+</table>
+
+
+### Install `nanoAOD-tools` (for CMSSW 13.2 or older)
+Install [`nanoAOD-tools`](https://github.com/cms-nanoAOD/nanoAOD-tools) to process nanoAOD files.
+Note that starting from CMSSW 13.3, a basic version of `nanoAOD-tools` is included.
+To install the standalone version, please do
+```bash
+cd $CMSSW_BASE/src/
+git clone https://github.com/cms-nanoAOD/nanoAOD-tools.git PhysicsTools/NanoAODTools
+```
+
+### Install `NATModules`
+Now install this repository as `PhysicsTools/NATModules` and compile (build) everything:
+```bash
+cd $CMSSW_BASE/src/
+cmsenv
+git clone git@github.com:cms-cat/nanoAOD-tools-modules.git PhysicsTools/NATModules
+scram b
+```
+
+### Install `correctionlib` (for CMSSW 11.2 or older)
+Starting from CMSSW 11.3,
+[`correctionlib`](https://github.com/cms-nanoAOD/correctionlib)
+should come pre-installed.
+To install it yourself for older CMSSW versions,
+please have a look at the [documentation](https://cms-nanoaod.github.io/correctionlib/).
+Note that `correctionlib` works best for `python3`.
+
+### Install `correctionlib` data
+Finally, install the `correctionlib` JSON files provided by CMS
+into `PhysicsTools/NATModules/data` from the
+[`cms-nanoAOD/jsonpog-integration` repository](https://gitlab.cern.ch/cms-nanoAOD/jsonpog-integration)
+on GitLab:
+```bash
+cd $CMSSW_BASE/src/PhysicsTools/NATModules
+git clone ssh://git@gitlab.cern.ch:7999/cms-nanoAOD/jsonpog-integration.git data
+```
+or clone via Kerberos, where `$USER` is your CERN lxplus name:
+```bash
+kinit $USER@CERN.CH
+git clone https://$USER:@gitlab.cern.ch:8443/cms-nanoAOD/jsonpog-integration.git data
+```
+Alternatively, the `jsonpog-integration` repository is regularly synchronized to `/cvmfs/`,
+so if your system has access, you can copy the latest version:
+```bash
+cd $CMSSW_BASE/src/PhysicsTools/NATModules
+cp -r /cvmfs/cms.cern.ch/rsync/cms-nanoAOD/jsonpog-integration data
+```
+
+
+### Test the installation
+Run a module, for example
+```bash
+cd $CMSSW_BASE/src/PhysicsTools/NATModules
+python3 ./test/example_muonSF.py -i root://cms-xrd-global.cern.ch//store/mc/RunIISummer20UL16NanoAODv9/DYJetsToLL_M-50_TuneCP5_13TeV-amcatnloFXFX-pythia8/NANOAODSIM/20UL16JMENano_106X_mcRun2_asymptotic_v17-v1/2820000/11061525-9BB6-F441-9C12-4489135219B7.root
+```
+
+
+## Usage
+
+To use the modules in your own analysis, take the standalone scripts in [`test`](test/) as an example.
+If you compiled this package correctly, you can import the modules in
+[`PhysicsTools/NATModules/python/modules`](python/modules)
+as
+```python
+from PhysicsTools.NATModules.modules.muonSF import *
+```
diff --git a/python/modules/electronSF.py b/python/modules/electronSF.py
@@ -0,0 +1,80 @@
+###
+# Compute electron SFs using correctionlib, and store in new branch.
+# Load as
+#  eleSF = ElectronSF("POG/EGM/2016postVFP_UL/electron.json.gz")
+#  eleSF.addCorrection("UL-Electron-ID-SF", "2016postVFP", "Medium", "sf")
+#  eleSF.addCorrection("UL-Electron-ID-SF", "2016postVFP", "Medium", "sfdown", "sfdn")
+#  eleSF.addCorrection("UL-Electron-ID-SF", "2016postVFP", "Medium", "sfup", "sfup")
+###
+from __future__ import print_function
+from PhysicsTools.NanoAODTools.postprocessing.framework.eventloop import Module
+from correctionlib import CorrectionSet
+
+class ElectronSF(Module):
+    def __init__(self, json, collection="Electron"):
+        """Electron SF correction module.
+        Parameters:
+            json: the correction file
+            collection: name of the collection to be corrected
+        Use addCorrection() to set which factors should be added.
+        """
+        self.collection = collection
+        self.names = [ ]
+        self.scenarios = [ ]
+        self.wps = [ ]
+        self.valtypes = [ ]
+        self.varnames = [ ]
+        self.evaluators = [ ]
+        self.evaluator = CorrectionSet.from_file(json)
+
+    def addCorrection(self, name, scenario, wp, valtype, varname=None):
+        """
+        Call this method to add a correction factor.
+        Parameters:
+            name: name of the corrections, e.g. 'UL-Electron-ID-SF'
+            scenario: year/scenario, e.g. '2016postVFP'
+            wp: working point
+                - string, e.g. 'Loose', 'Medium', 'wp80iso', ...
+                - function of pT, e.g. lambda pt: 'RecoAbove20' if pt>=20 else 'RecoBelow20'
+            valtype: type of factor, e.g. 'sf', 'sfup', 'sfdown', ...
+            varname: branch name suffix (defaults to {wp}_{valtype})
+        """
+        if varname is None: # default suffix
+            assert isinstance(wp,str), "Please use the varname option to name the branch"
+            varname = wp+'_'+valtype
+        self.names.append(name)
+        self.scenarios.append(scenario)
+        self.valtypes.append(valtype)
+        self.wps.append(wp)
+        self.varnames.append(f"{self.collection}_{varname}") # branch name
+        self.evaluators.append(self.evaluator[name])
+
+    def beginFile(self, inputFile, outputFile, inputTree, wrappedOutputTree):
+        """Add branch for every correction to output file."""
+        self.out = wrappedOutputTree        
+        for varname in set(self.varnames): # avoid duplicates
+            self.out.branch(varname, 'F', lenVar='nElectron')
+
+    def analyze(self, event):
+        pts = [max(10.001,event.Electron_pt[i]) for i in range(event.nElectron)]
+        etas = [event.Electron_eta[i] for i in range(event.nElectron)]
+        for ic in range(len(self.evaluators)):
+            # We cannot make a single call to evaluate passing eta, pt as arrays
+            # since POG JSONS are currently provided with flow="error", so we
+            # have to loop to protect for values out of binning range.
+            sfs = [1.]*event.nElectron
+            for iEle in range(event.nElectron):
+                try:
+                    if isinstance(self.wps[ic],str): # WP is a simple string
+                        wp = self.wps[ic]
+                    else: # assume WP is a function of pT
+                        wp = self.wps[ic](pts[iEle]) # evaluate WP in pT
+                        if wp is None: # evaluate correction only if WP is defined
+                            continue
+                    sfs[iEle] = self.evaluators[ic].evaluate(
+                        self.scenarios[ic], self.valtypes[ic], wp, etas[iEle], pts[iEle])
+                except Exception: # e.g. input outside the binning range
+                    print(f"ElectronSF.analyze: Exception for {self.scenarios[ic]}, {self.valtypes[ic]}, wp={self.wps[ic]}, eta={etas[iEle]:6.4f}, pt={pts[iEle]:6.4f}")
+                    # keep the default sf = 1
+            self.out.fillBranch(self.varnames[ic], sfs)
+        return True
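The working-point handling and branch naming in `ElectronSF.addCorrection` can be tried without CMSSW. The following is a minimal sketch, not the module itself: `resolve_wp` and `branch_name` are hypothetical helper names that mirror the logic above, and the working-point strings are the ones quoted in the docstring.

```python
# Standalone sketch of the working-point and branch-name logic in ElectronSF.
# resolve_wp and branch_name are hypothetical helpers, not part of the module.

def resolve_wp(wp, pt):
    """Return the working point for one electron: wp is a string or a function of pT."""
    return wp if isinstance(wp, str) else wp(pt)

def branch_name(collection, wp, valtype, varname=None):
    """Build the output branch name; the default suffix is '{wp}_{valtype}'."""
    if varname is None:
        if not isinstance(wp, str):
            raise ValueError("a callable working point needs an explicit varname")
        varname = wp + '_' + valtype
    return f"{collection}_{varname}"

# pT-dependent working point, as in the addCorrection docstring
reco = lambda pt: 'RecoAbove20' if pt >= 20 else 'RecoBelow20'

print(resolve_wp('Medium', 35.0))                      # Medium
print(resolve_wp(reco, 35.0), resolve_wp(reco, 12.0))  # RecoAbove20 RecoBelow20
print(branch_name("Electron", 'Medium', 'sf'))         # Electron_Medium_sf
print(branch_name("Electron", reco, 'sf', 'Reco_sf'))  # Electron_Reco_sf
```

A callable working point has no natural string form, which is why the module insists on an explicit `varname` in that case.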
diff --git a/python/modules/muonSF.py b/python/modules/muonSF.py
@@ -0,0 +1,68 @@
+###
+# Compute muon SFs using correctionlib, and store in new branch.
+# Load as
+#  muSF = MuonSF("POG/MUO/2016postVFP_UL/muon_Z.json.gz")
+#  muSF.addCorrection("NUM_TrackerMuons_DEN_genTracks", "2016postVFP_UL", "sf")
+#  muSF.addCorrection("NUM_MediumID_DEN_TrackerMuons", "2016postVFP_UL", "systdown", "sfsysdn")
+#  muSF.addCorrection("NUM_MediumID_DEN_TrackerMuons", "2016postVFP_UL", "systup", "sfsysup")
+###
+from __future__ import print_function
+from PhysicsTools.NanoAODTools.postprocessing.framework.eventloop import Module
+from correctionlib import CorrectionSet
+
+class MuonSF(Module):
+    def __init__(self, json, collection="Muon"):
+        """Muon SF correction module.
+        Parameters:
+            json: the correction file
+            collection: name of the collection to be corrected
+        Use addCorrection() to set which factors should be added.
+        """
+        self.collection = collection
+        self.names = [ ]
+        self.scenarios = [ ]
+        self.valtypes = [ ]
+        self.varnames = [ ]
+        self.evaluators = [ ]
+        self.evaluator = CorrectionSet.from_file(json)
+
+    def addCorrection(self, name, scenario, valtype, varname=None):
+        """
+        Call this method to add a correction factor.
+        Parameters:
+            name: name of the corrections, e.g. 'NUM_TrackerMuons_DEN_genTracks'
+            scenario: year/scenario, e.g. '2016postVFP_UL'
+            valtype: type of factor, e.g. 'sf', 'systup', ...
+            varname: branch name suffix (defaults to valtype)
+        """
+        if varname is None: # default suffix
+            varname = valtype
+        self.names.append(name)
+        self.scenarios.append(scenario)
+        self.valtypes.append(valtype)
+        self.varnames.append(f"{self.collection}_{varname}") # branch name
+        self.evaluators.append(self.evaluator[name])
+
+    def beginFile(self, inputFile, outputFile, inputTree, wrappedOutputTree):
+        """Add branch for every correction to output file."""
+        self.out = wrappedOutputTree        
+        for varname in set(self.varnames): # avoid duplicates
+            self.out.branch(varname, 'F', lenVar='nMuon')
+
+    def analyze(self, event):
+        pts  = [max(15.001,event.Muon_pt[i]) for i in range(event.nMuon)]
+        etas = [min(2.39999,abs(event.Muon_eta[i])) for i in range(event.nMuon)]
+        for ic in range(len(self.evaluators)):
+            # We cannot make a single call to evaluate passing eta, pt as arrays
+            # since POG JSONS are currently provided with flow="error", so we
+            # have to loop to protect for values out of binning range.
+            sfs = [1.]*event.nMuon
+            for iMu in range(event.nMuon):
+                try:
+                    sfs[iMu] = self.evaluators[ic].evaluate(self.scenarios[ic], etas[iMu], pts[iMu], self.valtypes[ic])
+                except Exception: # e.g. input outside the binning range
+                    print(f"MuonSF.analyze: Exception for {self.scenarios[ic]}, eta={etas[iMu]:6.4f}, pt={pts[iMu]:6.4f}, {self.valtypes[ic]}")
+                    # keep the default sf = 1
+            self.out.fillBranch(self.varnames[ic], sfs)
+        return True
+
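Why `MuonSF.analyze` clamps its inputs and wraps each call in `try`/`except` is easier to see with a stand-in evaluator. The sketch below does not use `correctionlib`: `ToyCorrection` is a hypothetical class that raises outside its binning range, imitating a correction with `flow="error"`, and its bin edges and scale-factor values are invented for illustration.

```python
# Sketch of per-object evaluation with clamping and a default of 1.
# ToyCorrection is a hypothetical stand-in for a correctionlib evaluator;
# its binning and values are made up for illustration.
import bisect

class ToyCorrection:
    def __init__(self, pt_edges, values):
        self.pt_edges = pt_edges  # bin edges in pT, ascending
        self.values = values      # one scale factor per pT bin

    def evaluate(self, abseta, pt):
        # Imitate flow="error": refuse inputs outside the binning range
        if not (0.0 <= abseta < 2.4) or not (self.pt_edges[0] <= pt < self.pt_edges[-1]):
            raise IndexError(f"out of range: abseta={abseta}, pt={pt}")
        return self.values[bisect.bisect_right(self.pt_edges, pt) - 1]

def scale_factors(corr, pts, etas, pt_min=15.001, abseta_max=2.39999):
    """Evaluate one object at a time, so one bad input cannot spoil the rest."""
    sfs = [1.0] * len(pts)  # default sf = 1
    for i, (pt, eta) in enumerate(zip(pts, etas)):
        try:
            sfs[i] = corr.evaluate(min(abseta_max, abs(eta)), max(pt_min, pt))
        except Exception:
            pass  # keep the default
    return sfs

corr = ToyCorrection(pt_edges=[15.0, 30.0, 120.0], values=[0.98, 0.99])
print(scale_factors(corr, pts=[10.0, 45.0, 500.0], etas=[-2.5, 0.3, 1.0]))
# [0.98, 0.99, 1.0]
```

The low-pT, high-|eta| object is pulled into the first bin, while the object above the upper pT edge keeps the default of 1, because only the lower pT bound and the upper |eta| bound are clamped.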
diff --git a/test/example_electronSF.py b/test/example_electronSF.py
@@ -0,0 +1,45 @@
+#!/usr/bin/env python3
+###
+# This is an example of configuring and calling a generic correctionlib module.
+# Notes:
+# - For the sake of speed, it is important to apply a preselection whenever possible.
+# - Also, it is advisable to insert correction modules like this after modules
+#   that apply selections, so that corrections are not computed for events that
+#   are then discarded.
+# - Find all available corrections and parameters by running on the command line
+#     correction summary POG/EGM/2016postVFP_UL/electron.json.gz
+###
+from PhysicsTools.NanoAODTools.postprocessing.framework.postprocessor import PostProcessor
+from PhysicsTools.NATModules.modules.electronSF import *
+
+# Set up the electron correction module
+eleSF = ElectronSF("data/POG/EGM/2016postVFP_UL/electron.json.gz")
+name = 'UL-Electron-ID-SF' # name of the correction in the JSON file
+era = '2016postVFP'
+
+# Add electron reconstruction scale factor
+reco = lambda pt: 'RecoAbove20' if pt>=20 else 'RecoBelow20'
+eleSF.addCorrection(name, era, reco, 'sf', 'Reco_sf')
+eleSF.addCorrection(name, era, reco, 'sfdown', 'Reco_sfdn')
+eleSF.addCorrection(name, era, reco, 'sfup', 'Reco_sfup')
+
+# Add electron identification scale factor
+eleSF.addCorrection(name, era, 'Medium', 'sf')
+eleSF.addCorrection(name, era, 'Medium', 'sfdown', 'sfdn')
+eleSF.addCorrection(name, era, 'Medium', 'sfup', 'sfup')
+
+# Settings for post-processor
+from argparse import ArgumentParser
+parser = ArgumentParser(description="Process nanoAOD files and add branches",epilog="Good luck!")
+parser.add_argument('-i', '--infiles', nargs='+')
+parser.add_argument('-m', '--maxevts', type=int, default=10000) # limit number of events (per file) for testing
+args = parser.parse_args()
+branchsel = None #"keepElectron.txt" # keep only Electron branches for speed
+fnames = args.infiles or [
+  "root://cms-xrd-global.cern.ch//store/mc/RunIISummer20UL16NanoAODv9/DYJetsToLL_M-50_TuneCP5_13TeV-amcatnloFXFX-pythia8/NANOAODSIM/20UL16JMENano_106X_mcRun2_asymptotic_v17-v1/2820000/11061525-9BB6-F441-9C12-4489135219B7.root"
+]
+
+# Process nanoAOD file
+p = PostProcessor(".", fnames, "nElectron>=2 && Electron_pt>18", branchsel, [eleSF], maxEntries=args.maxevts, postfix="_ElectronSFs",
+                  outputbranchsel=branchsel, provenance=True, prefetch=True, longTermCache=True)
+p.run()
diff --git a/test/example_muonSF.py b/test/example_muonSF.py
@@ -0,0 +1,42 @@
+#!/usr/bin/env python3
+###
+# This is an example of configuring and calling a generic correctionlib module.
+# Notes:
+# - For the sake of speed, it is important to apply a preselection whenever possible.
+# - Also, it is advisable to insert correction modules like this after modules
+#   that apply selections, so that corrections are not computed for events that
+#   are then discarded.
+# - Find all available corrections and parameters by running on the command line
+#     correction summary POG/MUO/2016postVFP_UL/muon_Z.json.gz
+###
+from PhysicsTools.NanoAODTools.postprocessing.framework.postprocessor import PostProcessor
+from PhysicsTools.NATModules.modules.muonSF import *
+
+# Set up the muon correction module
+muSF = MuonSF("data/POG/MUO/2016postVFP_UL/muon_Z.json.gz")
+era = '2016postVFP_UL'
+
+# Add TrackerMuon reconstruction scale factor
+muSF.addCorrection('NUM_TrackerMuons_DEN_genTracks', era, 'sf', 'Reco_sf') # branch Muon_Reco_sf
+# Add Medium ID scale factor
+muSF.addCorrection('NUM_MediumID_DEN_TrackerMuons', era, 'sf')
+# Add Medium ID scale factor, down variation
+muSF.addCorrection('NUM_MediumID_DEN_TrackerMuons', era, 'systdown', 'sfsysdn')
+# Add Medium ID scale factor, up variation
+muSF.addCorrection('NUM_MediumID_DEN_TrackerMuons', era, 'systup', 'sfsysup')
+
+# Settings for post-processor
+from argparse import ArgumentParser
+parser = ArgumentParser(description="Process nanoAOD files and add branches",epilog="Good luck!")
+parser.add_argument('-i', '--infiles', nargs='+')
+parser.add_argument('-m', '--maxevts', type=int, default=10000) # limit number of events (per file) for testing
+args = parser.parse_args()
+branchsel = None #"keepMuon.txt" # keep only Muon branches for speed
+fnames = args.infiles or [
+  "root://cms-xrd-global.cern.ch//store/mc/RunIISummer20UL16NanoAODv9/DYJetsToLL_M-50_TuneCP5_13TeV-amcatnloFXFX-pythia8/NANOAODSIM/20UL16JMENano_106X_mcRun2_asymptotic_v17-v1/2820000/11061525-9BB6-F441-9C12-4489135219B7.root"
+]
+
+# Process nanoAOD file
+p = PostProcessor(".", fnames, "nMuon>=2 && Muon_pt>18", branchsel, [muSF], maxEntries=args.maxevts, postfix="_MuonSFs",
+                  outputbranchsel=branchsel, provenance=True, prefetch=True, longTermCache=True)
+p.run()
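In `MuonSF`, the default branch name is derived from `valtype` alone, so two corrections added with the same `valtype` and no explicit `varname` end up in the same branch. The sketch below is one possible way to check a configuration for such collisions before running; `find_collisions` is a hypothetical helper and not part of the module.

```python
# Sketch: list output branches that more than one correction would write to.
# find_collisions is a hypothetical helper, not part of MuonSF.
from collections import defaultdict

def find_collisions(corrections, collection="Muon"):
    """corrections: list of (name, valtype, varname-or-None) tuples.
    Return {branch: [correction names]} for branches requested more than once."""
    by_branch = defaultdict(list)
    for name, valtype, varname in corrections:
        suffix = valtype if varname is None else varname  # same default as MuonSF
        by_branch[f"{collection}_{suffix}"].append(name)
    return {branch: names for branch, names in by_branch.items() if len(names) > 1}

requested = [
    ('NUM_TrackerMuons_DEN_genTracks', 'sf', None),
    ('NUM_MediumID_DEN_TrackerMuons', 'sf', None),
    ('NUM_MediumID_DEN_TrackerMuons', 'systup', 'sfsysup'),
]
print(find_collisions(requested))
# {'Muon_sf': ['NUM_TrackerMuons_DEN_genTracks', 'NUM_MediumID_DEN_TrackerMuons']}
```

Passing a distinct `varname` for one of the two corrections is enough to make the returned dictionary empty.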