Installing, Compiling, Ntuples (RunII 2016,17, 18 datasets in CMSSW_10_2_X v1)
Installing requires installing FastJet, SFrame, and UHH2. This is now done via one script. It is assumed that a CMS environment is available on the machine.
❗ This branch requires an SL6 machine, not an EL7 machine, e.g. naf-cms11.desy.de or lxplus6.cern.ch.
Before you begin: it is a good idea to create an empty directory, and in there run the following commands. This ensures it will not clash with any existing installations of FastJet, SFrame, & UHH2.
Also please don't skip installing FastJet & SFrame - fresh copies are needed, otherwise you will get issues linking & compiling things.
Download the installation script from GitHub and execute it:
wget https://raw.githubusercontent.com/UHH2/UHH2/RunII_102X_v1/scripts/install.sh
source install.sh
For csh users: use the install.csh script instead
For zsh users: should work OK so long as you source install.sh
If this exceeds your quota, do
export CMSSW_GIT_REFERENCE=<DIRECTORY_WITH_ENOUGH_SPACE>/cmssw.git
and try again.
Alternatively, execute all the steps given in the install.sh script one after the other. Sometimes, the compilation with cmsRun fails. In this case, start a new installation in a clean shell.
Immediately after running the install script, assuming you are now in CMSSW_*/src/UHH2/:
cmsenv
cd ../../../SFrame
source setup.sh
make -j4
cd ../CMSSW_*/src/UHH2
make -j9
NB: each time you log back in, you need to run both cmsenv and source setup.sh.
Before going on, it is important to realize that the UHH2 code generally has to be compiled twice: once for CMSSW and once for SFrame execution. This is because the two packages have different dependencies and naming conventions; for example, the SFrame binaries will be placed in $SFRAME_LIB_DIR, while the CMSSW binaries are in $CMSSW_BASE/lib/$SCRAM_ARCH.
Note that SFrame can only be compiled after you have "activated" your CMSSW release with cmsenv; this is needed to get the correct ROOT, etc.
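Since both builds depend on a correctly set-up shell, it can help to check the environment before running make. The following is a minimal bash sketch, not part of UHH2: the function name check_uhh2_env is hypothetical, and it assumes that cmsenv and the SFrame setup.sh export the variables named on this page (CMSSW_BASE, SCRAM_ARCH, SFRAME_DIR, SFRAME_LIB_DIR).

```bash
# Hypothetical helper (not part of UHH2): report which of the environment
# variables mentioned on this page are unset or empty.
# Returns 0 if all are set, 1 otherwise.
check_uhh2_env() {
    local name value missing=0
    for name in CMSSW_BASE SCRAM_ARCH SFRAME_DIR SFRAME_LIB_DIR; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            missing=1
        fi
    done
    return "$missing"
}
```

With this in place, `check_uhh2_env && make` would start the build only if all four variables are set.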
Usually, all you need to do is go to the UHH2 directory and type make. This will build both the code for SFrame and for CMSSW. To only build the code for SFrame use, execute
make sframe
and to only build code for CMSSW, execute
make scram
This will actually run scram b in the whole CMSSW installation and thus also compile other CMSSW packages (if this is not what you want, only run make sframe and run scram b manually).
For the SFrame compilation, by default only the directories from the UHH2/UHH2 repository are compiled (see UHH2/Makefile for details). To enable compilation of additional analysis directories, create UHH2/Makefile.local with contents such as
dirs += MyAnalysis1 MyAnalysis2
This will trigger the build also in the directories named MyAnalysis1 and MyAnalysis2. (The reason for using Makefile.local is to avoid getting in each other's way: everyone can have their own Makefile.local, which is ignored by git.)
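As an illustration, the file can also be created from the shell. This is a sketch only: the helper write_makefile_local is hypothetical (not part of UHH2), and it overwrites any existing Makefile.local in the directory it is given.

```bash
# Hypothetical helper (not part of UHH2): write a Makefile.local that adds
# the given analysis directories to the SFrame build.
# Usage: write_makefile_local <UHH2 directory> <analysis dir>...
write_makefile_local() {
    local uhh2_dir=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo "usage: write_makefile_local <UHH2 dir> <analysis dir>..." >&2
        return 1
    fi
    # "$*" joins the remaining arguments with spaces.
    printf 'dirs += %s\n' "$*" > "$uhh2_dir/Makefile.local"
}
```

For example, `write_makefile_local . MyAnalysis1 MyAnalysis2`, run from the UHH2 directory, produces the one-line file shown above.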
For cleaning up, use make clean. Cleaning up manually can be done by removing all files in the obj subdirectory (which is the location used by all auto-generated files) and the SFrame libraries, i.e. $SFRAME_DIR/lib/libSUHH2*.
You can also compile the CMSSW part by executing scram b in the CMSSW directory yourself instead of using make scram (the main purpose of the latter command is to prevent accidentally forgetting to build the CMSSW part, by making the scram build the default).
For cleaning up the CMSSW build, run scram b clean (as is usual for CMSSW); note that make clean only cleans up the SFrame compilation, not the CMSSW one.
This release is unique in that it can handle multiple years' datasets. We now have unique ntuplewriter_<type>_<year>.py config files for each year, with <type> = data or mc, which one can use directly. In this way, you can use the correct file in e.g. a CRAB script, without needing to switch flags.
In order to facilitate this, we now use a generic function, generate_process(...), in core/python/ntuple_generator.py. The only mandatory argument is a year argument.
The possible years are:
- 2016v2: for 2016 MiniAODv2 / 03Feb2017 data
- 2016v3: for 2016 MiniAODv3 / rereco data
- 2017v1: for 2017 Prompt data & RunIIFall17MiniAOD MC
- 2017v2: for 2017 ReReco Data "31Mar18" & RunIIFall17MiniAODv2 MC
- 2018: for 2018 data & Autumn18 MC
You can then simply run each script: cmsRun ntuplewriter_<type>_<year>.py
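To avoid typos when picking a config in a wrapper or CRAB script, the naming scheme can be wrapped in a small helper. The following is a sketch only: the function ntuplewriter_config is hypothetical (not part of UHH2), it checks the arguments against the types and years listed above, and it assumes that a config file exists for every type/year combination, which has not been verified here.

```bash
# Hypothetical helper (not part of UHH2): print the config file name for a
# given type (data or mc) and year, following ntuplewriter_<type>_<year>.py.
ntuplewriter_config() {
    local kind=$1 year=$2
    case "$kind" in
        data|mc) ;;
        *) echo "unknown type: $kind (expected data or mc)" >&2; return 1 ;;
    esac
    case "$year" in
        2016v2|2016v3|2017v1|2017v2|2018) ;;
        *) echo "unknown year: $year" >&2; return 1 ;;
    esac
    echo "ntuplewriter_${kind}_${year}.py"
}
```

A call such as `cmsRun "$(ntuplewriter_config mc 2018)"` would then expand to the corresponding cmsRun command.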
Bonus
There are now some command-line arguments! See cmsRun ntuplewriter_<type>_<year>.py help for all of them.
e.g.:
cmsRun ntuplewriter_xxx_yyy.py maxEvents=100 outputFile=testNtuple.root wantSummary=1
The jet collections are designed to be consistent across all years/datasets. A brief explainer of what is in each jet collection:
Name | jetsAk4CHS | jetsAk4Puppi | jetsAk8CHS | jetsAk8Puppi | jetsAk8CHSSubstructure_SoftDropCHS | jetsAk8PuppiSubstructure_SoftDropPuppi |
---|---|---|---|---|---|---|
Class type | Jet | Jet | Jet | Jet | TopJet | TopJet |
Clustering algorithm | Anti-kT | Anti-kT | Anti-kT | Anti-kT | Anti-kT | Anti-kT |
Cone size | 0.4 | 0.4 | 0.8 | 0.8 | 0.8 | 0.8 |
Pileup subtraction | CHS | PUPPI | CHS | PUPPI | CHS | PUPPI |
Has groomed subjets? | No | No | No | No | Yes (SoftDrop) | Yes (SoftDrop) |
Substructure/interesting variables | None | PUPPI multiplicities, DeepFlavour | DeepFlavour | PUPPI multiplicities, DeepFlavour | DeepFlavour, DeepBoostedJetTags (i.e. DeepJet), Nsubjettiness (tau_1,2,3,4, groomed & ungroomed), Energy correlation functions (N=2,3 * beta=1,2, groomed only) | PUPPI multiplicities, DeepFlavour, DeepBoostedJetTags (i.e. DeepJet), Nsubjettiness (tau_1,2,3,4, groomed & ungroomed), Energy correlation functions (N=2,3 * beta=1,2, groomed only) |
Other notes | Is just slimmedJets from MiniAOD | slimmedJetsPuppi from MiniAOD + extras | Reclustered with low pT threshold (10 GeV), especially for JERC studies | Reclustered with low pT threshold (10 GeV), especially for JERC studies | Reclustered AK8 CHS jets with groomed subjets, higher pT threshold (150 GeV). Main jet kinematics are ungroomed. Designed for boosted/high pT jet studies. | Reclustered AK8 PUPPI jets with groomed subjets, higher pT threshold (150 GeV). Main jet kinematics are ungroomed. Designed for boosted/high pT jet studies. |
NB DeepCSV, combinedSecondaryVertices, and combinedSecondaryVerticesMVA are only valid on CHS jets - BTV POG don't support PUPPI jets (as of last edit). For all jets we take those values from MiniAOD - these may or may not be sensible or valid. For 2016v2 datasets, we recalculate those values ourselves, since they were not in MiniAOD originally.
The following genjet collections are available across all years. Note that all collections are composed of final-state genparticles excluding neutrinos.
Name | slimmedGenJets | slimmedGenJetsAK8 | genjetsAk8Substructure | genjetsAk8SubstructureSoftDrop |
---|---|---|---|---|
Class type | GenJet | GenJet | GenTopJet | GenTopJet |
Clustering algorithm | anti-kT | anti-kT | anti-kT | anti-kT |
Cone size | 0.4 | 0.8 | 0.8 | 0.8 |
pT cut | pT > 8 GeV | pT > 150 GeV. For lower pT studies, 3 jets are kept (with minimal information) from 30-100 GeV. | pT > 150 GeV | pT > 150 GeV |
Has groomed subjets? | No | No | No | Yes |
Substructure/interesting variables | N/A | N/A | Ungroomed Njettiness (tau_1,2,3,4) | Groomed Njettiness (tau_1,2,3,4), ECFs (N=2,3 * beta=1,2) |
Other notes | | | Kinematics are ungroomed | Kinematics are groomed. genparticles_indices are the constituents of the groomed fatjet (= sum over subjet constituents) |