CMSDAS-bbbbAnalysis

Code and instructions for preparation and support of the CMS DAS HH -> bbbb analysis exercise

For further information on the exercise check the TWiki

Part of the information reported below targets the preparation of the inputs and not the development of the exercise itself. Refer to the TWiki for instructions on how to run the exercise.

Installing the repo

Install the bbbbAnalysis code following the instructions here, using the CMSDAS-exercise branch
git clone https://github.com/l-cadamuro/CMSDAS-bbbbAnalysis
NOTE: the exercise works independently on SL6 and SL7. Installation instructions on the reference TWiki use the CMSSW version recommended for Run 2 data analysis (SL7), which is different from the one of the bbbbAnalysis above

Preparing the input ntuples for the exercise

All the instructions before are controlled with the variable TAG in the scripts, remember to update it every time you rerun.

In the bbbbAnalysis package, launch all the special DAS skims using the FNAL batch (should take a short time)

## launch
source scripts/submitAllSkimsOnTier3ForDAS.sh
## check the output status
for d in `ls -d ${JOBSDIR}/*` ; do echo $d ; python scripts/getTaskStatus.py --dir ${d} ; done

Combine all the output files in a single one per sample

cmsenv
voms-proxy-init -voms cms
cd CMSDAS-bbbbAnalysis/support
source hadd_all_inputs.sh

Reprocess the ntuples to recompute the normalisation weight (it launches all the programs in background, should take a minute). Inside the script there is a command to check the result from the logs.

source prepare_all.sh

As a result of the procedure above, you will find ntuples for the exercise under /store/user/cmsdas/2020/long_exercises/DoubleHiggs/${TAG}/dasformat

Running the analysis

From the inputs, build the Higgs bosons and high level objects/variables for the MVA. A skeleton of code with I/O is prepare_inputs.cpp . A script to run on all the samples is :

cd analysis
source build_all.sh

Compute trigger scale factors

Trigger efficiencies are computed as described in AN-2016/268 (http://cms.cern.ch/iCMS/jsp/db_notes/noteInfo.jsp?cmsnoteid=CMS%20AN-2016/268) The trigger efficiency is evaluated by filter over a sample of events that passed all the previous filters. This removes correlation between filters. 2016 triggers are composed by the following filters (order matters!):

HLT_DoubleJet90_Double30_TripleBTagCSV_p087:
L1sTripleJetVBFIorHTTIorDoubleJetCIorSingleJet -> 1 object passing
QuadCentralJet30 -> 4 objects passing
DoubleCentralJet90 -> 2 objects passing
BTagCaloCSVp087Triple -> 3 objects passing
QuadPFCentralJetLooseID30 -> 4 objects passing
DoublePFCentralJetLooseID90 -> 2 objects passing

HLT_QuadJet45_TripleBTagCSV_p087:
L1sQuadJetC50IorQuadJetC60IorHTT280IorHTT300IorHTT320IorTripleJet846848VBFIorTripleJet887256VBFIorTripleJet927664VBF -> 1 object passing
QuadCentralJet45 -> 4 objects passing
BTagCaloCSVp087Triple -> 3 objects passing
QuadPFCentralJetLooseID45 -> 4 objects passing

Efficiency of each filter need to evaluated as a function of the variable on which they are cutting e.g. efficiency of QuadCentralJet45 vs pt of the forth highest pt jet Exception: since Scale factors are evaluated using ttbar events (2 real b jets only), BTagCaloCSVp087Triple efficiency must be evaluated by asking only 1 object passing the btag filter and as a function of the highest deepCSV jet. The efficiency of the 3 btag can the be evaluated as the probability that at least 3 of the four btags passes the cut

The total efficiency is eff(Double90Quad30 || Quad45) = eff(Double90Quad30) + eff(Quad45) - eff(Double90Quad30 && Quad45) where eff(Double90Quad30) is eff(L1)*eff(QuadCentralJet30)*eff(DoubleCentralJet90)*eff(BTagCaloCSVp087Triple)*eff(QuadPFCentralJetLooseID30)*eff(DoublePFCentralJetLooseID90). Same for eff(Quad45). eff(Double90Quad30 && Quad45) = eff(Double90Quad30) * eff_DJ(Quad45), where eff_DJ(Quad45) is the eff(Quad45) evaluated over a subset of events passing HLT_DoubleJet90_Double30_TripleBTagCSV_p087

Optional (already done for you): In the bbbbAnalysis package, launch all the trigger DAS skims (SingleMuon and MC TTbar datasets) using the FNAL batch

## launch
source source scripts/submitAllTriggerSkimsOnTier3ForDAS.sh 
## check the output status
for d in `ls -d ${JOBSDIR}/*` ; do echo $d ; python scripts/getTaskStatus.py --dir ${d} ; done

Optional (already done for you): Combine all the output files in a single one per sample (in the bbbbAnalysis package) (it should take 5-10 mins)

cmsenv
voms-proxy-init -voms cms
hadd -f SingleMuon_Data_forTrigger.root `xrdfsls -u /store/user/<your_username>/bbbb_ntuples_CMSDAS_trigger/ntuples_26Dic2019_v4/SKIM_SingleMuon_Data_forTrigger/output`
hadd -f TTbar_MC_forTrigger.root `xrdfsls -u /store/user/<your_username>/bbbb_ntuples_CMSDAS_trigger/ntuples_26Dic2019_v4/SKIM_MC_TT_TuneCUETP8M2T4_13TeV_forTrigger/output`

Produce trigger efficiency plots go to CMSDAS-bbbbAnalysis folder

root -l
.L trigger/TriggerEfficiencyByFilter.C+
ProduceAllTriggerEfficiencies("root://cmsxrootd.fnal.gov//store/user/cmsdas/2020/long_exercises/DoubleHiggs/triggerNtuples_30Dic2019/SingleMuon_Data_forTrigger.root","root://cmsxrootd.fnal.gov//store/user/cmsdas/2020/long_exercises/DoubleHiggs/triggerNtuples_30Dic2019/TTbar_MC_forTrigger.root","TriggerEfficiencies.root")

At this point you should have the trigger efficiencies for each filter for data and MC in the file TriggerEfficiencies.root

Fit the efficiencies (find a reasonable curve)
In the script which runs the skims for the analysis include the trigger efficiencies to calculate the trigger scale factor: a. add a branch in the skim that will contain the trigger scale factor b. calculate the trigger efficiency as eff(Double90Quad30 || Quad45) = eff(Double90Quad30) + eff(Quad45) - eff(Double90Quad30 && Quad45) both using data and MC efficiencies c. calculate the scale factor as eff_data(Double90Quad30 || Quad45) / eff_mc(Double90Quad30 || Quad45) and put the result in the branch

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
analysis		analysis
background		background
mldiscr		mldiscr
support		support
trigger		trigger
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CMSDAS-bbbbAnalysis

Installing the repo

Preparing the input ntuples for the exercise

Running the analysis

Compute trigger scale factors

About

Releases

Packages

Contributors 2

Languages

l-cadamuro/CMSDAS-bbbbAnalysis

Folders and files

Latest commit

History

Repository files navigation

CMSDAS-bbbbAnalysis

Installing the repo

Preparing the input ntuples for the exercise

Running the analysis

Compute trigger scale factors

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages