
Recipe for PDF uncertainties (RunII, 25ns, MiniAODv2)

Robin edited this page Sep 29, 2018 · 20 revisions

In RunII, the different PDF sets and their uncertainty variations are now stored directly in MINIAOD as weights. This makes applying the variations and calculating the uncertainty much easier. Additionally, the mu_R and mu_F (renormalisation and factorisation scales, respectively) are also stored as weights.

In our UHH2 code, these weights are stored in the genInfo object and are read with event.genInfo->systweights().

Applying PDF uncertainties

To apply PDF systematic uncertainties in your analysis, follow the recipe below. In general, 100 systematic weights covering this uncertainty are stored in the genInfo tree for some, but not all, samples. Each histogram that enters your final limit calculation has to be filled an additional 100 times, once per systematic weight. You have to implement this in your own code, since every analysis uses different histograms for the limit calculation.

1. Read the systematic weights: Each systematic weight can be read with event.genInfo->systweights().at(X). Refer to the table below for the values of X to use. These can differ between samples.

2. Apply the right weight: The systematic weight has to be normalized to the central event weight at LHE level. Two different weights are used for this normalization, depending on the sample; the table below lists which one to use. In general, the resulting ratio should be centered around 1. The weight you normally use for filling is multiplied by one of two factors, either event.genInfo->systweights().at(X) / event.genInfo->pdf_scalePDF() or event.genInfo->systweights().at(X) / event.genInfo->originalXWGTUP(), e.g.

fillweight_PDF = event.weight * event.genInfo->systweights().at(X) / event.genInfo->pdf_scalePDF()

or

fillweight_PDF = event.weight * event.genInfo->systweights().at(X) / event.genInfo->originalXWGTUP()

3. Fill your histograms: As described above, each histogram used for the limit calculation has to be filled 100 times, once for each systematic PDF weight stored in event.genInfo->systweights(). Keep in mind that this vector also holds other weights, so use the table below to pick the right entries.

4. Calculate the RMS in each bin: To obtain the final uncertainty from the PDF variations, calculate the RMS of the 100 varied histograms separately for each bin. For a conservative estimate, a first approach is to use the nominal PDF set (i.e. the nominal histogram) as the mean value in the RMS calculation. The up and down variations to be used in the limit calculation are then nominal+RMS and nominal-RMS in each bin.
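Step 4 can be sketched as follows. This is a minimal illustration and not part of UHH2: histograms are represented as plain vectors of bin contents, the function name pdfRmsPerBin is hypothetical, and dividing by the number of variations N (rather than N - 1) is an assumption; see the note on Hessian vs MC sets for which normalisation suits your PDF set.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// One histogram is represented here as a plain vector of bin contents
// (an assumption for illustration; a real analysis would read these from its histograms).
using BinContents = std::vector<double>;

// Per-bin RMS of the varied histograms around the nominal histogram.
// The nominal bin content is used as the mean, as described in step 4.
// Assumes every varied histogram has the same binning as the nominal one.
std::vector<double> pdfRmsPerBin(const BinContents& nominal,
                                 const std::vector<BinContents>& varied) {
    std::vector<double> rms(nominal.size(), 0.0);
    if (varied.empty()) return rms;  // e.g. sample without stored PDF weights
    for (std::size_t bin = 0; bin < nominal.size(); ++bin) {
        double sumSq = 0.0;
        for (const BinContents& h : varied) {
            const double diff = h.at(bin) - nominal[bin];
            sumSq += diff * diff;
        }
        // Assumption: plain RMS, i.e. division by the number of variations.
        rms[bin] = std::sqrt(sumSq / static_cast<double>(varied.size()));
    }
    return rms;
}
```

The up and down templates for the limit calculation would then be nominal[bin] + rms[bin] and nominal[bin] - rms[bin] for each bin.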

Finding the systematic PDF weights

To access the header that lists the stored systematic weights, first uncomment lines 1016-1030 in core/plugins/NtupleWriter.cc and then process one event locally. Among other things, the output lists which systematic weight is stored in which entry of event.genInfo->systweights() for the processed sample. Keep in mind that not all generators provide these weights, so for some samples the vector may be empty. Use the table below to find the first and last entries of the systweights vector that correspond to PDF uncertainties (C++ numbering), and please add the values for samples that have not yet been checked.

More information

More info is given in this presentation by Josh Bendavid (Slides 10 to 16):

https://indico.cern.ch/event/459797/contributions/1961581/attachments/1181555/1800214/mcaod-Feb15-2016.pdf

| Short sample name | First entry | Last entry | Weight for normalization |
|---|---|---|---|
| MC_DYJetsToLL inclusive | 9 | 109 | event.genInfo->originalXWGTUP() |
| MC_STpos_tW_inc | not filled | | |
| MC_STneg_tW_inc | not filled | | |
| MC_ST_t-channel_4f_leptonDecays | 9 | 109 | event.genInfo->originalXWGTUP() |
| MC_ST_s-channel_4f_leptonDecays | 9 | 109 | event.genInfo->originalXWGTUP() |
| MC_TTbar | 9 | 109 | event.genInfo->originalXWGTUP() |
| MC_TT_Mtt0700to1000 | | | |
| MC_TT_Mtt1000toINFT | | | |
| MC_TTbarScaleUp | | | |
| MC_TTbarScaleDown | | | |
| MC_WJetsToLNu | 9 | 109 | event.genInfo->originalXWGTUP() |
| WW,WZ,ZZ | not filled | | |
| MC_QCD_MuEnriched | not filled | | |
| MC_QCD_EMEnriched | | | |
| MC_QCD_bcToE | | | |
| MC_QCD_HT | | | |
| MC_WJets_LNu_HT | 9 | 109 | |
| MC_DYJetsToLL_M50_HT | 9 | 109 | |
| MC_TpB_TH_LH_M1000 | 112 | 212 | event.genInfo->pdf_scalePDF() |

Example use:

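// Take the first index and the normalization weight for your sample from the table above.
const auto & sys_weights = event.genInfo->systweights();
const float orig_weight = event.genInfo->pdf_scalePDF();  // or originalXWGTUP(), depending on the sample
const unsigned MY_FIRST_INDEX = 9;  // first PDF entry for this sample
// Skip samples for which the PDF weights are not (fully) filled.
if (sys_weights.size() >= MY_FIRST_INDEX + 100) {
    for (unsigned i = 0; i < 100; ++i) {
        // my_hists is assumed to hold one histogram per PDF variation, as needed for the RMS step.
        my_hists[i].Fill(my_value, event.weight * sys_weights[i + MY_FIRST_INDEX] / orig_weight);
    }
}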

Note on Hessian vs MC sets

Some PDF sets, e.g. PDF4LHC, come with 2 forms of the uncertainties: Hessian, and Monte Carlo. The name of the PDF set is only somewhat helpful in denoting this: if it has _mc in it (e.g. PDF4LHC15_nlo_mc_pdfas) it uses MC uncertainties, otherwise it uses Hessian uncertainties.
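The naming rule above can be written as a small check. This is only a sketch: the helper name hasMcTag is hypothetical, and it encodes nothing beyond the "_mc" convention described here, so it is only meaningful for set families that ship both forms.

```cpp
#include <string>

// Returns true if the PDF set name carries the "_mc" tag, which by the
// convention described above marks a set with MC (replica) uncertainties.
// Assumption: the set belongs to a family that provides both Hessian and MC forms.
bool hasMcTag(const std::string& pdfSetName) {
    return pdfSetName.find("_mc") != std::string::npos;
}
```

For example, hasMcTag("PDF4LHC15_nlo_mc_pdfas") is true, so that set would be treated with the MC replica prescription.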

This difference matters when calculating the final uncertainty from all the uncertainty variations (see Section 6.2 of the PDF4LHC recommendations for LHC Run II paper, listed under Helpful links). The main difference is that for MC sets the quadrature sum of the deviations has to be divided by sqrt(N - 1), where N is the number of variation sets (e.g. 100), whereas for Hessian sets the deviations are added in quadrature without this factor.

Hessian: Compute the uncertainty on the observable from each Hessian variation and add these in quadrature.

MC replicas: Build the distribution of the observable over the (e.g. 100) MC replicas and either take its RMS as the uncertainty or, for non-Gaussian cases, propagate the full distribution.
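The two prescriptions can be sketched for a single observable (e.g. one bin content) as below. The function names are hypothetical and this is not the UHH2 implementation. The Hessian version assumes symmetric uncertainties computed with respect to the central value; the MC version assumes the deviations are taken around the mean of the replicas and normalised by N - 1.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hessian sets: add the shift of each eigenvector variation with respect to
// the central value in quadrature (no division by the number of variations).
double hessianUncertainty(double central, const std::vector<double>& variations) {
    double sumSq = 0.0;
    for (double v : variations) {
        sumSq += (v - central) * (v - central);
    }
    return std::sqrt(sumSq);
}

// MC replica sets: spread of the replicas around their own mean,
// with the sum of squared deviations divided by (N - 1).
double mcReplicaUncertainty(const std::vector<double>& replicas) {
    const std::size_t n = replicas.size();
    if (n < 2) return 0.0;  // spread is undefined for fewer than two replicas
    double mean = 0.0;
    for (double r : replicas) mean += r;
    mean /= static_cast<double>(n);
    double sumSq = 0.0;
    for (double r : replicas) {
        sumSq += (r - mean) * (r - mean);
    }
    return std::sqrt(sumSq / static_cast<double>(n - 1));
}
```

In a binned analysis either function would be called once per bin, with the varied bin contents as input.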

Note on PDFWeights class

UHH2 has a PDFWeights class that, instead of using the stored weights in MINIAOD, directly calls LHAPDF using the parton momentum fractions. This may be useful when trying to use a PDF set not stored in MINIAOD, or for validating the weights.

Note on potentially “buggy” samples

Please note that certain MC samples, especially those whose LHE was made in 2015 (MG2.2.2), do not store `originalXWGTUP` correctly if the hard process has >=10 particles, due to a bug in how MG writes out the event. This is easy to spot: `originalXWGTUP()` should have exactly the same value as `sys_weights[0]`. If it does not, use `sys_weights[0]` in place of `originalXWGTUP()` in all of the formulae above.

Note that samples made in 2016 or later may still use such 2015 LHE files, so you should check carefully.
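The check can be wrapped in a small helper, sketched below with plain numbers instead of the genInfo object. The function name normalisationWeight and the relative tolerance relTol are assumptions for illustration; the text above asks for exactly equal values, and the tolerance is only there to absorb floating-point rounding.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Picks the weight used to normalise the systematic weights.
// If originalXWGTUP agrees with the first systematic weight it is returned;
// otherwise the sample is assumed to be affected by the bug described above
// and the first systematic weight is returned instead.
double normalisationWeight(double originalXWGTUP,
                           const std::vector<double>& sysWeights,
                           double relTol = 1e-6) {
    if (sysWeights.empty()) return originalXWGTUP;  // nothing to compare against
    const double ref = sysWeights.front();
    const double scale = std::max(std::fabs(ref), std::fabs(originalXWGTUP));
    const bool agree = std::fabs(originalXWGTUP - ref) <= relTol * scale;
    return agree ? originalXWGTUP : ref;
}
```

In an analysis the inputs would come from event.genInfo->originalXWGTUP() and event.genInfo->systweights(), and the returned value would take the place of originalXWGTUP() in the formulae above.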

Reference:

https://hypernews.cern.ch/HyperNews/CMS/get/generators/4109/1/1/1/1.html

https://hypernews.cern.ch/HyperNews/CMS/get/generators/2648.html

Helpful links

PDF4LHC recommendations:
