
Normalizer

This class normalizes the input histograms to the appropriate luminosity. The brunt of the algorithm is taken from the ROOT hadd function, modified only to scale the input histograms by the appropriate cross-section and luminosity.

Each Normer object only holds one output file, so Normer objects are used for grouping like samples, such as adding all of the W+Jets samples into one file.

Normalization

The normalization is done in the standard way. Given an input cross-section and luminosity, the expected total number of events for a sample is:

Total = cross-section * Luminosity

To achieve this scale, we first need to normalize the graphs by the total number of events (the same scaling is used for graphs that aren't 1-to-1 with events):

SF_noCuts = cross-section * Luminosity / Total # Events

This only works if there are no cuts in our graphs; otherwise we need to incorporate the fact that only a fraction of the events passed the cuts. The target yield picks up a factor of (# Passing Events / Total # Events), but the graph now holds only the passing events, so the two effects cancel:

SF = (cross-section * Luminosity / # Passing Events) * (# Passing Events / Total # Events)
   = cross-section * Luminosity / Total # Events

In the case of a skim efficiency (a scale factor that corrects to the real total number of events in the Monte Carlo), we multiply our scale factor by the skim efficiency to get the final result:

SF = cross-section * Luminosity * Skim Efficiency / Total # Events
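As a minimal sketch of this formula in code (illustrative only; the function names are assumptions, and the unweighted event count is taken from the input file):

  #include "TH1.h"

  // Final scale factor from the formula above.
  double scaleFactor(double xsec, double lumi, double skim, double totalEvents) {
    return xsec * lumi * skim / totalEvents;
  }

  // Scale a histogram to cross-section * luminosity.
  void normalizeHist(TH1* hist, double xsec, double lumi, double skim,
                     double totalEvents) {
    hist->Scale(scaleFactor(xsec, lumi, skim, totalEvents));
  }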

Use variable

The use variable is simple, but not in an immediately obvious way. For every input file, the program first checks whether the file exists, then whether the output file exists. If the output exists, it compares the last-modification times of the input and output files. The point of this bookkeeping is that files that have already been normalized aren't renormalized on every run.

  • If an input file doesn't exist, the program should abort
  • If an input file is newer than the output file, the output should be normalized
  • If an input file is older than the output file, the other inputs should be checked. If all inputs are older, the normalization is skipped

Each state is given a value (0, 1, and 2 respectively), with smaller values marking more important states. This means we can take the minimum of the use values over all inputs to get the final state of the file (abort, normalize, or skip normalization).
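As a sketch of this minimum logic (combineUse is a hypothetical helper, not part of the class; shouldAdd is documented below):

  #include <algorithm>
  #include <string>
  #include <vector>

  int shouldAdd(std::string infile, std::string globalFile);  // documented below

  // use starts at 3, above every real state, so each input can only
  // lower it and the most important state (smallest value) wins.
  int combineUse(const std::vector<std::string>& input, const std::string& output) {
    int use = 3;
    for (const std::string& infile : input)
      use = std::min(use, shouldAdd(infile, output));
    return use;  // 0 -> abort, 1 -> normalize, 2 -> skip normalization
  }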

Constructors, Destructors, & Assignment Operators

  Normer();
  Normer(vector<string>);
  Normer(const Normer& other);
  Normer& operator=(const Normer& rhs);
  ~Normer();

Public Functions

Setters and Getters

void setLumi(double);
   //Set luminosity variable
void setValues(vector<string>);
   //Takes input values and sets information for Plotter (used in Constructor)
void setUse();
   //If the output was not going to be made (i.e. use==2), set it to 1 so the
   //graph is renormalized

Others

  int shouldAdd(string infile, string globalFile);

This function sets the use value, following the convention laid out in the Use variable section above. In short:

  Use Value | State                                                      | Outcome
  ----------+------------------------------------------------------------+-------------------------------
  0         | File doesn't exist                                         | Exit program
  1         | Infile is newer than the outfile, or outfile doesn't exist | Normalize file
  2         | Infile is older than the outfile                           | Already normalized, do nothing
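A sketch of what the convention implies in code (illustrative, assuming getModTime returns 0 for a missing file, as in the helper below):

  #include <string>

  int getModTime(const char* path);  // helper documented below

  int shouldAdd(std::string infile, std::string globalFile) {
    int inTime  = getModTime(infile.c_str());
    int outTime = getModTime(globalFile.c_str());
    if (inTime == 0)                      return 0;  // file doesn't exist: exit
    if (outTime == 0 || inTime > outTime) return 1;  // normalize file
    return 2;                                        // already normalized
  }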


  int getModTime(const char* path);

Helper function for shouldAdd. It takes a file path and returns the time the file was last modified.
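A minimal sketch of such a helper using the POSIX stat call (an assumption; the real implementation may differ):

  #include <sys/stat.h>

  // Last-modification time of a file, or 0 if it doesn't exist.
  int getModTime(const char* path) {
    struct stat attr;
    if (stat(path, &attr) != 0) return 0;
    return static_cast<int>(attr.st_mtime);  // seconds since the epoch
  }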


  void MergeRootfile( TDirectory*);

Main portion of the Normer class. This function takes all of the input files, goes to each histogram in each folder, scales them based on the normalization factor, and adds them together to make the output file. The algorithm is recursive and based on the ROOT hadd function.
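A heavily simplified sketch of the recursion for a single input directory (mergeDir and sf are assumed names; the real function also loops over FileList and adds matching histograms from every input):

  #include "TCollection.h"
  #include "TDirectory.h"
  #include "TH1.h"
  #include "TKey.h"

  // Walk a directory's keys, scale histograms, and recurse into folders.
  void mergeDir(TDirectory* target, TDirectory* source, double sf) {
    TIter next(source->GetListOfKeys());
    while (TKey* key = static_cast<TKey*>(next())) {
      TObject* obj = key->ReadObj();
      if (TH1* hist = dynamic_cast<TH1*>(obj)) {
        hist->Scale(sf);              // apply the normalization factor
        target->cd();
        hist->Write(key->GetName());  // write the scaled copy
      } else if (TDirectory* subdir = dynamic_cast<TDirectory*>(obj)) {
        mergeDir(target->mkdir(subdir->GetName()), subdir, sf);  // recurse
      }
    }
  }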


  void print();

Simple function that prints out the normalization information. If the file was normalized, it prints all of the constituent files that were added to make the output file.

Public Values

vector<string> input;
vector<double> skim;
vector<double> xsec;
vector<double> SF;
string output;
string type="";
double lumi;
TList* FileList;
vector<double> normFactor;
bool isData=false;
int use=3;
  • input is the vector of the input files to be added together
  • skim is a vector that holds skim efficiencies for the respective input files
  • xsec is a vector that holds cross-sections for the respective input files
  • SF is a vector that holds scale factors for the respective input files
  • output is the name of the output root file
  • type is the type of output file this one will be. The options are data, signal, and background (data, sig, bg respectively)
  • lumi is the luminosity used in normalization
  • FileList holds the actual ROOT files corresponding to the names in input
  • normFactor is a vector that has the total number of events in each file (unweighted) which is used for normalization
  • isData is a convenience variable for quickly finding out whether the output is data (so that the files are not normalized)
  • use is a variable that indicates whether the file exists, is usable, and whether the output needs to be normalized. For more info, see the Use variable section above