Refactoring gen weight storage in EDM + Nano integration #32167
Closed
25 commits
90a9105 First implementation of WeightGroupInfo/product classes (kdlong)
3562b96 Working on parsing integration with scale/pdf weights (kdlong)
cc3ee9b New WeightGroups, improved parsing with helper class (kdlong)
6a3b0b5 updates to allow saving weight sums from all weight categories (sroychow)
1f632fa Add more error handling
56c56a2 adding altset index table (sroychow)
b6e854c Use cms::Exception, configure debugging, code formatting (kdlong)
b7e851f attempt to delete extra kets after the last </weightgroup> (SanghyunKo)
b6963fa Step forward to newer cmssw version (kdlong)
81e18ba Allow gen products to run at GEN step or Nano (kdlong)
0aae867 Support ignoregroups in nano, fixes for unassociated weights (kdlong)
ab4cae0 Convert OwnVector to unique_ptr (kdlong)
a880aa2 Simplify nano producer and weight parsing (kdlong)
27811a9 Make producers edm::Global (kdlong)
1751101 Code format, don't fail for missing LHEEventProduct (kdlong)
cecf5c6 Update nano and nanogen configs (kdlong)
0fef54c Fix scale weights in case of < 9 entries (kdlong)
7d6888f Code format (kdlong)
eab7023 Hopefully fixing test errors (kdlong)
22f79fe Attempt to fit cosmics workflow error (kdlong)
562bac1 mask genweight addition in procmodifier (sroychow)
11c68cf macro to permit catch statement with LHAPDF (kdlong)
5dee41c Fix event content for gen weights in AOD (kdlong)
61d39f0 Fix nanogen config (kdlong)
c29cb3a Remove extraneous code in MEParamWeightGroup (kdlong)
3 changes: 3 additions & 0 deletions
Configuration/ProcessModifiers/python/genWeightAddition_cff.py
@@ -0,0 +1,3 @@
import FWCore.ParameterSet.Config as cms

genWeightAddition = cms.Modifier()
@@ -0,0 +1,27 @@
#ifndef GeneratorInterface_Core_GenWeightHelper_h
#define GeneratorInterface_Core_GenWeightHelper_h

#include <tinyxml2.h>

#include <fstream>
#include <map>
#include <regex>
#include <string>
#include <vector>

#include "GeneratorInterface/Core/interface/WeightHelper.h"
#include "SimDataFormats/GeneratorProducts/interface/GenLumiInfoProduct.h"
#include "SimDataFormats/GeneratorProducts/interface/PartonShowerWeightGroupInfo.h"
#include "SimDataFormats/GeneratorProducts/interface/PdfWeightGroupInfo.h"
#include "SimDataFormats/GeneratorProducts/interface/ScaleWeightGroupInfo.h"

namespace gen {
  class GenWeightHelper : public WeightHelper {
  public:
    GenWeightHelper();
    std::vector<std::unique_ptr<gen::WeightGroupInfo>> parseWeightGroupsFromNames(std::vector<std::string> weightNames,
                                                                                  bool addUnassociatedGroup) const;
  };
}  // namespace gen

#endif
@@ -0,0 +1,53 @@
#ifndef GeneratorInterface_Core_LHEWeightHelper_h
#define GeneratorInterface_Core_LHEWeightHelper_h

#include <tinyxml2.h>

#include <fstream>
#include <map>
#include <regex>
#include <string>
#include <vector>

#include "GeneratorInterface/Core/interface/WeightHelper.h"
#include "SimDataFormats/GeneratorProducts/interface/LHERunInfoProduct.h"
#include "SimDataFormats/GeneratorProducts/interface/MEParamWeightGroupInfo.h"
#include "SimDataFormats/GeneratorProducts/interface/PdfWeightGroupInfo.h"
#include "SimDataFormats/GeneratorProducts/interface/ScaleWeightGroupInfo.h"
#include "SimDataFormats/GeneratorProducts/interface/UnknownWeightGroupInfo.h"

namespace gen {
  class LHEWeightHelper : public WeightHelper {
  public:
    LHEWeightHelper() : WeightHelper(){};

    enum class ErrorType { Empty, SwapHeader, HTMLStyle, NoWeightGroup, TrailingStr, Unknown, NoError };
    const std::unordered_map<ErrorType, std::string> errorTypeAsString_ = {
        {ErrorType::Empty, "Empty header"},
        {ErrorType::SwapHeader, "Header info out of order"},
        {ErrorType::HTMLStyle, "Header is invalid HTML"},
        {ErrorType::TrailingStr, "Header has extraneous info"},
        {ErrorType::Unknown, "Unregonized error"},
        {ErrorType::NoError, "No error here!"}};

    std::vector<std::unique_ptr<gen::WeightGroupInfo>> parseWeights(std::vector<std::string> headerLines,
                                                                    bool addUnassociated) const;
    bool isConsistent(const std::string& fullHeader) const;
    void swapHeaders(std::vector<std::string>& headerLines) const;
    void setFailIfInvalidXML(bool value) { failIfInvalidXML_ = value; }
    bool failIfInvalidXML() const { return failIfInvalidXML_; }

  private:
    std::string weightgroupKet_ = "</weightgroup>";
    std::string weightTag_ = "</weight>";
    bool failIfInvalidXML_ = false;
    std::string parseGroupName(tinyxml2::XMLElement* el) const;
    ParsedWeight parseWeight(tinyxml2::XMLElement* inner, std::string groupName, int groupIndex, int& weightIndex) const;
    bool validateAndFixHeader(std::vector<std::string>& headerLines, tinyxml2::XMLDocument& xmlDoc) const;
    tinyxml2::XMLError tryReplaceHtmlStyle(tinyxml2::XMLDocument& xmlDoc, std::string& fullHeader) const;
    tinyxml2::XMLError tryRemoveTrailings(tinyxml2::XMLDocument& xmlDoc, std::string& fullHeader) const;
    ErrorType findErrorType(int xmlError, const std::string& headerLines) const;
  };
}  // namespace gen

#endif
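The `errorTypeAsString_` table in the LHEWeightHelper header has no entry for `ErrorType::NoWeightGroup`, so a lookup with `at()` would throw for that enumerator. A minimal sketch of a lookup that falls back to a generic message is given here; the free function `describeError` and the fallback text are assumptions for illustration and are not part of the PR.

```cpp
#include <string>
#include <unordered_map>

// Same enumerators as LHEWeightHelper::ErrorType in the header above.
enum class ErrorType { Empty, SwapHeader, HTMLStyle, NoWeightGroup, TrailingStr, Unknown, NoError };

// Hypothetical helper (not in the PR): returns the description for an error
// type, or a generic message when the table has no entry for it.
inline std::string describeError(ErrorType type) {
  static const std::unordered_map<ErrorType, std::string> table = {
      {ErrorType::Empty, "Empty header"},
      {ErrorType::SwapHeader, "Header info out of order"},
      {ErrorType::HTMLStyle, "Header is invalid HTML"},
      {ErrorType::TrailingStr, "Header has extraneous info"},
      {ErrorType::Unknown, "Unrecognized error"},
      {ErrorType::NoError, "No error here!"}};
  // find() avoids the std::out_of_range that at() would throw for a missing key.
  const auto it = table.find(type);
  return it != table.end() ? it->second : "Unrecognized error";
}
```

Using `find` keeps the lookup safe if new enumerators are added without a matching table entry; the alternative is simply to add the missing `NoWeightGroup` entry to the map.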
@@ -0,0 +1,152 @@
#ifndef GeneratorInterface_LHEInterface_WeightHelper_h
#define GeneratorInterface_LHEInterface_WeightHelper_h

#include <bits/stdc++.h>

#include <boost/algorithm/string.hpp>
#include <fstream>
#include <memory>

#include "LHAPDF/LHAPDF.h"
#include "SimDataFormats/GeneratorProducts/interface/GenWeightInfoProduct.h"
#include "SimDataFormats/GeneratorProducts/interface/GenWeightProduct.h"
#include "SimDataFormats/GeneratorProducts/interface/MEParamWeightGroupInfo.h"
#include "SimDataFormats/GeneratorProducts/interface/PartonShowerWeightGroupInfo.h"
#include "SimDataFormats/GeneratorProducts/interface/PdfWeightGroupInfo.h"
#include "SimDataFormats/GeneratorProducts/interface/ScaleWeightGroupInfo.h"
#include "SimDataFormats/GeneratorProducts/interface/UnknownWeightGroupInfo.h"
#include "SimDataFormats/GeneratorProducts/interface/WeightGroupInfo.h"
#include "SimDataFormats/GeneratorProducts/interface/WeightsInfo.h"

namespace gen {
  struct ParsedWeight {
    std::string id;
    int index;
    std::string groupname;
    std::string content;
    std::unordered_map<std::string, std::string> attributes;
    int wgtGroup_idx;
  };

  class WeightHelper {
  public:
    WeightHelper();

    template <typename T>
    std::unique_ptr<GenWeightProduct> weightProduct(const GenWeightInfoProduct& weightsInfo,
                                                    std::vector<T> weights,
                                                    float w0) const;

    void setGuessPSWeightIdx(bool guessPSWeightIdx) {
      PartonShowerWeightGroupInfo::setGuessPSWeightIdx(guessPSWeightIdx);
    }
    void addUnassociatedGroup(std::vector<std::unique_ptr<gen::WeightGroupInfo>>& weightGroups) const {
      gen::UnknownWeightGroupInfo unassoc("unassociated");
      unassoc.setDescription("Weights with missing or invalid header meta data");
      weightGroups.push_back(std::make_unique<gen::UnknownWeightGroupInfo>(unassoc));
    }
    int addWeightToProduct(GenWeightProduct& product, double weight, std::string name, int weightNum, int groupIndex);
    void setDebug(bool value) { debug_ = value; }

  protected:
    bool debug_ = false;
    const unsigned int FIRST_PSWEIGHT_ENTRY = 2;
    const unsigned int DEFAULT_PSWEIGHT_LENGTH = 46;
    std::map<std::string, std::string> currWeightAttributeMap_;
    std::map<std::string, std::string> currGroupAttributeMap_;
    bool isScaleWeightGroup(const ParsedWeight& weight) const;
    bool isMEParamWeightGroup(const ParsedWeight& weight) const;
    bool isPdfWeightGroup(const ParsedWeight& weight) const;
    bool isPartonShowerWeightGroup(const ParsedWeight& weight) const;
    bool isOrphanPdfWeightGroup(ParsedWeight& weight) const;
    void updateScaleInfo(gen::ScaleWeightGroupInfo& scaleGroup, const ParsedWeight& weight) const;
    void updateMEParamInfo(const ParsedWeight& weight, int index) const;
    void updatePdfInfo(gen::PdfWeightGroupInfo& pdfGroup, const ParsedWeight& weight) const;
    void updatePartonShowerInfo(gen::PartonShowerWeightGroupInfo& psGroup, const ParsedWeight& weight) const;
    void cleanupOrphanCentralWeight(WeightGroupInfoContainer& weightGroups) const;
    bool splitPdfWeight(ParsedWeight& weight, WeightGroupInfoContainer& weightGroups) const;

    int lhapdfId(const ParsedWeight& weight, gen::PdfWeightGroupInfo& pdfGroup) const;
    std::string searchAttributes(const std::string& label, const ParsedWeight& weight) const;
    std::string searchAttributesByTag(const std::string& label, const ParsedWeight& weight) const;
    std::string searchAttributesByRegex(const std::string& label, const ParsedWeight& weight) const;

    // Possible names for the same thing
    const std::unordered_map<std::string, std::vector<std::string>> attributeNames_ = {
        {"muf", {"muF", "MUF", "muf", "facscfact"}},
        {"mur", {"muR", "MUR", "mur", "renscfact"}},
        {"pdf", {"PDF", "PDF set", "lhapdf", "pdf", "pdf set", "pdfset"}},
        {"dyn", {"DYN_SCALE"}},
        {"dyn_name", {"dyn_scale_choice"}},
        {"up", {"_up", "Hi"}},
        {"down", {"_dn", "Lo"}},
        {"me_variation", {"mass", "sthw2", "width"}},
    };
    void printWeights(const WeightGroupInfoContainer& weightGroups) const;
    std::unique_ptr<WeightGroupInfo> buildGroup(ParsedWeight& weight) const;
    WeightGroupInfoContainer buildGroups(std::vector<ParsedWeight>& parsedWeights, bool addUnassociatedGroup) const;
    std::string searchString(const std::string& label, const std::string& name) const;
  };

  template <typename T>
  std::unique_ptr<GenWeightProduct> WeightHelper::weightProduct(const GenWeightInfoProduct& weightsInfo,
                                                                std::vector<T> weights,
                                                                float w0) const {
    auto weightProduct = std::make_unique<GenWeightProduct>(w0);
    weightProduct->setNumWeightSets(weightsInfo.numberOfGroups());
    gen::WeightGroupData groupData = {0, nullptr};
    // size=1 happens if there are no PS weights, so the weights vector contains
    // only the central GEN weight. Size = 2 happens when Pythia produces a separate weight for the hadronization
    // In general this can also be handled by the "unassociated" group, but this avoids the requirement
    // that that setting always be true for workflows without the GenLumiInfoProduct (which can reasonably not exist
    // for special GEN workflows)
    if (!weightsInfo.numberOfGroups()) {
      if (weights.size() <= 2)
        return weightProduct;
      else
        throw cms::Exception("WeightHelper")
            << "Found more than 2 weights in the event, but found no weight groups in the header.";
    }

    // This gets remade every event to avoid having state-dependence in the
    // helper class; could think about doing caching instead
    int unassociatedIdx = weightsInfo.unassociatedIdx();
    std::unique_ptr<gen::UnknownWeightGroupInfo> unassociatedGroup;
    if (unassociatedIdx != -1)
      unassociatedGroup = std::make_unique<gen::UnknownWeightGroupInfo>("unassociated");
    int i = 0;
    for (const auto& weight : weights) {
      double wgtval;
      std::string wgtid;
      if constexpr (std::is_same<T, gen::WeightsInfo>::value) {
        wgtid = weight.id;
        wgtval = weight.wgt;
      } else if constexpr (std::is_same<T, double>::value) {
        wgtid = std::to_string(i);
        wgtval = weight;
      }
      try {
        groupData = weightsInfo.containingWeightGroupInfo(i, groupData.index);
      } catch (const cms::Exception& e) {
        // Rethrow the original exception object rather than a copy
        if (unassociatedIdx == -1)
          throw;
        if (debug_) {
          std::cout << "WARNING: " << e.what() << std::endl;
        }
        // Access the unassociated group separately so it can be modified
        unassociatedGroup->addContainedId(i, wgtid, wgtid);
        groupData = {static_cast<size_t>(unassociatedIdx), unassociatedGroup.get()};
      }
      int entry = groupData.group->weightVectorEntry(wgtid, i);

      // TODO: is this too slow?
      if (debug_)
        std::cout << "Adding weight num " << i << " EntryNum " << entry << " to group " << groupData.index << std::endl;
      weightProduct->addWeight(wgtval, groupData.index, entry);
      i++;
    }
    return weightProduct;
  }
}  // namespace gen

#endif
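The `weightProduct` template above accepts either a vector of `gen::WeightsInfo` (which carry their own id) or a vector of plain `double` values (which are labelled by position). A minimal standalone sketch of that compile-time dispatch follows, assuming C++17. The `WeightsInfo` struct here is a simplified stand-in and `labelWeights` is a hypothetical name; neither is the CMSSW implementation.

```cpp
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Simplified stand-in for gen::WeightsInfo: an id string plus a weight value.
struct WeightsInfo {
  std::string id;
  double wgt;
};

// Hypothetical helper (not in the PR): returns (id, value) pairs. Named
// weights keep their own id; bare doubles are labelled with their position.
template <typename T>
std::vector<std::pair<std::string, double>> labelWeights(const std::vector<T>& weights) {
  static_assert(std::is_same_v<T, WeightsInfo> || std::is_same_v<T, double>, "unsupported weight type");
  std::vector<std::pair<std::string, double>> labelled;
  labelled.reserve(weights.size());
  int i = 0;
  for (const auto& weight : weights) {
    // Only the branch matching T is instantiated, so weight.id is never
    // compiled for T = double.
    if constexpr (std::is_same_v<T, WeightsInfo>) {
      labelled.emplace_back(weight.id, weight.wgt);
    } else {
      labelled.emplace_back(std::to_string(i), weight);
    }
    ++i;
  }
  return labelled;
}
```

The `static_assert` turns an unsupported element type into a compile error, so `wgtid` and `wgtval` cannot silently stay unset as they could if neither branch of the original chain matched.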
typo: Unregonized -> Unrecognized