Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rivet ParticleLevelProducer and HTXS multi-threading #30888

Merged
merged 9 commits into from
Sep 1, 2020
Original file line number Diff line number Diff line change
Expand Up @@ -37,11 +37,13 @@ class ParticleLevelProducer : public edm::one::EDProducer<edm::one::SharedResour
}

const edm::EDGetTokenT<edm::HepMCProduct> srcToken_;
const edm::ParameterSet pset_;

reco::Particle::Point genVertex_;

Rivet::RivetAnalysis* rivetAnalysis_;
Rivet::AnalysisHandler analysisHandler_;
Rivet::RivetAnalysis* rivetAnalysis_ = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better if these were std::unique_ptr

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Chris, thank you for your suggestions but I get

/build/mseidel/CMSSW_11_2_0_pre1/src/GeneratorInterface/RivetInterface/plugins/ParticleLevelProducer.cc:124:49: error: no matching function for call to 'Rivet::AnalysisHandler::addAnalysis(std::unique_ptr<Rivet::RivetAnalysis>&)'
     analysisHandler_->addAnalysis(rivetAnalysis_);
                                                 ^

std::unique_ptr<Rivet::AnalysisHandler> analysisHandler_;

};

#endif
19 changes: 7 additions & 12 deletions GeneratorInterface/RivetInterface/plugins/HTXSRivetProducer.cc
Original file line number Diff line number Diff line change
Expand Up @@ -33,10 +33,6 @@ class HTXSRivetProducer : public edm::one::EDProducer<edm::one::WatchRuns, edm::
: _hepmcCollection(consumes<HepMCProduct>(cfg.getParameter<edm::InputTag>("HepMCCollection"))),
_lheRunInfo(consumes<LHERunInfoProduct, edm::InRun>(cfg.getParameter<edm::InputTag>("LHERunInfo"))) {
usesResource("Rivet");

_HTXS = new Rivet::HiggsTemplateCrossSections();

_isFirstEvent = true;
_prodMode = cfg.getParameter<string>("ProductionMode");
m_HiggsProdMode = HTXS::UNKNOWN;

Expand All @@ -52,10 +48,9 @@ class HTXSRivetProducer : public edm::one::EDProducer<edm::one::WatchRuns, edm::
edm::EDGetTokenT<edm::HepMCProduct> _hepmcCollection;
edm::EDGetTokenT<LHERunInfoProduct> _lheRunInfo;

Rivet::AnalysisHandler _analysisHandler;
std::unique_ptr<Rivet::AnalysisHandler> _analysisHandler;
Rivet::HiggsTemplateCrossSections* _HTXS;

bool _isFirstEvent;
std::string _prodMode;
HTXS::HiggsProdMode m_HiggsProdMode;

Expand Down Expand Up @@ -108,13 +103,14 @@ void HTXSRivetProducer::produce(edm::Event& iEvent, const edm::EventSetup&) {
} else if (nBs == 2 && nHs == 1 && nZs == 0) {
m_HiggsProdMode = HTXS::BBH;
}

_HTXS->setHiggsProdMode(m_HiggsProdMode);
}
}

if (_isFirstEvent) {
_analysisHandler.addAnalysis(_HTXS);
if (!_HTXS || !_HTXS->hasProjection("FS")) {
delete _HTXS;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be deleted. It will be deleted when _analysisHandler is re-assigned

_analysisHandler = std::unique_ptr<Rivet::AnalysisHandler>(new Rivet::AnalysisHandler());
_HTXS = new Rivet::HiggsTemplateCrossSections();
_analysisHandler->addAnalysis(_HTXS);

// set the production mode if not done already
if (_prodMode == "GGF")
Expand Down Expand Up @@ -152,8 +148,7 @@ void HTXSRivetProducer::produce(edm::Event& iEvent, const edm::EventSetup&) {
}

// initialize rivet analysis
_analysisHandler.init(*myGenEvent);
_isFirstEvent = false;
_analysisHandler->init(*myGenEvent);
}

// classify the event
Expand Down
18 changes: 12 additions & 6 deletions GeneratorInterface/RivetInterface/plugins/ParticleLevelProducer.cc
Original file line number Diff line number Diff line change
Expand Up @@ -18,9 +18,8 @@ using namespace Rivet;

ParticleLevelProducer::ParticleLevelProducer(const edm::ParameterSet& pset)
: srcToken_(consumes<edm::HepMCProduct>(pset.getParameter<edm::InputTag>("src"))),
rivetAnalysis_(new Rivet::RivetAnalysis(pset)) {
pset_(pset) {
usesResource("Rivet");

genVertex_ = reco::Particle::Point(0, 0, 0);

produces<reco::GenParticleCollection>("neutrinos");
Expand All @@ -31,9 +30,6 @@ ParticleLevelProducer::ParticleLevelProducer(const edm::ParameterSet& pset)
produces<reco::GenParticleCollection>("consts");
produces<reco::GenParticleCollection>("tags");
produces<reco::METCollection>("mets");

analysisHandler_.setIgnoreBeams(true);
analysisHandler_.addAnalysis(rivetAnalysis_);
}

void ParticleLevelProducer::addGenJet(Rivet::Jet jet,
Expand Down Expand Up @@ -116,7 +112,17 @@ void ParticleLevelProducer::produce(edm::Event& event, const edm::EventSetup& ev
event.getByToken(srcToken_, srcHandle);

const HepMC::GenEvent* genEvent = srcHandle->GetEvent();
analysisHandler_.analyze(*genEvent);

if (!rivetAnalysis_ || !rivetAnalysis_->hasProjection("FS")) {
delete rivetAnalysis_;
rivetAnalysis_ = new Rivet::RivetAnalysis(pset_);
analysisHandler_ = std::unique_ptr<Rivet::AnalysisHandler>(new Rivet::AnalysisHandler());

analysisHandler_->setIgnoreBeams(true);
analysisHandler_->addAnalysis(rivetAnalysis_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the ProjectionHandler::getInstance() is called in the constructor of ProjectionApplier, and Analysis inherits from ProjectionApplier, and our RivetAnalysis inherits from Analysis.

Does anything inside AnalysisHandler object (RivetAnalysis, Analysis, ProjectionApplier, or any helper class they use) rely that exactly the same ProjectionHandler object is used every time?

Or actually conversely, can it be "proven" that for a given AnalysisHandler object, it does not matter which ProjectionHandler object it uses (i.e. they could be cached in arbitrary way) as long as the same one is used for the whole duration of the AnalysisHandle::analyze() call? I.e. essentially the ProjectionApplier could be changed to replace ProjectionHandler member with always calling ProjectionHandler::getInstance()? (I'm not actually suggesting that, it's just an attempt to help to think about consequences)

I'm asking because we do not guarantee an instance of a edm::stream module to be run on any specific thread. A module instance could run on a different thread on each event than on the previous event.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a further thought, I suspect this approach won't work in general with TBB. The approach would probably work if each instance of the edm::stream module would process the first event in a different thread wrt others because of this assignment in the constructor of ProjectionApplier
https://gitlab.com/hepcedar/rivet/-/blob/release-3-1-x/src/Core/ProjectionApplier.cc#L13

But we don't guarantee that. If two instances process the first event on the same thread, their RivetAnalysis objects will hold a reference to the same ProjectionHandler.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Matti,

I think the ProjectionHandler objects are in principle interchangeable, they should just not be used at the same time.
If you have a patch to Rivet in mind I could try it out.

But we don't guarantee that. If two instances process the first event on the same thread, their RivetAnalysis objects will hold a reference to the same ProjectionHandler.

Is there any way to provoke that behavior for testing?

I modified this PR to re-init when the standard "FS" projection is not found. But I am not sure if that would solve the problem that you described.

Cheers,
Markus

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Markus,

I think the ProjectionHandler objects are in principle interchangeable, they should just not be used at the same time.

That's good.

If you have a patch to Rivet in mind I could try it out.

I don't, but to me the solution the authors outlined in https://gitlab.com/hepcedar/rivet/-/issues/84, i.e. each AnalysisHandler having its own ProjectionHandler object would be the best.

We don't need AnalysisHandler to be thread-safe by itself, avoiding any global state (which now occurs through ProjectionHandler::getInstance() for it to be thread-friendly would be sufficient.

Is there any way to provoke that behavior for testing?

With a minimal configuration of just the ParticleLevelProducer and whatever modules it requires it could be possible to hack along

  • create a dummy module that is fast on EDM stream 0, and for other streams spins in CPU for a long time (e.g. seconds to tens of seconds)
  • put that module and the ParticleLevelProducer in a Path in the order of cms.Path(spinningModule, particleLevelProducer)
  • if ParticleLevelProducer requires any input, modules producing those can be either in a Task or in the same Path before spinningModule

Then it could happen that many or all of the ParticleLevelProducer stream instances get run in the same thread (because spinningModule is keeping the other threads busy).

I modified this PR to re-init when the standard "FS" projection is not found. But I am not sure if that would solve the problem that you described.

I suspect it will not solve that particular problem. Or at least I'd assume if two RivetAnalysis's are initialized with the same ProjectionHandler, the hasProjection("FS") would return true for both of them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Matti,

I think the calls to getProjection (when a projection is applied) and hasProjection follow the same structure, see below.
Thus, the hasProjection should return false when the ProjectionHandler for a specific thread was not initialized (or initialized in another thread). In that case, the re-initialization should fix the problem.

Best,
Markus

This is the part of the code throwing the error:

  const Projection& ProjectionHandler::getProjection(const ProjectionApplier& parent, const string& name) const {
    MSG_TRACE("Searching for child projection '" << name << "' of " << &parent);
    NamedProjsMap::const_iterator nps = _namedprojs.find(&parent);
    if (nps == _namedprojs.end()) {
      std::ostringstream msg;
      msg << "No projections registered for parent " << &parent;
      throw Error(msg.str());
    }

called by ProjectionApplier/Analysis here:

    /// Get the named projection, specifying return type via a template argument.
    /// @todo Add SFINAE to require that PROJ inherit from Projection
    template <typename PROJ>
    const PROJ& getProjection(const std::string& name) const {
      const Projection& p = getProjHandler().getProjection(*this, name);
      return pcast<PROJ>(p);
    }

The hasProjection goes through very similar code:

  bool ProjectionHandler::hasProjection(const ProjectionApplier& parent, const string& name) const {
    MSG_TRACE("Searching for child projection '" << name << "' of " << &parent);
    NamedProjsMap::const_iterator nps = _namedprojs.find(&parent);
    if (nps == _namedprojs.end()) return false;
    NamedProjs::const_iterator np = nps->second.find(name);
    return !(np == nps->second.end());
  }

called by this function in ProjectionApplier/Analysis:

    /// Does this applier have a projection registered under the name @a name?
    bool hasProjection(const std::string& name) const {
      return getProjHandler().hasProjection(*this, name);
    }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the behavior breaks already with ProjectionApplier::getProjHandle()

    /// Get a reference to the ProjectionHandler for this thread.
    ProjectionHandler& getProjHandler() const {
      return _projhandler;
    }

where _projhandler is initialized in the constructor

  ProjectionApplier::ProjectionApplier()
    : _allowProjReg(true), _owned(false),
      _projhandler(ProjectionHandler::getInstance())
  {  }

Therefore, if two ProjectionApplier objects are created in the same thread, they get the same ProjectionHandler object. If these ProjectionApplier objects are later used in different threads, their getProjHandler() continue to return a reference to the same (shared) ProjectionHandler object, and the ProjectionHandler::hasProjection() will give the same answer to both (unless its internal state has already broken because of the data races).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, and if we patch Rivet to use gettimeofday microseconds instead of the thread id? That should give as a unique ProjectionHandler for each stream.

    /// Singleton creation function
    static std::mutex mtx;
    static ProjectionHandler& getInstance() {
      // static ProjectionHandler _instance;
      // return _instance;
      std::unique_lock<std::mutex> lock(mtx);
      static map<std::thread::id,ProjectionHandler> _instances;
      return _instances[std::this_thread::get_id()];
      
    }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would much rather see the "proper fix" suggested in https://gitlab.com/hepcedar/rivet/-/issues/84#note_341649903, i.e. each ProjectionApplier object just having its own PojectionHandler object. It is simpler and works correctly. If we go to patch Rivet, we could even contribute it to upstream and help to close that issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's see if that can be done, I hope there is no major roadblock for that.

Otherwise, we may just make this a one module again.

}

analysisHandler_->analyze(*genEvent);

// Convert into edm objects
// Prompt neutrinos
Expand Down