-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce SONIC core in CMSSW #29141
Changes from 7 commits
b31a330
6a8b17d
912d215
19f2752
534df5b
3d0ff72
f03f117
c67d250
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
<use name="FWCore/Framework"/> | ||
<use name="FWCore/Concurrency"/> | ||
<use name="FWCore/MessageLogger"/> | ||
<use name="FWCore/ParameterSet"/> | ||
<export> | ||
<lib name="1"/> | ||
</export> | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,103 @@ | ||
# SONIC core infrastructure | ||
|
||
SONIC: Services for Optimized Network Inference on Coprocessors | ||
|
||
## For analyzers | ||
|
||
The `SonicEDProducer` class template extends the basic Stream producer module in CMSSW. | ||
|
||
To implement a concrete derived producer class, the following skeleton can be used: | ||
```cpp | ||
#include "HeterogeneousCore/SonicCore/interface/SonicEDProducer.h" | ||
#include "FWCore/Framework/interface/MakerMacros.h" | ||
|
||
class MyProducer : public SonicEDProducer<Client> | ||
{ | ||
public: | ||
explicit MyProducer(edm::ParameterSet const& cfg) : SonicEDProducer<Client>(cfg) { | ||
//for debugging | ||
setDebugName("MyProducer"); | ||
} | ||
void acquire(edm::Event const& iEvent, edm::EventSetup const& iSetup, Input& iInput) override { | ||
//convert event data to client input format | ||
} | ||
void produce(edm::Event& iEvent, edm::EventSetup const& iSetup, Output const& iOutput) override { | ||
//convert client output to event data format | ||
} | ||
static void fillDescriptions(edm::ConfigurationDescriptions & descriptions) { | ||
edm::ParameterSetDescription desc; | ||
Client::fillPSetDescription(desc); | ||
//add producer-specific parameters | ||
descriptions.add("MyProducer",desc); | ||
} | ||
}; | ||
|
||
DEFINE_FWK_MODULE(MyProducer); | ||
``` | ||
|
||
The generic `Client` must be replaced with a concrete client (see next section), which has specific input and output types. | ||
|
||
The python configuration for the producer should include a dedicated `PSet` for the client parameters: | ||
```python | ||
process.MyProducer = cms.EDProducer("MyProducer", | ||
Client = cms.PSet( | ||
# necessary client options go here | ||
) | ||
) | ||
``` | ||
These parameters can be prepopulated and validated by the client using `fillDescriptions` (see below). | ||
|
||
An example producer can be found in the [test](./test) folder. | ||
|
||
## For developers | ||
|
||
To add a new communication protocol for SONIC, follow these steps: | ||
1. Submit the communication protocol software and any new dependencies to [cmsdist](https://github.com/cms-sw/cmsdist) as externals | ||
2. Set up the concrete client(s) that use the communication protocol in a new package in the `HeterogeneousCore` subsystem | ||
3. Add a test producer (see above) to make sure it works | ||
|
||
To implement a concrete client, the following skeleton can be used for the `.h` file, with the function implementations in an associated `.cc` file: | ||
```cpp | ||
#ifndef HeterogeneousCore_MyPackage_MyClient | ||
#define HeterogeneousCore_MyPackage_MyClient | ||
|
||
#include "FWCore/ParameterSet/interface/ParameterSet.h" | ||
#include "HeterogeneousCore/SonicCore/interface/SonicClient*.h" | ||
|
||
class MyClient : public SonicClient*<Input,Output> { | ||
public: | ||
MyClient(const edm::ParameterSet& params); | ||
|
||
static void fillPSetDescription(edm::ParameterSetDescription& iDesc); | ||
|
||
protected: | ||
void evaluate() override; | ||
}; | ||
|
||
#endif | ||
``` | ||
|
||
The generic `SonicClient*` should be replaced with one of the available modes: | ||
* `SonicClientSync`: synchronous call, blocks until the result is returned. | ||
* `SonicClientAsync`: asynchronous, non-blocking call. | ||
* `SonicClientPseudoAsync`: turns a synchronous, blocking call into an asynchronous, non-blocking call, by waiting for the result in a separate `std::thread`. | ||
|
||
`SonicClientAsync` is the most efficient, but can only be used if asynchronous, non-blocking calls are supported by the communication protocol in use. | ||
|
||
In addition, as indicated, the input and output data types must be specified. | ||
(If both types are the same, only the input type needs to be specified.) | ||
|
||
In all cases, the implementation of `evaluate()` must call `finish()`. | ||
For the `Sync` and `PseudoAsync` modes, `finish()` should be called at the end of `evaluate()`. | ||
For the `Async` mode, `finish()` should be called inside the communication protocol callback function (implementations may vary). | ||
|
||
The client must also provide a static method `fillPSetDescription` to populate its parameters in the `fillDescriptions` for the producers that use the client: | ||
```cpp | ||
void MyClient::fillPSetDescription(edm::ParameterSetDescription& iDesc) { | ||
edm::ParameterSetDescription descClient; | ||
//add parameters | ||
iDesc.add<edm::ParameterSetDescription>("Client",descClient); | ||
} | ||
``` | ||
|
||
Example client code can be found in the `interface` and `src` directories of the other Sonic packages in this repository. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
#ifndef HeterogeneousCore_SonicCore_SonicClientAsync | ||
#define HeterogeneousCore_SonicCore_SonicClientAsync | ||
|
||
#include "FWCore/Concurrency/interface/WaitingTaskWithArenaHolder.h" | ||
|
||
#include "HeterogeneousCore/SonicCore/interface/SonicClientBase.h" | ||
#include "HeterogeneousCore/SonicCore/interface/SonicClientTypes.h" | ||
|
||
template <typename InputT, typename OutputT = InputT> | ||
class SonicClientAsync : public SonicClientBase, public SonicClientTypes<InputT, OutputT> { | ||
public: | ||
virtual ~SonicClientAsync() {} | ||
|
||
//main operation | ||
void dispatch(edm::WaitingTaskWithArenaHolder holder) override final { | ||
holder_ = std::move(holder); | ||
setStartTime(); | ||
evaluate(); | ||
} | ||
}; | ||
|
||
#endif |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
#ifndef HeterogeneousCore_SonicCore_SonicClientBase | ||
#define HeterogeneousCore_SonicCore_SonicClientBase | ||
|
||
#include "FWCore/Concurrency/interface/WaitingTaskWithArenaHolder.h" | ||
#include "FWCore/MessageLogger/interface/MessageLogger.h" | ||
|
||
#include <string> | ||
#include <chrono> | ||
#include <exception> | ||
|
||
class SonicClientBase { | ||
public: | ||
//destructor | ||
virtual ~SonicClientBase() = default; | ||
|
||
void setDebugName(const std::string& debugName) { debugName_ = debugName; } | ||
|
||
//main operation | ||
virtual void dispatch(edm::WaitingTaskWithArenaHolder holder) = 0; | ||
|
||
protected: | ||
virtual void evaluate() = 0; | ||
|
||
void setStartTime() { | ||
if (debugName_.empty()) | ||
return; | ||
t0_ = std::chrono::high_resolution_clock::now(); | ||
setTime_ = true; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nope, this can be simplified. |
||
} | ||
|
||
void finish(std::exception_ptr eptr = std::exception_ptr{}) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If I understood the logic correctly, one thing that I find confusing (and thus do not like much) is that the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, this is unfortunate but unavoidable. For a real-world example, see:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I understand. What about making the user call This would eliminate the confusion (and the specialisation in the dummy test). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That would be more consistent, but less user-friendly... to some extent I want to denote that true Async is a special case (some protocols don't support it, requires a callback, etc.). I'll think about it a bit more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've implemented this change now, and added a note about it to the README. |
||
if (setTime_) { | ||
auto t1 = std::chrono::high_resolution_clock::now(); | ||
edm::LogInfo(debugName_) << "Client time: " | ||
<< std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0_).count(); | ||
} | ||
holder_.doneWaiting(eptr); | ||
} | ||
|
||
//members | ||
edm::WaitingTaskWithArenaHolder holder_; | ||
|
||
//for logging/debugging | ||
std::string debugName_; | ||
std::chrono::time_point<std::chrono::high_resolution_clock> t0_; | ||
bool setTime_ = false; | ||
}; | ||
|
||
#endif |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,76 @@ | ||||||
#ifndef HeterogeneousCore_SonicCore_SonicClientPseudoAsync | ||||||
#define HeterogeneousCore_SonicCore_SonicClientPseudoAsync | ||||||
|
||||||
#include "FWCore/Concurrency/interface/WaitingTaskWithArenaHolder.h" | ||||||
#include "HeterogeneousCore/SonicCore/interface/SonicClientBase.h" | ||||||
#include "HeterogeneousCore/SonicCore/interface/SonicClientTypes.h" | ||||||
|
||||||
#include <memory> | ||||||
#include <condition_variable> | ||||||
#include <mutex> | ||||||
#include <thread> | ||||||
#include <atomic> | ||||||
#include <exception> | ||||||
|
||||||
//pretend to be async + non-blocking by waiting for blocking calls to return in separate std::thread | ||||||
template <typename InputT, typename OutputT = InputT> | ||||||
class SonicClientPseudoAsync : public SonicClientBase, public SonicClientTypes<InputT, OutputT> { | ||||||
public: | ||||||
//constructor | ||||||
SonicClientPseudoAsync() : SonicClientBase(), SonicClientTypes<InputT, OutputT>(), hasCall_(false), stop_(false) { | ||||||
thread_ = std::make_unique<std::thread>([this]() { waitForNext(); }); | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A thread per module x stream is fine for now, but I created an issue (#29188) for a more general mechanism in case more use cases would surface. |
||||||
} | ||||||
//destructor | ||||||
virtual ~SonicClientPseudoAsync() { | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
? Or does our There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Right, but this PR is passing code checks so |
||||||
{ | ||||||
std::lock_guard<std::mutex> guard(mutex_); | ||||||
stop_ = true; | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. At some point long ago, I was encountering a data race, so I moved every read and write inside a lock. This one shouldn't need it, though, so I'll remove it. |
||||||
} | ||||||
cond_.notify_one(); | ||||||
if (thread_) { | ||||||
thread_->join(); | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This line makes the destructor to possibly throw (practical implication being the process calling an Given that these objects have the life time of EDModules, we are not really interested to stop processing because of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you mean not calling There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I meant try-catch (sorry for being imprecise). |
||||||
thread_.reset(); | ||||||
} | ||||||
} | ||||||
//accessor | ||||||
void dispatch(edm::WaitingTaskWithArenaHolder holder) override final { | ||||||
//do all read/writes inside lock to ensure cache synchronization | ||||||
{ | ||||||
std::lock_guard<std::mutex> guard(mutex_); | ||||||
holder_ = std::move(holder); | ||||||
setStartTime(); | ||||||
|
||||||
//activate thread to wait for response, and return | ||||||
hasCall_ = true; | ||||||
} | ||||||
cond_.notify_one(); | ||||||
} | ||||||
|
||||||
protected: | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
? |
||||||
void waitForNext() { | ||||||
while (true) { | ||||||
//wait for condition | ||||||
{ | ||||||
std::unique_lock<std::mutex> lk(mutex_); | ||||||
cond_.wait(lk, [this]() { return (hasCall_ or stop_); }); | ||||||
if (stop_) | ||||||
break; | ||||||
|
||||||
//do everything inside lock | ||||||
evaluate(); | ||||||
|
||||||
//reset condition | ||||||
hasCall_ = false; | ||||||
} | ||||||
} | ||||||
} | ||||||
|
||||||
//members | ||||||
bool hasCall_; | ||||||
std::mutex mutex_; | ||||||
std::condition_variable cond_; | ||||||
std::atomic<bool> stop_; | ||||||
std::unique_ptr<std::thread> thread_; | ||||||
}; | ||||||
|
||||||
#endif |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,25 @@ | ||||||
#ifndef HeterogeneousCore_SonicCore_SonicClientSync | ||||||
#define HeterogeneousCore_SonicCore_SonicClientSync | ||||||
|
||||||
#include "FWCore/Concurrency/interface/WaitingTaskWithArenaHolder.h" | ||||||
|
||||||
#include "HeterogeneousCore/SonicCore/interface/SonicClientBase.h" | ||||||
#include "HeterogeneousCore/SonicCore/interface/SonicClientTypes.h" | ||||||
|
||||||
#include <exception> | ||||||
|
||||||
template <typename InputT, typename OutputT = InputT> | ||||||
class SonicClientSync : public SonicClientBase, public SonicClientTypes<InputT, OutputT> { | ||||||
public: | ||||||
virtual ~SonicClientSync() {} | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
or just remove? |
||||||
|
||||||
//main operation | ||||||
void dispatch(edm::WaitingTaskWithArenaHolder holder) override final { | ||||||
holder_ = std::move(holder); | ||||||
setStartTime(); | ||||||
|
||||||
evaluate(); | ||||||
} | ||||||
}; | ||||||
|
||||||
#endif |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
#ifndef HeterogeneousCore_SonicCore_SonicClientTypes | ||
#define HeterogeneousCore_SonicCore_SonicClientTypes | ||
|
||
#include "FWCore/ParameterSet/interface/ParameterSet.h" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Strictly speaking this |
||
|
||
//this base class exists to limit the impact of dependent scope in derived classes | ||
template <typename InputT, typename OutputT = InputT> | ||
class SonicClientTypes { | ||
public: | ||
//typedefs for outside accessibility | ||
typedef InputT Input; | ||
typedef OutputT Output; | ||
//destructor | ||
virtual ~SonicClientTypes() = default; | ||
|
||
//accessors | ||
Input& input() { return input_; } | ||
const Output& output() const { return output_; } | ||
|
||
protected: | ||
Input input_; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I understand that these cannot be pointers because this class acts kind of like a buffer between the possibly asynchronous back-end, and the user code that produces the input (and could go out of scope before it is sent to the backend) and consumes the output (that it may not have allocated space for yet). Is it correct ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, that's correct. With the interface of |
||
Output output_; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just to note that the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In the future, we can consider if something like |
||
}; | ||
|
||
#endif |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
#ifndef HeterogeneousCore_SonicCore_SonicEDProducer | ||
#define HeterogeneousCore_SonicCore_SonicEDProducer | ||
|
||
#include "FWCore/Framework/interface/Event.h" | ||
#include "FWCore/Framework/interface/stream/EDProducer.h" | ||
#include "FWCore/ParameterSet/interface/ParameterSet.h" | ||
#include "FWCore/Framework/interface/EventSetup.h" | ||
#include "FWCore/Concurrency/interface/WaitingTaskWithArenaHolder.h" | ||
#include "FWCore/MessageLogger/interface/MessageLogger.h" | ||
|
||
#include <string> | ||
#include <chrono> | ||
|
||
//this is a stream producer because client operations are not multithread-safe in general | ||
//it is designed such that the user never has to interact with the client or the acquire() callback directly | ||
template <typename Client, typename... Capabilities> | ||
class SonicEDProducer : public edm::stream::EDProducer<edm::ExternalWork, Capabilities...> { | ||
public: | ||
//typedefs to simplify usage | ||
typedef typename Client::Input Input; | ||
typedef typename Client::Output Output; | ||
//constructor | ||
SonicEDProducer(edm::ParameterSet const& cfg) : client_(cfg.getParameter<edm::ParameterSet>("Client")) {} | ||
//destructor | ||
virtual ~SonicEDProducer() = default; | ||
|
||
//derived classes use a dedicated acquire() interface that incorporates client_.input() | ||
//(no need to interact with callback holder) | ||
void acquire(edm::Event const& iEvent, | ||
edm::EventSetup const& iSetup, | ||
edm::WaitingTaskWithArenaHolder holder) override final { | ||
auto t0 = std::chrono::high_resolution_clock::now(); | ||
acquire(iEvent, iSetup, client_.input()); | ||
auto t1 = std::chrono::high_resolution_clock::now(); | ||
if (!debugName_.empty()) | ||
edm::LogInfo(debugName_) << "Load time: " | ||
<< std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count(); | ||
client_.dispatch(holder); | ||
} | ||
virtual void acquire(edm::Event const& iEvent, edm::EventSetup const& iSetup, Input& iInput) = 0; | ||
//derived classes use a dedicated produce() interface that incorporates client_.output() | ||
void produce(edm::Event& iEvent, edm::EventSetup const& iSetup) override final { | ||
//todo: measure time between acquire and produce | ||
produce(iEvent, iSetup, client_.output()); | ||
} | ||
virtual void produce(edm::Event& iEvent, edm::EventSetup const& iSetup, Output const& iOutput) = 0; | ||
|
||
protected: | ||
//for debugging | ||
void setDebugName(const std::string& debugName) { | ||
debugName_ = debugName; | ||
client_.setDebugName(debugName); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The same string is stored twice (ok, they may be COW). Maybe the |
||
} | ||
//members | ||
Client client_; | ||
std::string debugName_; | ||
}; | ||
|
||
#endif |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
<environment> | ||
<bin file="TestIntegration.cpp" name="TestIntegrationSonic"> | ||
<flags TEST_RUNNER_ARGS=" /bin/bash HeterogeneousCore/SonicCore/test run_sonic_test.sh"/> | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The test script can also be invoked with something along <test name="TestHeterogeneousCoreSonicCore" command="run_sonic_test.sh"/> IIUC in that case the script is executed from the Continuing with |
||
<use name="FWCore/Utilities"/> | ||
</bin> | ||
<library file="SonicDummyProducer.cc"> | ||
<flags EDM_PLUGIN="1"/> | ||
<use name="HeterogeneousCore/SonicCore"/> | ||
<use name="DataFormats/TestObjects"/> | ||
</library> | ||
</environment> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or just leave out because both base classes declare virtual destructors?