Add {Copy,Move}ToDeviceCache<T> class templates and moveToDeviceAsync function template #43969
cms-bot internal usage
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43969/38872
A new Pull Request was created by @makortel (Matti Kortelainen) for master. It involves the following packages:
@fwyzard, @makortel, @cmsbuild can you please review it and eventually sign? Thanks. cms-bot commands are listed here
enable gpu
@cmsbuild, please test
@fwyzard Any thoughts?
@@ -182,6 +182,24 @@ namespace ALPAKA_ACCELERATOR_NAMESPACE {
output[i] = {x, input[i].y(), input[i].z(), input[i].id(), input[i].flags(), input[i].m()};
}
}

template <typename TAcc, typename = std::enable_if_t<alpaka::isAccelerator<TAcc>>>
ALPAKA_FN_ACC void operator()(TAcc const& acc,
(independently of this PR)
For kernels that are defined in ALPAKA_ACCELERATOR_NAMESPACE, I came to wonder about the necessity or usefulness of defining these (member) functions as templates over TAcc. The kernel itself tends to have some kind of assumption on the dimensions, and the caller uses one of the Acc<N>D explicitly with alpaka::exec(). In this particular case the argument could be directly
- ALPAKA_FN_ACC void operator()(TAcc const& acc,
+ ALPAKA_FN_ACC void operator()(Acc1D const& acc,
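The trade-off raised in this comment can be sketched without Alpaka. In the minimal example below, `Acc1D`, `Acc2D`, `isAcceleratorSketch`, `ScaleGeneric`, `ScaleFixed` and `run_kernel_example` are hypothetical stand-ins introduced only for illustration; they are not the Alpaka or CMSSW types. The sketch assumes the kernel is written for one dimensionality only, and shows that naming the accelerator type makes a mismatched call fail to compile.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for accelerator types (not the Alpaka ones).
struct Acc1D {};
struct Acc2D {};

// Hypothetical stand-in for a trait in the spirit of alpaka::isAccelerator.
template <typename T>
inline constexpr bool isAcceleratorSketch = false;
template <>
inline constexpr bool isAcceleratorSketch<Acc1D> = true;
template <>
inline constexpr bool isAcceleratorSketch<Acc2D> = true;

// Variant A: operator() is a template over the accelerator type.
struct ScaleGeneric {
  template <typename TAcc, typename = std::enable_if_t<isAcceleratorSketch<TAcc>>>
  void operator()(TAcc const&, float const* in, float* out, int n) const {
    for (int i = 0; i < n; ++i) out[i] = 2.f * in[i];
  }
};

// Variant B: operator() names the one accelerator type the kernel assumes.
struct ScaleFixed {
  void operator()(Acc1D const&, float const* in, float* out, int n) const {
    for (int i = 0; i < n; ++i) out[i] = 2.f * in[i];
  }
};

// Variant A accepts any accelerator, including one the kernel was not written for.
static_assert(std::is_invocable_v<ScaleGeneric, Acc1D const&, float const*, float*, int>);
static_assert(std::is_invocable_v<ScaleGeneric, Acc2D const&, float const*, float*, int>);
// Variant B rejects the mismatched accelerator at compile time.
static_assert(std::is_invocable_v<ScaleFixed, Acc1D const&, float const*, float*, int>);
static_assert(!std::is_invocable_v<ScaleFixed, Acc2D const&, float const*, float*, int>);

// Both variants compute the same result when called with the 1D stand-in.
bool run_kernel_example() {
  std::vector<float> in{1.f, 2.f, 3.f};
  std::vector<float> a(in.size()), b(in.size());
  int const n = static_cast<int>(in.size());
  ScaleGeneric{}(Acc1D{}, in.data(), a.data(), n);
  ScaleFixed{}(Acc1D{}, in.data(), b.data(), n);
  return a == b && a[2] == 6.f;
}
```

Whether the stricter signature is worth it in the real code depends on whether any kernel is meant to be reused across dimensionalities, which this sketch does not address.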
+1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-53fbcd/37467/summary.html
Comparison Summary:
GPU Comparison Summary:
Co-authored-by: Andrea Bocci <[email protected]>
f45870a to 25fe7a6
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43969/43065
@cmsbuild, please test
+heterogeneous
This pull request is fully signed and it will be integrated in one of the next master IBs after it passes the integration tests. This pull request will now be reviewed by the release team before it's merged. @antoniovilela, @mandrenguyen, @sextonkennedy, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)
+1
Size: This PR adds an extra 32KB to repository
Comparison Summary:
GPU Comparison Summary:
+1
PR description:
Resolves cms-sw/framework-team#790
CopyToDeviceCache<T>
The primary purpose of this PR is to add the CopyToDeviceCache<T> class template, which allows copying "arbitrary data" to all devices of a backend within an EDProducer (e.g. in a constructor or initializeGlobalCache()). The use case is to avoid (mis)using EventSetup for the purposes of copying EDProducer configuration parameters to the devices, following the discussion in #43130 (comment). The CopyToDeviceCache relies on the specializations of CopyToDevice<T> to deduce the corresponding device-side data type, and to do the actual copy. The copy is done synchronously in the object constructor.
For testing purposes I wanted to use Alpaka buffers, so I added partial specializations of CopyToDevice<T> for 0- and 1-dimensional Alpaka buffers.
moveToDeviceAsync()
CopyToDeviceCache<T> copies the object also for the host backends. In order to get some feeling for what it would mean to avoid that copy, I first explored a similar shortcut for the simpler case of doing it for one device, e.g. in the EDProducer::produce() body (effectively addressing #43796 (comment)). I took the approach of exploring whether the "move concept" could be used (which came up as an idea in some earlier discussion with @fwyzard). This led to the moveToDeviceAsync() function. For host backends it just moves the argument object, and for non-host backends it uses CopyToDevice<T> to deduce the corresponding device-side type, and to copy the data. Also in the non-host backend case the argument object is moved-from, so any use of the argument object in the caller will result in a "use after move" (whatever that behavior then is for the object).
I think the result is not too bad. However, given that Alpaka buffers behave like std::shared_ptr, I decided to require the host-side type to be non-copyable in order to be even remotely sure that the data of the moved-in host object would not be used (mostly written into) in the host code concurrently to the asynchronous copy to the device. I was concerned that otherwise the following kind of mistake would be too easy to make
The function can be easily used with PortableHostCollection<T> and PortableHostObject<T>, as they explicitly disable copying (for similar reasons).
MoveToDeviceCache<T>
MoveToDeviceCache<T> combines the "copy to all devices" aspect of CopyToDeviceCache<T> with the move semantics and "T must be non-copyable" requirement of moveToDeviceAsync().
Given that the only difference between MoveToDeviceCache<T> and CopyToDeviceCache<T> is the behavior of the constructor, an alternative could be to have a single class, where the desired behavior would be selected e.g. with an explicit tag argument. Theoretically one could implement copy and move constructors such that the object is "moved" when possible and otherwise copied, but I thought maybe we'd want to explicitly specify the desired behavior (copy vs move on host) to catch cases where a data type becomes accidentally copyable (leading to a slower code path on host).
Allow contained object of PortableHostObject<T> to be initialized in the constructor
While crafting tests for the earlier cases I came to wonder if it would make sense to allow the contained object of PortableHostObject<T> (i.e. T) to be initialized directly in the PortableHostObject<T> constructor, rather than always having to first create the PortableHostObject<T>, and then fill the content. This could be particularly useful when constructing MoveToDeviceCache<PortableHostObject<T>> in the EDProducer initialization list. I added additional constructors that use placement new, and added a requirement that T must be trivially destructible.
PR validation:
Unit tests run.
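For readers of the moveToDeviceAsync() part of the description, the move-on-host and copy-on-device behavior can be sketched roughly as follows. This is not the code added by the PR: `HostBackend`, `DeviceBackend`, `HostData`, `DeviceData`, `moveToDeviceSketch` and `run_move_example` are hypothetical names, the local `CopyToDevice` template is only a stand-in for the real one, and the copy here is synchronous for simplicity, whereas the real function is described as asynchronous.

```cpp
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical backend tags (the real code distinguishes Alpaka backends).
struct HostBackend {};
struct DeviceBackend {};

// Hypothetical move-only host-side type, in the spirit of PortableHostCollection<T>.
class HostData {
public:
  explicit HostData(std::vector<int> v) : data_(std::make_unique<std::vector<int>>(std::move(v))) {}
  HostData(HostData const&) = delete;
  HostData& operator=(HostData const&) = delete;
  HostData(HostData&&) = default;
  HostData& operator=(HostData&&) = default;

  std::vector<int> const& values() const { return *data_; }
  bool moved_from() const { return data_ == nullptr; }

private:
  std::unique_ptr<std::vector<int>> data_;
};

// Hypothetical device-side type.
struct DeviceData {
  std::vector<int> values;
};

// Stand-in for CopyToDevice<T>: the specialization fixes the device type and does the copy.
template <typename T>
struct CopyToDevice;
template <>
struct CopyToDevice<HostData> {
  static DeviceData copy(HostData const& host) { return DeviceData{host.values()}; }
};

// Sketch: the argument is consumed on every backend; only non-host backends copy.
template <typename TBackend, typename T>
auto moveToDeviceSketch(TBackend, T&& src) {
  static_assert(!std::is_lvalue_reference_v<T>, "pass an rvalue, e.g. std::move(obj)");
  static_assert(!std::is_copy_constructible_v<T> && !std::is_copy_assignable_v<T>,
                "host-side type must be non-copyable");
  T owned = std::move(src);  // the caller's object is now moved-from
  if constexpr (std::is_same_v<TBackend, HostBackend>) {
    return owned;  // host backend: hand the same object back, no copy
  } else {
    return CopyToDevice<T>::copy(owned);  // synchronous here, unlike the real function
  }
}

bool run_move_example() {
  HostData a(std::vector<int>{1, 2, 3});
  HostData onHost = moveToDeviceSketch(HostBackend{}, std::move(a));
  HostData b(std::vector<int>{4, 5});
  DeviceData onDevice = moveToDeviceSketch(DeviceBackend{}, std::move(b));
  return a.moved_from() && b.moved_from() && onHost.values().size() == 3 &&
         onDevice.values == std::vector<int>{4, 5};
}
```

Because the host type is move-only, no second handle to the same data can remain in the caller after the call, which is the property the non-copyable requirement is meant to give.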
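The last item of the description (initializing the contained object in the constructor with placement new, and requiring T to be trivially destructible) could look roughly like the sketch below. It is an assumption-laden illustration, not the PortableHostObject<T> implementation: `HostObjectSketch`, `Params`, `ProducerSketch` and `run_placement_example` are hypothetical names, plain heap memory replaces the pinned host buffer, and the `std::in_place` tag is a choice made here, not something the PR description specifies.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical wrapper that owns raw storage and constructs T in it.
template <typename T>
class HostObjectSketch {
  // The storage is released without running ~T(), so T must not need a destructor.
  static_assert(std::is_trivially_destructible_v<T>, "T must be trivially destructible");

public:
  // Construct T directly in the buffer from the given arguments.
  template <typename... Args>
  explicit HostObjectSketch(std::in_place_t, Args&&... args) : buffer_(new std::byte[sizeof(T)]) {
    new (buffer_.get()) T{std::forward<Args>(args)...};  // placement new
  }

  // Move-only, mirroring the "copying is disabled" property from the description.
  HostObjectSketch(HostObjectSketch const&) = delete;
  HostObjectSketch& operator=(HostObjectSketch const&) = delete;
  HostObjectSketch(HostObjectSketch&&) = default;
  HostObjectSketch& operator=(HostObjectSketch&&) = default;

  T* operator->() { return std::launder(reinterpret_cast<T*>(buffer_.get())); }
  T const* operator->() const { return std::launder(reinterpret_cast<T const*>(buffer_.get())); }

private:
  std::unique_ptr<std::byte[]> buffer_;
};

// Hypothetical configuration payload.
struct Params {
  float scale;
  int n;
};

// Hypothetical producer: the payload is filled in the member initializer list.
class ProducerSketch {
public:
  ProducerSketch(float scale, int n) : params_(std::in_place, scale, n) {}
  Params const& params() const { return *params_.operator->(); }

private:
  HostObjectSketch<Params> params_;
};

bool run_placement_example() {
  ProducerSketch producer(2.5f, 4);
  return producer.params().scale == 2.5f && producer.params().n == 4;
}
```

Without such a constructor the member would have to be created first and filled in the constructor body, which is the two-step pattern the description wants to avoid.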