-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding HeterogeneousCore/AlpakaUtilities and changes into HeterogeneousCore/AlpakaInterface #40932
Conversation
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40932/34416 |
A new Pull Request was created by @borzari for master. It involves the following packages:
The following packages do not have a category, yet: HeterogeneousCore/AlpakaUtilities @cmsbuild, @makortel, @fwyzard can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
8dbc953
to
7712e91
Compare
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40932/34417
|
My first high-level question is if the code really warrants a new package Last week we discussed with @fwyzard and @rovere about the placement of |
In principle yes, I didn't look at the implementation in this PR. From a quick look, this code needs a thorough review (for example, I'm convinced there is a lot of duplication), but I won't have much time until Thursday. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From first pass
<use name="rootcore"/> | ||
<use name="boost"/> | ||
<use name="eigen"/> | ||
<use name="DataFormats/Common"/> | ||
<use name="DataFormats/Portable"/> | ||
<use name="DataFormats/SoATemplate"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell, none of these dependencies are used. Or is there further code in the pipeline that would bring in some of these dependencies?
<use name="DataFormats/Portable"/> | ||
<use name="DataFormats/SoATemplate"/> | ||
<use name="HeterogeneousCore/AlpakaInterface"/> | ||
<flags ALPAKA_BACKENDS="1"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ALPAKA_BACKENDS
flag is not needed either as there is no code making use of ALPAKA_ACCELERATOR_NAMESPACE
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if that even does anything, for files that are just header files and are not inside the alpaka/
subdirectory ?
#include "AtomicPairCounter.h" | ||
#include "alpakastdAlgorithm.h" | ||
#include "prefixScan.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our convention is use full paths to the headers even if they are in the interface/
directory of the same package (only exceptions are cases where the source and header files are in src
, plugins
, or test
).
}; | ||
|
||
template <typename TAcc, typename Histo, typename TQueue> | ||
ALPAKA_FN_HOST ALPAKA_FN_INLINE __attribute__((always_inline)) void launchZero(Histo *__restrict__ h, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How necessary is ALPAKA_FN_HOST
?
How necessary is ALPAKA_FN_INLINE
given the __attribute__((always_inline))
?
alpaka::enqueue( | ||
queue, | ||
alpaka::createTaskKernel<TAcc>(workDiv, multiBlockPrefixScanFirstStep<uint32_t>(), poff, poff, num_items)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about
alpaka::enqueue( | |
queue, | |
alpaka::createTaskKernel<TAcc>(workDiv, multiBlockPrefixScanFirstStep<uint32_t>(), poff, poff, num_items)); | |
alpaka::exec<TAcc>( | |
queue, | |
workDiv, multiBlockPrefixScanFirstStep<uint32_t>(), poff, poff, num_items); |
instead? (also below)
#include "prefixScan.h" | ||
|
||
#include "HeterogeneousCore/AlpakaInterface/interface/memory.h" | ||
#include "HeterogeneousCore/AlpakaInterface/interface/config.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a cursory look the config.h
looks like it would not be needed here. Is it needed?
#include <type_traits> | ||
#include <utility> | ||
|
||
#include "HeterogeneousCore/AlpakaInterface/interface/config.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be replaced as
#include "HeterogeneousCore/AlpakaInterface/interface/config.h" | |
#include <alpaka/alpaka.hpp> |
?
|
||
#include <utility> | ||
|
||
#include "HeterogeneousCore/AlpakaInterface/interface/config.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be replaced as
#include "HeterogeneousCore/AlpakaInterface/interface/config.h" | |
#include <alpaka/alpaka.hpp> |
?
|
||
template <typename T = void> | ||
struct less { | ||
ALPAKA_FN_HOST_ACC constexpr bool operator()(const T &lhs, const T &rhs) const { return lhs < rhs; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to cppreference, std::less::operator()
has been constexpr
since C++14. Is a separate implementation still needed?
#include <cstdint> | ||
#include <type_traits> | ||
|
||
#include "HeterogeneousCore/AlpakaInterface/interface/config.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be replaced as
#include "HeterogeneousCore/AlpakaInterface/interface/config.h" | |
#include <alpaka/alpaka.hpp> |
?
@@ -302,6 +302,237 @@ namespace cms::alpakatools { | |||
const Vec extent_; | |||
}; | |||
|
|||
/********************************************* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@borzari @nothingface0 could you point me to the use cases for
element_index_range_in_block
element_index_range_in_block_truncated
element_index_range_in_grid
element_index_range_in_grid_truncated
and
for_each_element_in_block
for_each_element_in_block_strided
for_each_element_in_grid_strided
?
Regarding the latter, do you prefer the lambda approach (pass a function object or lambda to the for_each_element_in_...
call) or the range loop approach (get the index via a range loop, as with elements_with_stride
) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
element_index_range_in_block
workdivision.h
here and here. It's then used by element_index_range_in_block_truncated
and element_index_range_in_grid
.
gpuClusterChargeCut.h
gpuPixelRecHits.h
gpuFishbone.h
gpuPixelDoubletsAlgos.h
element_index_range_in_block_truncated
workdivision.h
here. See for_each_element_in_block
for implicit uses.
element_index_range_in_grid
workdivision.h
Inside for_each_element_in_grid_strided
here.
gpuFishbone.h
gpuPixelDoubletsAlgos.h
element_index_range_in_grid_truncated
workdivision.h
Only used from overloaded function here. Not used anywhere else, can be deleted.
for_each_element_in_block
workdivision.h
Call from overloaded function here.
radixSort.h
for_each_element_in_block_strided
workdivision.h
Call from overloaded function here.
radixSort.h
here, here, here, here, here, here, here, here, here, here, here, and here.
for_each_element_in_grid_strided
workdivision.h
Call from overloaded function here.
HistoContainer.h
Comments
Since we didn't have the time to go in-depth in the patatrack-standalone
algorithms, we did not try to change different ways of looping over the elements so that they're more uniform.
That said, personally speaking, the range approach (elements_with_stride
) seems more intuitive, and it's cleaner in the code, too.
A visualized overview of the functions that depend on those functions can be seen below for easier(?) comprehension:
|
||
static constexpr c_type incr = 1UL << 32; | ||
|
||
ALPAKA_FN_HOST_ACC Counters get() const { return counter.counters; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be ALPAKA_FN_HOST_ACC
or would it be enough to be ALPAKA_FN_ACC
?
// reimplementation of std algorithms able to compile with Alpaka, | ||
// mostly by declaring them constexpr | ||
|
||
namespace alpaka_std { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@makortel a technical question: would it work (and be allowed) to declare this as namespace alpaka::std
?
If we do that, any code inside namespace alpaka { ... }
that uses std::lower_bound
would pick which version ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be wary in using namespace std
even inside another namespace.
If we do that, any code inside
namespace alpaka { ... }
that usesstd::lower_bound
would pick which version ?
Based on our early attempts to use cms::alpaka
namespace, and alpaka::<X>
types there, I guess std::lower_bound
inside namespace alpaka { ... }
would resolve to alpaka::std::lower_bound
, if the header defining it is #include
d, and otherwise to ::std::lower_bound
. I also guess #include
ng a header defining anything in alpaka::std
namespace would cause code in alpaka
namespace to need to use ::std::
for all std::
names that are not found in alpaka::std
.
But within CMSSW we are not going to add names in alpaka
namespace, are we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But within CMSSW we are not going to add names in
alpaka
namespace, are we?
No, but I'm wondering in case we push these upstream into Alpaka.
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-40932/34443
|
Milestone for this pull request has been moved to CMSSW_14_0_X.Please open a backport if it should also go in to CMSSW_13_3_X. |
Milestone for this pull request has been moved to CMSSW_14_1_X. Please open a backport if it should also go in to CMSSW_14_0_X. |
@borzari am I right that these changes have been merged as part of the large Alpaka migration ? If so, can you close this PR ? |
Part 1/? of porting pixel track reconstruction code from CUDA to Alpaka. AlpakaUtilities have methods and tools to ease the work with Alpaka (based and modified from https://github.com/cms-patatrack/pixeltrack-standalone/tree/master/src/alpaka/AlpakaCore). AlpakaInterface changes comprise addition of functions for block/thread iteration inside kernels.
This PR is a suggestion from @makortel during 12th Patatrack Hackathon to split the work in several PRs to make incremental reviews.