Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a test for Alpaka libraries and build rules [14.0.x] #44635

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions HeterogeneousTest/AlpakaDevice/BuildFile.xml
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
<use name="alpaka"/>
<use name="HeterogeneousCore/AlpakaInterface"/>
49 changes: 49 additions & 0 deletions HeterogeneousTest/AlpakaDevice/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
# Introduction

The packages `HeterogeneousTest/AlpakaDevice`, `HeterogeneousTest/AlpakaKernel`,
`HeterogeneousTest/AlpakaWrapper` and `HeterogeneousTest/AlpakaOpaque` implement a set of libraries,
plugins and tests to exercise the build rules for Alpaka.
In particular, these tests show what is supported and what are the limitations implementing
Alpaka-based libraries, and using them from multiple plugins.


# `HeterogeneousTest/AlpakaDevice`

The package `HeterogeneousTest/AlpakaDevice` implements a library that defines and exports Alpaka
device-side functions:
```c++
namespace cms::alpakatest {

template <typename TAcc>
ALPAKA_FN_ACC void add_vectors_f(TAcc const& acc, ...);

template <typename TAcc>
ALPAKA_FN_ACC void add_vectors_d(TAcc const& acc, ...);

} // namespace cms::alpakatest
```

The `plugins` directory implements the `AlpakaTestDeviceAdditionModule` `EDAnalyzer` that launches
an Alpaka kernel using the functions defined in ths library. As a byproduct this plugin also shows
how to split an `EDAnalyzer` or other framework plugin into a host-only part (in a `.cc` file) and
a device part (in a `.dev.cc` file).

The `test` directory implements the `testAlpakaDeviceAddition` binary that launches a Alpaka kernel
using these functions.
It also contains the `testAlpakaTestDeviceAdditionModule.py` python configuration to exercise the
`AlpakaTestDeviceAdditionModule` plugin.


# Other packages

For various ways in which this library and plugin can be tested, see also the other
`HeterogeneousTest/Alpaka...` packages:
- [`HeterogeneousTest/AlpakaKernel/README.md`](../../HeterogeneousTest/AlpakaKernel/README.md)
- [`HeterogeneousTest/AlpakaWrapper/README.md`](../../HeterogeneousTest/AlpakaWrapper/README.md)
- [`HeterogeneousTest/AlpakaOpaque/README.md`](../../HeterogeneousTest/AlpakaOpaque/README.md)


# Combining plugins

`HeterogeneousTest/AlpakaOpaque/test` contains the `testAlpakaTestAdditionModules.py` python
configuration that exercise all four plugins in a single application.
36 changes: 36 additions & 0 deletions HeterogeneousTest/AlpakaDevice/interface/alpaka/DeviceAddition.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
#ifndef HeterogeneousTest_AlpakaDevice_interface_alpaka_DeviceAddition_h
#define HeterogeneousTest_AlpakaDevice_interface_alpaka_DeviceAddition_h

#include <cstdint>

#include <alpaka/alpaka.hpp>

#include "HeterogeneousCore/AlpakaInterface/interface/workdivision.h"

namespace cms::alpakatest {

template <typename TAcc>
ALPAKA_FN_ACC void add_vectors_f(TAcc const& acc,
float const* __restrict__ in1,
float const* __restrict__ in2,
float* __restrict__ out,
uint32_t size) {
for (auto i : cms::alpakatools::uniform_elements(acc, size)) {
out[i] = in1[i] + in2[i];
}
}

template <typename TAcc>
ALPAKA_FN_ACC void add_vectors_d(TAcc const& acc,
double const* __restrict__ in1,
double const* __restrict__ in2,
double* __restrict__ out,
uint32_t size) {
for (auto i : cms::alpakatools::uniform_elements(acc, size)) {
out[i] = in1[i] + in2[i];
}
}

} // namespace cms::alpakatest

#endif // HeterogeneousTest_AlpakaDevice_interface_alpaka_DeviceAddition_h
11 changes: 11 additions & 0 deletions HeterogeneousTest/AlpakaDevice/plugins/BuildFile.xml
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
<library file="alpaka/*.cc" name="HeterogeneousTestAlpakaDevicePlugins">
<use name="alpaka"/>
<use name="FWCore/Framework"/>
<use name="FWCore/ParameterSet"/>
<use name="FWCore/ServiceRegistry"/>
<use name="HeterogeneousCore/AlpakaInterface"/>
<use name="HeterogeneousCore/AlpakaServices"/>
<use name="HeterogeneousTest/AlpakaDevice"/>
<flags ALPAKA_BACKENDS="1"/>
<flags EDM_PLUGIN="1"/>
</library>
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
#include <cstdint>

#include <alpaka/alpaka.hpp>

#include "HeterogeneousCore/AlpakaInterface/interface/config.h"
#include "HeterogeneousCore/AlpakaInterface/interface/workdivision.h"
#include "HeterogeneousTest/AlpakaDevice/interface/alpaka/DeviceAddition.h"

#include "AlpakaTestDeviceAdditionAlgo.h"

namespace ALPAKA_ACCELERATOR_NAMESPACE::HeterogeneousTestAlpakaDevicePlugins {

struct KernelAddVectorsF {
template <typename TAcc>
ALPAKA_FN_ACC void operator()(TAcc const& acc,
const float* __restrict__ in1,
const float* __restrict__ in2,
float* __restrict__ out,
uint32_t size) const {
cms::alpakatest::add_vectors_f(acc, in1, in2, out, size);
}
};

void wrapper_add_vectors_f(Queue& queue,
const float* __restrict__ in1,
const float* __restrict__ in2,
float* __restrict__ out,
uint32_t size) {
alpaka::exec<Acc1D>(queue, cms::alpakatools::make_workdiv<Acc1D>(32, 32), KernelAddVectorsF{}, in1, in2, out, size);
}

} // namespace ALPAKA_ACCELERATOR_NAMESPACE::HeterogeneousTestAlpakaDevicePlugins
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
#ifndef HeterogeneousTest_AlpakaDevice_plugins_alpaka_AlpakaTestDeviceAdditionAlgo_h
#define HeterogeneousTest_AlpakaDevice_plugins_alpaka_AlpakaTestDeviceAdditionAlgo_h

#include <cstdint>

#include "HeterogeneousCore/AlpakaInterface/interface/config.h"

namespace ALPAKA_ACCELERATOR_NAMESPACE::HeterogeneousTestAlpakaDevicePlugins {

void wrapper_add_vectors_f(Queue& queue,
const float* __restrict__ in1,
const float* __restrict__ in2,
float* __restrict__ out,
uint32_t size);

} // namespace ALPAKA_ACCELERATOR_NAMESPACE::HeterogeneousTestAlpakaDevicePlugins

#endif // HeterogeneousTest_AlpakaDevice_plugins_alpaka_AlpakaTestDeviceAdditionAlgo_h
Original file line number Diff line number Diff line change
@@ -0,0 +1,124 @@
#include <cstdint>
#include <iostream>
#include <random>
#include <vector>

#include <alpaka/alpaka.hpp>

#include "FWCore/Framework/interface/Event.h"
#include "FWCore/Framework/interface/Frameworkfwd.h"
#include "FWCore/Framework/interface/global/EDAnalyzer.h"
#include "FWCore/ParameterSet/interface/ConfigurationDescriptions.h"
#include "FWCore/ParameterSet/interface/ParameterSet.h"
#include "FWCore/ParameterSet/interface/ParameterSetDescription.h"
#include "FWCore/ServiceRegistry/interface/Service.h"
#include "HeterogeneousCore/AlpakaInterface/interface/config.h"
#include "HeterogeneousCore/AlpakaInterface/interface/memory.h"
#include "HeterogeneousCore/AlpakaServices/interface/alpaka/AlpakaService.h"

#include "AlpakaTestDeviceAdditionAlgo.h"

namespace ALPAKA_ACCELERATOR_NAMESPACE {

class AlpakaTestDeviceAdditionModule : public edm::global::EDAnalyzer<> {
public:
explicit AlpakaTestDeviceAdditionModule(edm::ParameterSet const& config);
~AlpakaTestDeviceAdditionModule() override = default;

static void fillDescriptions(edm::ConfigurationDescriptions& descriptions);

void analyze(edm::StreamID, edm::Event const& event, edm::EventSetup const& setup) const override;

private:
const uint32_t size_;
};

AlpakaTestDeviceAdditionModule::AlpakaTestDeviceAdditionModule(edm::ParameterSet const& config)
: size_(config.getParameter<uint32_t>("size")) {}

void AlpakaTestDeviceAdditionModule::fillDescriptions(edm::ConfigurationDescriptions& descriptions) {
edm::ParameterSetDescription desc;
desc.add<uint32_t>("size", 1024 * 1024);

// ignore the alpaka = cms.untracked.PSet(...) injected by the framework
edm::ParameterSetDescription alpaka;
alpaka.setAllowAnything();
desc.addUntracked<edm::ParameterSetDescription>("alpaka", alpaka);

descriptions.addWithDefaultLabel(desc);
}

void AlpakaTestDeviceAdditionModule::analyze(edm::StreamID,
edm::Event const& event,
edm::EventSetup const& setup) const {
// require a valid Alpaka backend for running
edm::Service<ALPAKA_TYPE_ALIAS(AlpakaService)> service;
if (not service or not service->enabled()) {
std::cout << "The " << ALPAKA_TYPE_ALIAS_NAME(AlpakaService)
<< " is not available or disabled, the test will be skipped.\n";
return;
}

// random number generator with a gaussian distribution
std::random_device rd{};
std::default_random_engine rand{rd()};
std::normal_distribution<float> dist{0., 1.};

// tolerance
constexpr float epsilon = 0.000001;

// allocate input and output host buffers
std::vector<float> in1_h(size_);
std::vector<float> in2_h(size_);
std::vector<float> out_h(size_);

// fill the input buffers with random data, and the output buffer with zeros
for (uint32_t i = 0; i < size_; ++i) {
in1_h[i] = dist(rand);
in2_h[i] = dist(rand);
out_h[i] = 0.;
}

// run the test on all available devices
for (auto const& device : cms::alpakatools::devices<Platform>()) {
Queue queue{device};

// allocate input and output buffers on the device
auto in1_d = cms::alpakatools::make_device_buffer<float[]>(queue, size_);
auto in2_d = cms::alpakatools::make_device_buffer<float[]>(queue, size_);
auto out_d = cms::alpakatools::make_device_buffer<float[]>(queue, size_);

// copy the input data to the device
// FIXME: pass the explicit size of type uint32_t to avoid compilation error
// The destination view and the extent are required to have compatible index types!
alpaka::memcpy(queue, in1_d, in1_h, size_);
alpaka::memcpy(queue, in2_d, in2_h, size_);

// fill the output buffer with zeros
alpaka::memset(queue, out_d, 0);

// launch the 1-dimensional kernel for vector addition
HeterogeneousTestAlpakaDevicePlugins::wrapper_add_vectors_f(
queue, in1_d.data(), in2_d.data(), out_d.data(), size_);

// copy the results from the device to the host
alpaka::memcpy(queue, out_h, out_d);

// wait for all the operations to complete
alpaka::wait(queue);

// check the results
for (uint32_t i = 0; i < size_; ++i) {
float sum = in1_h[i] + in2_h[i];
assert(out_h[i] < sum + epsilon);
assert(out_h[i] > sum - epsilon);
}
}

std::cout << "All tests passed.\n";
}

} // namespace ALPAKA_ACCELERATOR_NAMESPACE

#include "HeterogeneousCore/AlpakaCore/interface/alpaka/MakerMacros.h"
DEFINE_FWK_ALPAKA_MODULE(AlpakaTestDeviceAdditionModule);
9 changes: 9 additions & 0 deletions HeterogeneousTest/AlpakaDevice/test/BuildFile.xml
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
<bin file="alpaka/testDeviceAddition.dev.cc" name="testAlpakaDeviceAddition">
<use name="catch2"/>
<use name="alpaka"/>
<use name="HeterogeneousTest/AlpakaDevice"/>
<use name="HeterogeneousCore/AlpakaInterface"/>
<flags ALPAKA_BACKENDS="1"/>
</bin>

<test name="testAlpakaTestDeviceAdditionModule" command="cmsRun ${LOCALTOP}/src/HeterogeneousTest/AlpakaDevice/test/testAlpakaTestDeviceAdditionModule.py"/>
102 changes: 102 additions & 0 deletions HeterogeneousTest/AlpakaDevice/test/alpaka/testDeviceAddition.dev.cc
Original file line number Diff line number Diff line change
@@ -0,0 +1,102 @@
#include <cstdint>
#include <random>
#include <vector>

#define CATCH_CONFIG_MAIN
#include <catch.hpp>

#include <alpaka/alpaka.hpp>

#include "HeterogeneousCore/AlpakaInterface/interface/config.h"
#include "HeterogeneousCore/AlpakaInterface/interface/devices.h"
#include "HeterogeneousCore/AlpakaInterface/interface/memory.h"
#include "HeterogeneousCore/AlpakaInterface/interface/workdivision.h"
#include "HeterogeneousTest/AlpakaDevice/interface/alpaka/DeviceAddition.h"

using namespace ALPAKA_ACCELERATOR_NAMESPACE;

struct KernelAddVectorsF {
template <typename TAcc>
ALPAKA_FN_ACC void operator()(TAcc const& acc,
const float* __restrict__ in1,
const float* __restrict__ in2,
float* __restrict__ out,
uint32_t size) const {
cms::alpakatest::add_vectors_f(acc, in1, in2, out, size);
}
};

TEST_CASE("HeterogeneousTest/AlpakaDevice test", "[alpakaTestDeviceAddition]") {
auto const& devices = cms::alpakatools::devices<Platform>();
if (devices.empty()) {
FAIL("No devices available for the " EDM_STRINGIZE(ALPAKA_ACCELERATOR_NAMESPACE) " backend, "
"the test will be skipped.");
}

// random number generator with a gaussian distribution
std::random_device rd{};
std::default_random_engine rand{rd()};
std::normal_distribution<float> dist{0., 1.};

// tolerance
constexpr float epsilon = 0.000001;

// buffer size
constexpr uint32_t size = 1024 * 1024;

// allocate input and output host buffers
std::vector<float> in1_h(size);
std::vector<float> in2_h(size);
std::vector<float> out_h(size);

// fill the input buffers with random data, and the output buffer with zeros
for (uint32_t i = 0; i < size; ++i) {
in1_h[i] = dist(rand);
in2_h[i] = dist(rand);
out_h[i] = 0.;
}

// run the test on all available devices
for (auto const& device : cms::alpakatools::devices<Platform>()) {
SECTION("Test add_vectors_f on " EDM_STRINGIZE(ALPAKA_ACCELERATOR_NAMESPACE) " backend") {
REQUIRE_NOTHROW([&]() {
Queue queue{device};

// allocate input and output buffers on the device
auto in1_d = cms::alpakatools::make_device_buffer<float[]>(queue, size);
auto in2_d = cms::alpakatools::make_device_buffer<float[]>(queue, size);
auto out_d = cms::alpakatools::make_device_buffer<float[]>(queue, size);

// copy the input data to the device
// FIXME: pass the explicit size of type uint32_t to avoid compilation error
// The destination view and the extent are required to have compatible index types!
alpaka::memcpy(queue, in1_d, in1_h, size);
alpaka::memcpy(queue, in2_d, in2_h, size);

// fill the output buffer with zeros
alpaka::memset(queue, out_d, 0);

// launch the 1-dimensional kernel for vector addition
alpaka::exec<Acc1D>(queue,
cms::alpakatools::make_workdiv<Acc1D>(32, 32),
KernelAddVectorsF{},
in1_d.data(),
in2_d.data(),
out_d.data(),
size);

// copy the results from the device to the host
alpaka::memcpy(queue, out_h, out_d, size);

// wait for all the operations to complete
alpaka::wait(queue);
}());

// check the results
for (uint32_t i = 0; i < size; ++i) {
float sum = in1_h[i] + in2_h[i];
CHECK_THAT(out_h[i], Catch::Matchers::WithinAbs(sum, epsilon));
}
}
}
}
Loading