Skip to content

Commit

Permalink
Refactor window storage (NVIDIA#627)
Browse files Browse the repository at this point in the history
Closes NVIDIA#621

Based on the offline discussions, this PR replaces the `window` logic
with `bucket` and adds new overloads of `make_bucket_extent` so OA no
longer relies on the `Container` type to determine the bucket extent.

---------

Co-authored-by: Daniel Jünger <[email protected]>
  • Loading branch information
PointKernel and sleeepyjack authored Nov 5, 2024
1 parent 5b4a80e commit 93d6172
Show file tree
Hide file tree
Showing 37 changed files with 807 additions and 630 deletions.
4 changes: 3 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,8 @@ Similar to how [Thrust](https://github.com/thrust/thrust) and [CUB](https://gith

### Major Updates

__11/01/2024__ Refined the term `window` as `bucket`

__01/08/2024__ Deprecated the `experimental` namespace

__01/02/2024__ Moved the legacy `static_map` to `cuco::legacy` namespace
Expand Down Expand Up @@ -254,4 +256,4 @@ We plan to add many GPU-accelerated, concurrent data structures to `cuCollection
`cuco::bloom_filter` implements a Blocked Bloom Filter for approximate set membership queries.

#### Examples:
- [Host-bulk APIs (Default fingerprinting policy)](https://github.com/NVIDIA/cuCollections/blob/dev/examples/bloom_filter/host_bulk_example.cu) (see [live example in godbolt](https://godbolt.org/clientstate/eJydVmtvGjkU_StXsx8WmuEVbVUJQiSapLtoK5IF2qpaVsjj8TBWBnvqBwRF-e977ZmBgZBqtVRqwL6Pc889vvZzoJnWXAod9P9-Dngc9HthkBGxsmTFgn5AbUyCMNDSKup-d94tBLyDG5nvFF-lBhq0CZfdy99CmHwd345HcHM_fbifjubj-0nb2Xr7z5wyoVkMVsRMgUkZjHJC8U-5E8JXphwQuGx3oeEMFkG5twiaAx9lJy2syQ6ENGA1wzBcQ8IzBuyJstwAF0DlOs84EZTBlpvUpyrjeDjwvQwiI0PQnqBHjr-SuiUQs4fuPqkxeb_T2W63beJht6VadbLCWHc-j2_uJrO7FkLfu30RGTILiv2wXGHh0Q5IjsgoiRBvRrYgFZCVYrhnpEO-VdxwsQpBy8RsiWI-Tsy1UTyy5oi8CifWXzdA-ohA4kYzGM8WAXwczcaz0Mf5Np7_cf9lDt9G0-loMh_fzeB-is2a3I5dq_DXJxhNvsOf48ltCAypw1TsKVeuCoTKHa0sLjicMXYEI5EFLJ0zyhNOoVIQrOSGKYFlQc7UmhdaQ5Cxj5PxNTfE-LVXxflUnYVYiF-4oJmNGVxRS2UnyqRcL7Hvhqk2ten1sY1JldWmQ6UVpu02X23FbIMplhtGjVTnTdgTo9YBW-YSm7Y7b6Wxuwyl1j7FwCU2hZG1X-bCoOK4aGwkj5sL8YyFgVukWLdxHIOw6-Uj22kntiH0ur92u90B7D-dTucKfmeCKWJYuQ3O_nwkkxduw0Pcd9Btvx-UkcbIrjKe64QrbSAlWeLjuWCy3PD0vpFAvErQKjMPXntoG5Xd0uhx6SvzOPCr22xVmygR8M0tszvynKmDBmaXF0I7tgCcBE5eaLL0JkOXfbB3neVk60553auYDDFLiM0MFA12mjwF5Kt3kuv365q7qnJdl_GeaxW-lKm1ift91KCBqys8kR9t9ojAPO-e47fxJFgOU7lyJK6Knru5WMDsY6yFgNoHw_tcTMRZmbtQZ79_pPMaatewRtU5P1v3LpWiG26rHbEVyrYZeo82ZnDfe80yDbFOK_nSWyHvdZdB3QAdvVj2thd1sRRWYh-mcDjaqgIcYBwafF7M5Tz3Ds6wlDOJ40aFIiwzNX_KWiRldu0scRBidxoF8BASkml2zN1ZR1F3PDoptRhlKX9Zpna107efqziFN1xane1KDeGs35eGjnN3EepU2iyGIh34q80oy1q51HixbBj44YHEzB-mw16dFTyq7i7Ur6gJD4VXjf0vcH2ZJiUG8Bbzt7W7RpjwqE6Gizui3N3W9QOhvWOuZEQyvN7wQomJIahzZamxGCushSmjsKeUR9yga8HrSd2fHqbXXTxjOZbl5oUskGArIkSOrHhfXtVWq8oNhmPTlOiUafcEif3MwnrP0yn2dIqSTvEGnUkmkS7HtgM7LI64X2xU8vIXWuNYbGcaVF8qT6yTQbMJnTJgIb9Cu0Xe5H_kPa2kvvSTvKJ5fkA6UeKLBb9XJPhl17dyPTmsnw48xVAVArr48wVfq-4NiK9CdXjUBmJDae_yve3htsxN8eINWhhoSC8ueh-gRRRNh3q9_NCFVgvvLYP_GczB4lZG1pF_Bmc8qsWklGa4uCkerriA9YrH4CWs9vHmONpH7oKXf_y_fwHeCexw))
- [Host-bulk APIs (Default fingerprinting policy)](https://github.com/NVIDIA/cuCollections/blob/dev/examples/bloom_filter/host_bulk_example.cu) (see [live example in godbolt](https://godbolt.org/clientstate/eJydVmtvGjkU_StXsx8WmuEVbVUJQiSapLtoK5IF2qpaVsjj8TBWBnvqBwRF-e977ZmBgZBqtVRqwL6Pc889vvZzoJnWXAod9P9-Dngc9HthkBGxsmTFgn5AbUyCMNDSKup-d94tBLyDG5nvFF-lBhq0CZfdy99CmHwd345HcHM_fbifjubj-0nb2Xr7z5wyoVkMVsRMgUkZjHJC8U-5E8JXphwQuGx3oeEMFkG5twiaAx9lJy2syQ6ENGA1wzBcQ8IzBuyJstwAF0DlOs84EZTBlpvUpyrjeDjwvQwiI0PQnqBHjr-SuiUQs4fuPqkxeb_T2W63beJht6VadbLCWHc-j2_uJrO7FkLfu30RGTILiv2wXGHh0Q5IjsgoiRBvRrYgFZCVYrhnpEO-VdxwsQpBy8RsiWI-Tsy1UTyy5oi8CifWXzdA-ohA4kYzGM8WAXwczcaz0Mf5Np7_cf9lDt9G0-loMh_fzeB-is2a3I5dq_DXJxhNvsOf48ltCAypw1TsKVeuCoTKHa0sLjicMXYEI5EFLJ0zyhNOoVIQrOSGKYFlQc7UmhdaQ5Cxj5PxNTfE-LVXxflUnYVYiF-4oJmNGVxRS2UnyqRcL7Hvhqk2ten1sY1JldWmQ6UVpu02X23FbIMplhtGjVTnTdgTo9YBW-YSm7Y7b6Wxuwyl1j7FwCU2hZG1X-bCoOK4aGwkj5sL8YyFgVukWLdxHIOw6-Uj22kntiH0ur92u90B7D-dTucKfmeCKWJYuQ3O_nwkkxduw0Pcd9Btvx-UkcbIrjKe64QrbSAlWeLjuWCy3PD0vpFAvErQKjMPXntoG5Xd0uhx6SvzOPCr22xVmygR8M0tszvynKmDBmaXF0I7tgCcBE5eaLL0JkOXfbB3neVk60553auYDDFLiM0MFA12mjwF5Kt3kuv365q7qnJdl_GeaxW-lKm1ift91KCBqys8kR9t9ojAPO-e47fxJFgOU7lyJK6Knru5WMDsY6yFgNoHw_tcTMRZmbtQZ79_pPMaatewRtU5P1v3LpWiG26rHbEVyrYZeo82ZnDfe80yDbFOK_nSWyHvdZdB3QAdvVj2thd1sRRWYh-mcDjaqgIcYBwafF7M5Tz3Ds6wlDOJ40aFIiwzNX_KWiRldu0scRBidxoF8BASkml2zN1ZR1F3PDoptRhlKX9Zpna107efqziFN1xane1KDeGs35eGjnN3EepU2iyGIh34q80oy1q51HixbBj44YHEzB-mw16dFTyq7i7Ur6gJD4VXjf0vcH2ZJiUG8Bbzt7W7RpjwqE6Gizui3N3W9QOhvWOuZEQyvN7wQomJIahzZamxGCushSmjsKeUR9yga8HrSd2fHqbXXTxjOZbl5oUskGArIkSOrHhfXtVWq8oNhmPTlOiUafcEif3MwnrP0yn2dIqSTvEGnUkmkS7HtgM7LI64X2xU8vIXWuNYbGcaVF8qT6yTQbMJnTJgIb9Cu0Xe5H_kPa2kvvSTvKJ5fkA6UeKLBb9XJPhl17dyPTmsnw48xVAVArr48wVfq-4NiK9CdXjUBmJDae_yve3htsxN8eINWhhoSC8ueh-gRRRNh3q9_NCFVgvvLYP_GczB4lZG1pF_Bmc8qsWklGa4uCkerriA9YrH4CWs9vHmONpH7oKXf_y_fwHeCexw))
229 changes: 121 additions & 108 deletions include/cuco/aow_storage.cuh → include/cuco/bucket_storage.cuh
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@

#pragma once

#include <cuco/detail/storage/aow_storage_base.cuh>
#include <cuco/detail/storage/bucket_storage_base.cuh>
#include <cuco/extent.cuh>
#include <cuco/utility/allocator.hpp>

Expand All @@ -29,200 +29,213 @@
#include <memory>

namespace cuco {
/// Bucket type alias
template <typename T, int32_t BucketSize>
using bucket = detail::bucket<T, BucketSize>;

/// Window type alias
template <typename T, int32_t WindowSize>
using window = detail::window<T, WindowSize>;

/// forward declaration
template <typename T, int32_t WindowSize, typename Extent>
class aow_storage_ref;
/// Alias for bucket
template <typename T, int32_t BucketSize>
using window = bucket<T, BucketSize>;

/**
* @brief Array of Window open addressing storage class.
* @brief Non-owning array of buckets storage reference type.
*
* @tparam T Slot type
* @tparam WindowSize Number of slots in each window
* @tparam Extent Type of extent denoting number of windows
* @tparam Allocator Type of allocator used for device storage (de)allocation
* @tparam T Storage element type
* @tparam BucketSize Number of slots in each bucket
* @tparam Extent Type of extent denoting storage capacity
*/
template <typename T,
int32_t WindowSize,
typename Extent = cuco::extent<std::size_t>,
typename Allocator = cuco::cuda_allocator<cuco::window<T, WindowSize>>>
class aow_storage : public detail::aow_storage_base<T, WindowSize, Extent> {
template <typename T, int32_t BucketSize, typename Extent = cuco::extent<std::size_t>>
class bucket_storage_ref : public detail::bucket_storage_base<T, BucketSize, Extent> {
public:
using base_type = detail::aow_storage_base<T, WindowSize, Extent>; ///< AoW base class type
/// Array of buckets base class type
using base_type = detail::bucket_storage_base<T, BucketSize, Extent>;

using base_type::window_size; ///< Number of elements processed per window
using base_type::bucket_size; ///< Number of elements processed per bucket

using extent_type = typename base_type::extent_type; ///< Storage extent type
using size_type = typename base_type::size_type; ///< Storage size type
using value_type = typename base_type::value_type; ///< Slot type
using window_type = typename base_type::window_type; ///< Slot window type
using bucket_type = typename base_type::bucket_type; ///< Slot bucket type

using base_type::capacity;
using base_type::num_windows;

/// Type of the allocator to (de)allocate windows
using allocator_type =
typename std::allocator_traits<Allocator>::template rebind_alloc<window_type>;
using window_deleter_type =
detail::custom_deleter<size_type, allocator_type>; ///< Type of window deleter
using ref_type = aow_storage_ref<value_type, window_size, extent_type>; ///< Storage ref type
using base_type::num_buckets;

/**
* @brief Constructor of AoW storage.
*
* @note The input `size` should be exclusively determined by the return value of
* `make_window_extent` since it depends on the requested low-bound value, the probing scheme, and
* the storage.
* @brief Constructor of AoS storage ref.
*
* @param size Number of windows to (de)allocate
* @param allocator Allocator used for (de)allocating device storage
* @param size Number of buckets
* @param buckets Pointer to the buckets array
*/
explicit constexpr aow_storage(Extent size, Allocator const& allocator = {});
__host__ __device__ explicit constexpr bucket_storage_ref(Extent size,
bucket_type* buckets) noexcept;

aow_storage(aow_storage&&) = default; ///< Move constructor
/**
* @brief Replaces the contents of the storage with another storage.
* @brief Custom un-incrementable input iterator for the convenience of `find` operations.
*
* @return Reference of the current storage object
* @note This iterator is for read only and NOT incrementable.
*/
aow_storage& operator=(aow_storage&&) = default;
~aow_storage() = default; ///< Destructor

aow_storage(aow_storage const&) = delete;
aow_storage& operator=(aow_storage const&) = delete;
struct iterator;
using const_iterator = iterator const; ///< Const forward iterator type

/**
* @brief Gets windows array.
* @brief Returns an iterator to one past the last slot.
*
* This is provided for convenience for those familiar with checking
* an iterator returned from `find()` against the `end()` iterator.
*
* @return Pointer to the first window
* @return An iterator to one past the last slot
*/
[[nodiscard]] constexpr window_type* data() const noexcept;
[[nodiscard]] __device__ constexpr iterator end() noexcept;

/**
* @brief Gets the storage allocator.
* @brief Returns a const_iterator to one past the last slot.
*
* @return The storage allocator
* This is provided for convenience for those familiar with checking
* an iterator returned from `find()` against the `end()` iterator.
*
* @return A const_iterator to one past the last slot
*/
[[nodiscard]] constexpr allocator_type allocator() const noexcept;
[[nodiscard]] __device__ constexpr const_iterator end() const noexcept;

/**
* @brief Gets window storage reference.
* @brief Gets buckets array.
*
* @return Reference of window storage
* @return Pointer to the first bucket
*/
[[nodiscard]] constexpr ref_type ref() const noexcept;
[[nodiscard]] __device__ constexpr bucket_type* data() noexcept;

/**
* @brief Initializes each slot in the AoW storage to contain `key`.
* @brief Gets bucket array.
*
* @param key Key to which all keys in `slots` are initialized
* @param stream Stream used for executing the kernel
* @return Pointer to the first bucket
*/
void initialize(value_type key, cuda::stream_ref stream = {});
[[nodiscard]] __device__ constexpr bucket_type* data() const noexcept;

/**
* @brief Asynchronously initializes each slot in the AoW storage to contain `key`.
* @brief Returns an array of slots (or a bucket) for a given index.
*
* @param key Key to which all keys in `slots` are initialized
* @param stream Stream used for executing the kernel
* @param index Index of the bucket
* @return An array of slots
*/
void initialize_async(value_type key, cuda::stream_ref stream = {}) noexcept;
[[nodiscard]] __device__ constexpr bucket_type operator[](size_type index) const noexcept;

private:
allocator_type allocator_; ///< Allocator used to (de)allocate windows
window_deleter_type window_deleter_; ///< Custom windows deleter
std::unique_ptr<window_type, window_deleter_type> windows_; ///< Pointer to AoW storage
bucket_type* buckets_; ///< Pointer to the buckets array
};

/**
* @brief Non-owning AoW storage reference type.
* @brief Array of buckets open addressing storage class.
*
* @tparam T Storage element type
* @tparam WindowSize Number of slots in each window
* @tparam Extent Type of extent denoting storage capacity
* @tparam T Slot type
* @tparam BucketSize Number of slots in each bucket
* @tparam Extent Type of extent denoting number of buckets
* @tparam Allocator Type of allocator used for device storage (de)allocation
*/
template <typename T, int32_t WindowSize, typename Extent = cuco::extent<std::size_t>>
class aow_storage_ref : public detail::aow_storage_base<T, WindowSize, Extent> {
template <typename T,
int32_t BucketSize,
typename Extent = cuco::extent<std::size_t>,
typename Allocator = cuco::cuda_allocator<cuco::bucket<T, BucketSize>>>
class bucket_storage : public detail::bucket_storage_base<T, BucketSize, Extent> {
public:
using base_type = detail::aow_storage_base<T, WindowSize, Extent>; ///< AoW base class type
/// Array of buckets base class type
using base_type = detail::bucket_storage_base<T, BucketSize, Extent>;

using base_type::window_size; ///< Number of elements processed per window
using base_type::bucket_size; ///< Number of elements processed per bucket

using extent_type = typename base_type::extent_type; ///< Storage extent type
using size_type = typename base_type::size_type; ///< Storage size type
using value_type = typename base_type::value_type; ///< Slot type
using window_type = typename base_type::window_type; ///< Slot window type
using bucket_type = typename base_type::bucket_type; ///< Slot bucket type

using base_type::capacity;
using base_type::num_windows;
using base_type::num_buckets;

/// Type of the allocator to (de)allocate buckets
using allocator_type =
typename std::allocator_traits<Allocator>::template rebind_alloc<bucket_type>;
using bucket_deleter_type =
detail::custom_deleter<size_type, allocator_type>; ///< Type of bucket deleter
using ref_type = bucket_storage_ref<value_type, bucket_size, extent_type>; ///< Storage ref type

/**
* @brief Constructor of AoS storage ref.
* @brief Constructor of bucket storage.
*
* @note The input `size` should be exclusively determined by the return value of
* `make_bucket_extent` since it depends on the requested low-bound value, the probing scheme, and
* the storage.
*
* @param size Number of windows
* @param windows Pointer to the windows array
* @param size Number of buckets to (de)allocate
* @param allocator Allocator used for (de)allocating device storage
*/
__host__ __device__ explicit constexpr aow_storage_ref(Extent size,
window_type* windows) noexcept;
explicit constexpr bucket_storage(Extent size, Allocator const& allocator = {});

bucket_storage(bucket_storage&&) = default; ///< Move constructor
/**
* @brief Custom un-incrementable input iterator for the convenience of `find` operations.
* @brief Replaces the contents of the storage with another storage.
*
* @note This iterator is for read only and NOT incrementable.
* @return Reference of the current storage object
*/
struct iterator;
using const_iterator = iterator const; ///< Const forward iterator type
bucket_storage& operator=(bucket_storage&&) = default;
~bucket_storage() = default; ///< Destructor

bucket_storage(bucket_storage const&) = delete;
bucket_storage& operator=(bucket_storage const&) = delete;

/**
* @brief Returns an iterator to one past the last slot.
* @brief Gets buckets array.
*
* This is provided for convenience for those familiar with checking
* an iterator returned from `find()` against the `end()` iterator.
*
* @return An iterator to one past the last slot
* @return Pointer to the first bucket
*/
[[nodiscard]] __device__ constexpr iterator end() noexcept;
[[nodiscard]] constexpr bucket_type* data() const noexcept;

/**
* @brief Returns a const_iterator to one past the last slot.
*
* This is provided for convenience for those familiar with checking
* an iterator returned from `find()` against the `end()` iterator.
* @brief Gets the storage allocator.
*
* @return A const_iterator to one past the last slot
* @return The storage allocator
*/
[[nodiscard]] __device__ constexpr const_iterator end() const noexcept;
[[nodiscard]] constexpr allocator_type allocator() const noexcept;

/**
* @brief Gets windows array.
* @brief Gets bucket storage reference.
*
* @return Pointer to the first window
* @return Reference of bucket storage
*/
[[nodiscard]] __device__ constexpr window_type* data() noexcept;
[[nodiscard]] constexpr ref_type ref() const noexcept;

/**
* @brief Gets windows array.
* @brief Initializes each slot in the bucket storage to contain `key`.
*
* @return Pointer to the first window
* @param key Key to which all keys in `slots` are initialized
* @param stream Stream used for executing the kernel
*/
[[nodiscard]] __device__ constexpr window_type* data() const noexcept;
void initialize(value_type key, cuda::stream_ref stream = {});

/**
* @brief Returns an array of slots (or a window) for a given index.
* @brief Asynchronously initializes each slot in the bucket storage to contain `key`.
*
* @param index Index of the window
* @return An array of slots
* @param key Key to which all keys in `slots` are initialized
* @param stream Stream used for executing the kernel
*/
[[nodiscard]] __device__ constexpr window_type operator[](size_type index) const noexcept;
void initialize_async(value_type key, cuda::stream_ref stream = {}) noexcept;

private:
window_type* windows_; ///< Pointer to the windows array
allocator_type allocator_; ///< Allocator used to (de)allocate buckets
bucket_deleter_type bucket_deleter_; ///< Custom buckets deleter
/// Pointer to the bucket storage
std::unique_ptr<bucket_type, bucket_deleter_type> buckets_;
};

/// Alias for bucket_storage_ref
template <typename T, int32_t BucketSize, typename Extent = cuco::extent<std::size_t>>
using aow_storage_ref = bucket_storage_ref<T, BucketSize, Extent>;

/// Alias for bucket_storage
template <typename T,
int32_t BucketSize,
typename Extent = cuco::extent<std::size_t>,
typename Allocator = cuco::cuda_allocator<cuco::bucket<T, BucketSize>>>
using aow_storage = bucket_storage<T, BucketSize, Extent, Allocator>;

} // namespace cuco

#include <cuco/detail/storage/aow_storage.inl>
#include <cuco/detail/storage/bucket_storage.inl>
2 changes: 1 addition & 1 deletion include/cuco/detail/equal_wrapper.cuh
Original file line number Diff line number Diff line change
Expand Up @@ -81,7 +81,7 @@ struct equal_wrapper {
*
* @note This function always compares the right-hand side element against sentinel values first
* then performs a equality check with the given `equal_` callable, i.e., `equal_(lhs, rhs)`.
* @note Container (like set or map) buckets MUST be always on the right-hand side.
* @note Container (like set or map) slots MUST be always on the right-hand side.
*
* @tparam IsInsert Flag indicating whether it's an insert equality check or not. Insert probing
* stops when it's an empty or erased slot while query probing stops only when it's empty.
Expand Down
Loading

0 comments on commit 93d6172

Please sign in to comment.