Add data structure base classes and cuco::static_set #278

Merged
merged 236 commits into from
Apr 6, 2023
Changes from 215 commits
d4a4a48
Add storage class
PointKernel Aug 3, 2022
2afb416
Add static_set class
PointKernel Aug 3, 2022
8413375
Update static_set class
PointKernel Aug 4, 2022
0ee3dbb
Update static_set & storage classes
PointKernel Aug 5, 2022
1d86b04
Use value_type instead of deprecated pointer
PointKernel Aug 5, 2022
eb23f76
Add utility unit test
PointKernel Aug 8, 2022
108893b
Merge remote-tracking branch 'upstream/dev' into grand-refactor
PointKernel Aug 8, 2022
f796e9c
Add a file for common kernels
PointKernel Aug 8, 2022
a9b7106
Add AOS storage view
PointKernel Aug 9, 2022
29a91b4
Add experimental probing scheme
PointKernel Aug 9, 2022
1187046
Add first static_set test
PointKernel Aug 9, 2022
dffe113
Merge remote-tracking branch 'upstream/dev' into grand-refactor
PointKernel Aug 10, 2022
b6aa668
Add static_set_ref
PointKernel Aug 15, 2022
0c11e98
Use probing iterator logic
PointKernel Aug 15, 2022
fdc3db1
Add extent class
PointKernel Aug 24, 2022
fb35c3b
Use extent instead of naive size type
PointKernel Aug 24, 2022
a858574
Minor cleanup
PointKernel Aug 24, 2022
2f0b69a
Update probing scheme to incorporate template size type
PointKernel Aug 24, 2022
1c230f5
Merge remote-tracking branch 'upstream/dev' into grand-refactor
PointKernel Aug 30, 2022
b2459f7
Include cuda std array header
PointKernel Aug 30, 2022
45441a3
Merge remote-tracking branch 'upstream/dev' into grand-refactor
PointKernel Aug 30, 2022
ba5c609
Fix a bug: pass template parameter when return probing iterator
PointKernel Aug 30, 2022
ff704c2
Move new implementations to experimental namespace
PointKernel Aug 30, 2022
1637cff
Add equality wrapper
PointKernel Aug 31, 2022
7cd8a0d
Add simple insert
PointKernel Aug 31, 2022
79d669d
Doc cleanups
PointKernel Aug 31, 2022
fd74ede
Pass scope to device ref
PointKernel Aug 31, 2022
1e500a7
Fix postfix increment overloading
PointKernel Aug 31, 2022
2f5089c
Minor cleanups
PointKernel Aug 31, 2022
d19f374
Use size_type + fix a bug in double hashing
PointKernel Sep 1, 2022
3c278dc
Add contains
PointKernel Sep 1, 2022
e9a77df
Fix a bug in contains
PointKernel Sep 1, 2022
1d52f18
Add CG device insert
PointKernel Sep 2, 2022
c8a2554
Fix a minor bug: use SizeType consistently
PointKernel Sep 2, 2022
6172558
Add overload for CG probing iterator
PointKernel Sep 6, 2022
33f0e8d
Add CG insert
PointKernel Sep 8, 2022
28a4e37
Fix a bug
PointKernel Sep 8, 2022
b769889
Add CG contains
PointKernel Sep 8, 2022
545e062
Add CG insert and contains kernels
PointKernel Sep 8, 2022
58c19cc
Improve CG implementations
PointKernel Sep 9, 2022
d927c83
Merge remote-tracking branch 'upstream/dev' into grand-refactor
PointKernel Sep 28, 2022
65fcb59
Remove window size logic from probing scheme
PointKernel Sep 29, 2022
4e7b389
Update static_set: move window size to storage
PointKernel Sep 29, 2022
b67d7b9
Add get_num_windows helper function
PointKernel Sep 30, 2022
5045fa9
Update static_set to use array of window logic
PointKernel Sep 30, 2022
16a5f10
Move get window logic back to storage class
PointKernel Oct 3, 2022
dd4a18c
Fix a bug in probing iterator: the upper bound is determined by num_w…
PointKernel Oct 3, 2022
61c180a
Get rid of size update in insert
PointKernel Oct 3, 2022
d3f233b
Minor cleanups
PointKernel Oct 3, 2022
4d44106
Fix a bug in CG implementation
PointKernel Oct 4, 2022
7611be4
Merge remote-tracking branch 'upstream/dev' into refactor
PointKernel Dec 7, 2022
39e7275
Use global macro definitions
sleeepyjack Nov 2, 2022
4ef18c4
Remove unused counter
sleeepyjack Nov 2, 2022
ebca354
Add TODOs to storage.cuh
sleeepyjack Nov 17, 2022
4dedf4b
Rename defaults.cuh -> tuning.cuh
sleeepyjack Nov 23, 2022
520ac86
Add missing header
sleeepyjack Nov 23, 2022
2644f96
Move equal_wrapper to its own file
sleeepyjack Nov 23, 2022
715c8d4
Add mixin helpers and expose cuco::function interface
sleeepyjack Nov 23, 2022
637b763
Add base class for open addressing-based container references
sleeepyjack Nov 23, 2022
6f2cb1a
Switch to new function mixin design
sleeepyjack Nov 23, 2022
ae1224d
Rename open_address_ref -> open_address_container_ref and move to its…
sleeepyjack Nov 24, 2022
4ff8dd3
Add static_set host bulk example
sleeepyjack Dec 1, 2022
f986208
Remove stray comment
sleeepyjack Dec 1, 2022
e951745
Define public type alias for reference type
sleeepyjack Dec 1, 2022
a6508df
Add missing host/device specifiers
sleeepyjack Dec 1, 2022
e54ddf1
Add static_set device reference example
sleeepyjack Dec 1, 2022
9dade39
Remove open_address_container for now
sleeepyjack Dec 2, 2022
5dff30e
Fix storage unit test
PointKernel Dec 7, 2022
6eceac1
Rename reference->ref and functions->operators
sleeepyjack Dec 7, 2022
142df0b
Fix key_equal member type
sleeepyjack Dec 7, 2022
fba7d01
Fix copyright year
sleeepyjack Dec 7, 2022
e14d2f1
Rename window_storage->storage
sleeepyjack Dec 7, 2022
56086a5
Refactor with() and ref_with()
sleeepyjack Dec 7, 2022
55134fe
Use move semantics for with()
sleeepyjack Dec 7, 2022
2bc4d35
More compact declaration of operator tags
sleeepyjack Dec 8, 2022
63ed50f
Fix typo in example
sleeepyjack Dec 9, 2022
15f2752
CTAD is my friend
sleeepyjack Dec 9, 2022
443ce4a
Remove TODOs and fix some comments
sleeepyjack Dec 9, 2022
03adc20
Warning for concurrent usage of multiple ref objects
sleeepyjack Dec 9, 2022
5bcec13
Merge branch 'dev' into refactor and enable constexpr prime capacity
sleeepyjack Dec 9, 2022
bddc704
Add failure case when the requested capacity is larger than what the …
sleeepyjack Dec 9, 2022
f2cd41b
Fix integer conversion and prime array size
sleeepyjack Dec 9, 2022
497396d
Remove default value for Extent
PointKernel Dec 14, 2022
b590efa
Remove redundant const
PointKernel Dec 14, 2022
7d16b34
Fix a typo in comment
PointKernel Dec 15, 2022
c2e20ad
Cleanups: const, sentinel namespace, etc
PointKernel Dec 28, 2022
eb5ab7b
Merge remote-tracking branch 'upstream/dev' into refactor
PointKernel Jan 10, 2023
93c6ba7
Use public hasher
PointKernel Jan 10, 2023
56a6556
Add public aow storage facade
PointKernel Jan 13, 2023
4aa75fc
Add intermediate storage class
PointKernel Jan 13, 2023
80d0a9d
Remove prime test
PointKernel Jan 13, 2023
38d0932
Add extent test
PointKernel Jan 13, 2023
e41894b
Remove unnecessary const
PointKernel Jan 16, 2023
cbc5994
Add valid_extent function
PointKernel Jan 17, 2023
362862c
Update valid_extent computation to include CG size
PointKernel Jan 17, 2023
e8416f7
Update probing scheme and storage to use extent type for sizes
PointKernel Jan 19, 2023
57d3f61
Make static extent work with static_set
PointKernel Jan 19, 2023
ab1820b
Cleanups: constexpr, nodiscard, explicit
PointKernel Jan 19, 2023
0c356ec
Add back default extent type in set
PointKernel Jan 19, 2023
49e3fa6
Merge remote-tracking branch 'upstream/dev' into refactor
PointKernel Jan 24, 2023
85dabaf
Migrate tests to Catch2 v3
PointKernel Jan 24, 2023
8526ef4
Make capacity constexpr
PointKernel Jan 24, 2023
c9e3489
Add set capacity tests
PointKernel Jan 24, 2023
d8b3bc2
Make ref capacity constexpr
PointKernel Jan 24, 2023
854a404
Merge remote-tracking branch 'upstream/dev' into refactor
PointKernel Jan 26, 2023
c2619d0
Add size funtion to static_set
PointKernel Jan 26, 2023
64b79eb
Create and use aow_storage_base class
PointKernel Jan 26, 2023
ca0d2d3
Use cub sum instead of thrust reduce to avoid unexpected memory (de)a…
PointKernel Jan 27, 2023
b22aa76
Move function details to inl file
PointKernel Jan 27, 2023
03610b3
Move set_ref details to inl file
PointKernel Jan 30, 2023
3fcd657
Fix loop overflow issue and add large input test
PointKernel Jan 30, 2023
d1e02dc
Update occupancy in large input tests
PointKernel Jan 30, 2023
128b614
Add heterogeneous lookup tests
PointKernel Jan 30, 2023
3cf460b
Merge branch 'dev' into refactor
PointKernel Jan 31, 2023
d1ade06
Add TODO reminding of performance investigation
PointKernel Feb 1, 2023
a5a884a
Make equal_wrappers constexpr
PointKernel Feb 1, 2023
f0c666b
Remove redundant const
PointKernel Feb 1, 2023
6b0cf3e
Make variables const when possible
PointKernel Feb 1, 2023
4fdb76b
Use non-CG algos when CGSize equals 1 + add tests
PointKernel Feb 1, 2023
bce09c8
Move probing scheme details into inl file
PointKernel Feb 2, 2023
b56f507
Use int32_t instead of int
PointKernel Feb 2, 2023
7caab6e
Add public linear probing class
PointKernel Feb 2, 2023
ee51c9e
Merge remote-tracking branch 'upstream/dev' into refactor
PointKernel Feb 2, 2023
7e8c011
Add linear probing + tests
PointKernel Feb 3, 2023
9e54f48
Add TODO reminding of future improvement
PointKernel Feb 3, 2023
a222b08
Get rid of sentinel namespace
PointKernel Feb 3, 2023
e3fbd66
Merge remote-tracking branch 'upstream/dev' into refactor
PointKernel Feb 3, 2023
8f54090
Add set benchmarks
PointKernel Feb 5, 2023
da43c3c
Merge remote-tracking branch 'upstream/dev' into refactor
PointKernel Feb 5, 2023
f2a33bb
Use new default parameters for benchmarks
PointKernel Feb 7, 2023
4696114
Update gitignore file to ignore eps figures
PointKernel Feb 7, 2023
897efae
Rewrite static map benchmarks with nvbench
PointKernel Feb 7, 2023
eaad5da
Use new parameters
PointKernel Feb 8, 2023
348352f
Add new key distribution generator to be used in benchmarks
sleeepyjack Feb 8, 2023
f7eb02c
Move multimap insert bench to new key generator design
sleeepyjack Feb 8, 2023
3917e56
Merge remote-tracking branch 'danielj/key-gen' into static-map-nvbench
PointKernel Feb 8, 2023
163ca96
Introduce benchmark namespace and add new header for default parameters
sleeepyjack Feb 9, 2023
d1e90b9
Add nvbench type strings for random distribution types
sleeepyjack Feb 9, 2023
ccf10d9
Set default table occupancy to 0.5
sleeepyjack Feb 9, 2023
214a669
Merge remote-tracking branch 'danielj/key-gen' into static-map-nvbench
PointKernel Feb 9, 2023
05b5015
Change default parameters and remove unused include
sleeepyjack Feb 9, 2023
cbcb8ad
Use default parameter as fallback for unspecified axes
sleeepyjack Feb 9, 2023
1dc7a22
Port count benchmark to use the new key generator
sleeepyjack Feb 9, 2023
ebe0be6
Refactor static map benchmarks with new key generator
PointKernel Feb 9, 2023
cdddb75
Merge remote-tracking branch 'danielj/key-gen' into static-map-nvbench
PointKernel Feb 9, 2023
b20b739
Clean up benchmark test cases
PointKernel Feb 9, 2023
d3dce19
Apply suggestions from code review
PointKernel Feb 10, 2023
b911531
Rename benchmarks
PointKernel Feb 10, 2023
b6edcc7
Merge branch 'static-map-nvbench' of github.com:PointKernel/cuCollect…
PointKernel Feb 10, 2023
349ba53
[pre-commit.ci] auto code formatting
pre-commit-ci[bot] Feb 10, 2023
d60d692
Use get or default getter
PointKernel Feb 10, 2023
f79fc62
Merge branch 'static-map-nvbench' of github.com:PointKernel/cuCollect…
PointKernel Feb 10, 2023
f4bf3d6
Remove redundant doc
PointKernel Feb 10, 2023
75f8cc3
Merge branch 'static-map-nvbench' into refactor
PointKernel Feb 13, 2023
02b391b
Update benchmarks with new key generators
PointKernel Feb 13, 2023
201866e
Fix a bug in set benchmark
PointKernel Feb 14, 2023
70d745a
Minor improvement: use key equal to determine duplicates
PointKernel Feb 14, 2023
1bf5af9
Merge remote-tracking branch 'upstream/dev' into refactor
PointKernel Feb 21, 2023
4cbb4fe
Workaround for performance regression with atomic_ref
PointKernel Feb 21, 2023
9e8e6d6
Update capacity tests
PointKernel Feb 22, 2023
7b174b8
Merge remote-tracking branch 'upstream/dev' into refactor
sleeepyjack Feb 23, 2023
48db794
Fix static_set ctor with default arguments
sleeepyjack Feb 24, 2023
4748403
Parity for static_set benchmark
sleeepyjack Feb 24, 2023
77b14f0
Move size computation to static_set
sleeepyjack Mar 10, 2023
352ab5c
Add insert/insert_async functionality
sleeepyjack Mar 10, 2023
1d05803
Add test for size computation
sleeepyjack Mar 10, 2023
033ac8c
Add benchmarks for size computation
sleeepyjack Mar 10, 2023
0d47786
Merge branch 'refactor' of github.com:NVIDIA/cuCollections into refactor
PointKernel Mar 14, 2023
19175fa
Skip large input test when needed
PointKernel Mar 14, 2023
cd6ae9d
Merge remote-tracking branch 'upstream/dev' into refactor
PointKernel Mar 14, 2023
0682444
Fix typos and resolve conflicts in CMake
PointKernel Mar 14, 2023
c731f2c
[TEST]
PointKernel Mar 14, 2023
431dbd9
[TEST]
PointKernel Mar 14, 2023
756bf56
Apply suggestions from code review
PointKernel Mar 14, 2023
4cd2e74
Fix a typo
PointKernel Mar 14, 2023
0a34a0c
Ignore skipped tests
PointKernel Mar 15, 2023
750626b
Revert debugging changes
PointKernel Mar 15, 2023
5268e3c
Update include/cuco/detail/storage.cuh
PointKernel Mar 15, 2023
080a47c
[pre-commit.ci] auto code formatting
pre-commit-ci[bot] Mar 15, 2023
70cf3c4
Add contains_async
sleeepyjack Mar 17, 2023
465038e
Merge remote-tracking branch 'upstream/dev' into refactor
sleeepyjack Mar 17, 2023
a28338f
Fix after merge
sleeepyjack Mar 17, 2023
914c193
Merge branch 'refactor' of github.com:NVIDIA/cuCollections into refactor
PointKernel Mar 27, 2023
e99faac
Enforce set ref to be constructed with at least one operator
PointKernel Mar 27, 2023
5eae984
Rename ref_with as ref
PointKernel Mar 28, 2023
39e8c11
Throw for invalid input extent
PointKernel Mar 28, 2023
f39fc0a
Fix a minor type conversion bug
PointKernel Mar 28, 2023
bb6ce5f
Add thread scope handling to CAS
PointKernel Mar 28, 2023
01d0705
Minor cleanup
PointKernel Mar 28, 2023
4a4e904
Replace valid_extent member function with free valid extent factory
PointKernel Mar 29, 2023
1ece70b
Rename set device ref example + add set examples to README
PointKernel Mar 29, 2023
5414d21
Fix docs
sleeepyjack Mar 30, 2023
fd20f26
[pre-commit.ci] auto code formatting
pre-commit-ci[bot] Mar 30, 2023
4198c66
Apply suggestions from code review
PointKernel Mar 31, 2023
8dd5bca
[pre-commit.ci] auto code formatting
pre-commit-ci[bot] Mar 31, 2023
56ec864
Minor doc fix
PointKernel Mar 31, 2023
0a3b3bd
Remove inline keyword for member functions
PointKernel Mar 31, 2023
372c705
Add TODO reminder
PointKernel Mar 31, 2023
464a50b
Use dependent_false instead of custom supports_operator
PointKernel Mar 31, 2023
475d2f1
Move detail storage to its own folder
PointKernel Mar 31, 2023
ce7f1c5
Make probing iterator members host and device invocable
PointKernel Mar 31, 2023
d6dc9ab
Renaming
PointKernel Mar 31, 2023
7ba244b
Apply suggestions from code review
PointKernel Apr 1, 2023
6e7fba6
[pre-commit.ci] auto code formatting
pre-commit-ci[bot] Apr 1, 2023
cfaaea2
Merge branch 'refactor' of github.com:NVIDIA/cuCollections into refactor
PointKernel Apr 1, 2023
7d8cce3
Fix build warnings
PointKernel Apr 3, 2023
09cbd34
Use counter storage instead of raw counter (de)allocation
PointKernel Apr 3, 2023
713435b
Rename aow_storage::windows member as aow_storage::data
PointKernel Apr 3, 2023
846560e
Use bracket operator instead of window getter
PointKernel Apr 3, 2023
cb5545b
Update docs
PointKernel Apr 3, 2023
5c8c915
Update docs
PointKernel Apr 3, 2023
40c3704
Update docs + rename scope as thread_scope
PointKernel Apr 3, 2023
00c4474
Apply suggestions from code review
PointKernel Apr 4, 2023
0f17600
[pre-commit.ci] auto code formatting
pre-commit-ci[bot] Apr 4, 2023
3c92839
Add guard against unsupported thread scope
PointKernel Apr 4, 2023
833cd5a
Guard with nodiscard attribute
PointKernel Apr 4, 2023
0b22a50
Update include/cuco/detail/static_set/static_set_ref.inl
PointKernel Apr 4, 2023
a873c37
Merge branch 'refactor' of github.com:NVIDIA/cuCollections into refactor
PointKernel Apr 4, 2023
6b62cde
Apply suggestions from code review
PointKernel Apr 4, 2023
ec492e3
Apply suggestions from code review
PointKernel Apr 4, 2023
2c67e13
Update docs
PointKernel Apr 4, 2023
f7c2efb
Update probing scheme docs + avoid secondary clustering
PointKernel Apr 4, 2023
0402219
Update static set docs
PointKernel Apr 5, 2023
900b682
Update static_set_ref docs
PointKernel Apr 5, 2023
a7bdfac
Update docs: address review comments
PointKernel Apr 5, 2023
03f0a4d
Fix typo in docs
sleeepyjack Apr 5, 2023
7bf0201
Update docs: add notes about CGSize
PointKernel Apr 5, 2023
7a4a6dc
Merge branch 'refactor' of github.com:NVIDIA/cuCollections into refactor
PointKernel Apr 5, 2023
342d89a
Add static assert to static_set_ref
PointKernel Apr 5, 2023
5b33960
Update equal wrapper docs
PointKernel Apr 5, 2023
a555f07
Update copyright year
PointKernel Apr 5, 2023
6a7d6f2
Fix a typo in operator impl docs
PointKernel Apr 5, 2023
77a275b
Move TODO comments to avoid confusing clang-format
PointKernel Apr 5, 2023
25ecb90
Update storage docs
PointKernel Apr 5, 2023
3213a1e
Update set and set_ref docs
PointKernel Apr 5, 2023
250c159
Formatting
PointKernel Apr 5, 2023
8 changes: 8 additions & 0 deletions README.md
@@ -181,6 +181,14 @@ class example_class {

We plan to add many GPU-accelerated, concurrent data structures to `cuCollections`. As of now, the two flagships are variants of hash tables.

### `static_set`

`cuco::static_set` is a fixed-size container that stores unique elements in no particular order. See the Doxygen documentation in `static_set.cuh` for more detailed information.

#### Examples:
- [Host-bulk APIs](https://github.com/NVIDIA/cuCollections/blob/dev/examples/static_set/host_bulk_example.cu) (see [live example in godbolt](https://godbolt.org/z/jnjcdG16c))
- [Device-ref APIs for individual operations](https://github.com/NVIDIA/cuCollections/blob/dev/examples/static_set/device_ref_example.cu) (see [live example in godbolt](https://godbolt.org/z/EGMj6qx73))
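
As a rough host-only illustration of what "fixed-size", "unique elements", and an empty-key sentinel mean for such a container, consider the sketch below. The class `tiny_static_set` and its members are hypothetical names, and the use of linear probing with `std::hash` is an assumption made for illustration only; this is not how `cuco::static_set` is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical host-only illustration; not the cuco implementation.
class tiny_static_set {
 public:
  // All slots start out holding the sentinel, which marks them as empty.
  tiny_static_set(std::size_t capacity, int empty_key_sentinel)
    : sentinel_{empty_key_sentinel}, slots_(capacity, empty_key_sentinel)
  {
    assert(capacity > 0);
  }

  // Returns true if `key` was newly inserted, false if it was already
  // present or if no free slot remains (the capacity never grows).
  bool insert(int key)
  {
    assert(key != sentinel_);  // the sentinel must never be a real key
    auto const n = slots_.size();
    auto pos     = std::hash<int>{}(key) % n;
    for (std::size_t probes = 0; probes < n; ++probes) {
      if (slots_[pos] == key) { return false; }
      if (slots_[pos] == sentinel_) {
        slots_[pos] = key;
        return true;
      }
      pos = (pos + 1) % n;  // linear probing (assumed for this sketch)
    }
    return false;
  }

  // Returns true if `key` is stored in the set.
  bool contains(int key) const
  {
    if (key == sentinel_) { return false; }
    auto const n = slots_.size();
    auto pos     = std::hash<int>{}(key) % n;
    for (std::size_t probes = 0; probes < n; ++probes) {
      if (slots_[pos] == key) { return true; }
      if (slots_[pos] == sentinel_) { return false; }
      pos = (pos + 1) % n;
    }
    return false;
  }

  std::size_t capacity() const { return slots_.size(); }

 private:
  int sentinel_;
  std::vector<int> slots_;
};
```

Inserting the same key twice leaves a single copy in the table, and a lookup stops as soon as it reaches an empty slot.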

### `static_map`

`cuco::static_map` is a fixed-size hash table using open addressing with linear probing. See the Doxygen documentation in `static_map.cuh` for more detailed information.
7 changes: 7 additions & 0 deletions benchmarks/CMakeLists.txt
@@ -46,6 +46,13 @@ endfunction(ConfigureBench)
### benchmark sources #############################################################################
###################################################################################################

###################################################################################################
# - static_set benchmarks -------------------------------------------------------------------------
ConfigureBench(STATIC_SET_BENCH
  hash_table/static_set/contains_bench.cu
  hash_table/static_set/insert_bench.cu
  hash_table/static_set/size_bench.cu)

###################################################################################################
# - static_map benchmarks -------------------------------------------------------------------------
ConfigureBench(STATIC_MAP_BENCH
75 changes: 75 additions & 0 deletions benchmarks/hash_table/static_set/contains_bench.cu
@@ -0,0 +1,75 @@
/*
* Copyright (c) 2023, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#include <defaults.hpp>
#include <utils.hpp>

#include <cuco/static_set.cuh>
#include <cuco/utility/key_generator.hpp>

#include <nvbench/nvbench.cuh>

#include <thrust/device_vector.h>

using namespace cuco::benchmark;
using namespace cuco::utility;

/**
* @brief A benchmark evaluating `cuco::static_set::contains` performance
*/
template <typename Key, typename Dist>
void static_set_contains(nvbench::state& state, nvbench::type_list<Key, Dist>)
{
  auto const num_keys = state.get_int64_or_default("NumInputs", defaults::N);
  auto const occupancy = state.get_float64_or_default("Occupancy", defaults::OCCUPANCY);
  auto const matching_rate = state.get_float64_or_default("MatchingRate", defaults::MATCHING_RATE);

  std::size_t const size = num_keys / occupancy;

  thrust::device_vector<Key> keys(num_keys);

  key_generator gen;
  gen.generate(dist_from_state<Dist>(state), keys.begin(), keys.end());

  cuco::experimental::static_set<Key> set{size, cuco::empty_key<Key>{-1}};
  set.insert(keys.begin(), keys.end());

  gen.dropout(keys.begin(), keys.end(), matching_rate);

  thrust::device_vector<bool> result(num_keys);

  state.add_element_count(num_keys);

  state.exec(nvbench::exec_tag::sync, [&](nvbench::launch& launch) {
    set.contains(keys.begin(), keys.end(), result.begin(), launch.get_stream());
  });
}

NVBENCH_BENCH_TYPES(static_set_contains,
                    NVBENCH_TYPE_AXES(defaults::KEY_TYPE_RANGE,
                                      nvbench::type_list<distribution::unique>))
  .set_name("static_set_contains_unique_occupancy")
  .set_type_axes_names({"Key", "Distribution"})
  .set_max_noise(defaults::MAX_NOISE)
  .add_float64_axis("Occupancy", defaults::OCCUPANCY_RANGE);

NVBENCH_BENCH_TYPES(static_set_contains,
                    NVBENCH_TYPE_AXES(defaults::KEY_TYPE_RANGE,
                                      nvbench::type_list<distribution::unique>))
  .set_name("static_set_contains_unique_matching_rate")
  .set_type_axes_names({"Key", "Distribution"})
  .set_max_noise(defaults::MAX_NOISE)
  .add_float64_axis("MatchingRate", defaults::MATCHING_RATE_RANGE);
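
The benchmarks derive the requested number of slots as `num_keys / occupancy` and let the conversion to `std::size_t` truncate, whereas the device-ref example in this PR rounds up with `std::ceil`. The host-only sketch below contrasts the two; the function names `slots_truncated` and `slots_rounded_up` are hypothetical and exist only for this illustration. Whether the set later rounds the requested capacity up further is not covered here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Requested slot count as computed in the benchmarks: the fractional part is
// dropped when the double result is converted to std::size_t.
std::size_t slots_truncated(std::size_t num_keys, double occupancy)
{
  assert(occupancy > 0.0 && occupancy <= 1.0);
  return static_cast<std::size_t>(num_keys / occupancy);
}

// Requested slot count as computed in the device-ref example: rounding up
// guarantees num_keys / slots never exceeds the requested occupancy.
std::size_t slots_rounded_up(std::size_t num_keys, double occupancy)
{
  assert(occupancy > 0.0 && occupancy <= 1.0);
  return static_cast<std::size_t>(std::ceil(num_keys / occupancy));
}
```

For 10 keys at an occupancy of 0.75, truncation requests 13 slots (an effective occupancy of about 0.77) while rounding up requests 14 (about 0.71). With the benchmark's large default inputs the difference is at most one slot.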
81 changes: 81 additions & 0 deletions benchmarks/hash_table/static_set/insert_bench.cu
@@ -0,0 +1,81 @@
/*
* Copyright (c) 2023, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#include <defaults.hpp>
#include <utils.hpp>

#include <cuco/static_set.cuh>
#include <cuco/utility/key_generator.hpp>

#include <nvbench/nvbench.cuh>

#include <thrust/device_vector.h>

using namespace cuco::benchmark;
using namespace cuco::utility;

/**
* @brief A benchmark evaluating `cuco::static_set::insert` performance
*/
template <typename Key, typename Dist>
void static_set_insert(nvbench::state& state, nvbench::type_list<Key, Dist>)
{
  auto const num_keys = state.get_int64_or_default("NumInputs", defaults::N);
  auto const occupancy = state.get_float64_or_default("Occupancy", defaults::OCCUPANCY);

  std::size_t const size = num_keys / occupancy;

  thrust::device_vector<Key> keys(num_keys);

  key_generator gen;
  gen.generate(dist_from_state<Dist>(state), keys.begin(), keys.end());

  state.add_element_count(num_keys);

  state.exec(nvbench::exec_tag::sync | nvbench::exec_tag::timer,
             [&](nvbench::launch& launch, auto& timer) {
               cuco::experimental::static_set<Key> set{
                 size, cuco::empty_key<Key>{-1}, {}, {}, {}, launch.get_stream()};

               timer.start();
               set.insert(keys.begin(), keys.end(), launch.get_stream());
               timer.stop();
             });
}

NVBENCH_BENCH_TYPES(static_set_insert,
                    NVBENCH_TYPE_AXES(defaults::KEY_TYPE_RANGE,
                                      nvbench::type_list<distribution::uniform>))
  .set_name("static_set_insert_uniform_multiplicity")
  .set_type_axes_names({"Key", "Distribution"})
  .set_max_noise(defaults::MAX_NOISE)
  .add_int64_axis("Multiplicity", defaults::MULTIPLICITY_RANGE);

NVBENCH_BENCH_TYPES(static_set_insert,
                    NVBENCH_TYPE_AXES(defaults::KEY_TYPE_RANGE,
                                      nvbench::type_list<distribution::unique>))
  .set_name("static_set_insert_unique_occupancy")
  .set_type_axes_names({"Key", "Distribution"})
  .set_max_noise(defaults::MAX_NOISE)
  .add_float64_axis("Occupancy", defaults::OCCUPANCY_RANGE);

NVBENCH_BENCH_TYPES(static_set_insert,
                    NVBENCH_TYPE_AXES(defaults::KEY_TYPE_RANGE,
                                      nvbench::type_list<distribution::gaussian>))
  .set_name("static_set_insert_gaussian_skew")
  .set_type_axes_names({"Key", "Distribution"})
  .set_max_noise(defaults::MAX_NOISE)
  .add_float64_axis("Skew", defaults::SKEW_RANGE);
62 changes: 62 additions & 0 deletions benchmarks/hash_table/static_set/size_bench.cu
@@ -0,0 +1,62 @@
/*
* Copyright (c) 2023, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#include <defaults.hpp>
#include <utils.hpp>

#include <cuco/static_set.cuh>
#include <cuco/utility/key_generator.hpp>

#include <nvbench/nvbench.cuh>

#include <thrust/device_vector.h>

using namespace cuco::benchmark;
using namespace cuco::utility;

/**
* @brief A benchmark evaluating `cuco::static_set::size` performance
*/
template <typename Key, typename Dist>
void static_set_size(nvbench::state& state, nvbench::type_list<Key, Dist>)
{
  auto const num_keys = state.get_int64_or_default("NumInputs", defaults::N);
  auto const occupancy = state.get_float64_or_default("Occupancy", defaults::OCCUPANCY);

  std::size_t const size = num_keys / occupancy;

  thrust::device_vector<Key> keys(num_keys);

  key_generator gen;
  gen.generate(dist_from_state<Dist>(state), keys.begin(), keys.end());

  state.add_element_count(num_keys);

  cuco::experimental::static_set<Key> set{size, cuco::empty_key<Key>{-1}};

  set.insert(keys.begin(), keys.end());

  state.exec(nvbench::exec_tag::sync,
             [&](nvbench::launch& launch) { auto const size = set.size(launch.get_stream()); });
}

NVBENCH_BENCH_TYPES(static_set_size,
                    NVBENCH_TYPE_AXES(defaults::KEY_TYPE_RANGE,
                                      nvbench::type_list<distribution::unique>))
  .set_name("static_set_size_unique_occupancy")
  .set_type_axes_names({"Key", "Distribution"})
  .set_max_noise(defaults::MAX_NOISE)
  .add_float64_axis("Occupancy", defaults::OCCUPANCY_RANGE);
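
The diff shown here does not include the implementation of `size`, but the commit history mentions dropping the size update from `insert` and computing the size with a device-wide sum. One plausible reading, stated here purely as an assumption, is that the size is obtained by counting slots that do not hold the empty-key sentinel. The host-only sketch below shows that idea; `count_filled_slots` is a hypothetical helper, not a cuco function.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Counts the slots whose content differs from the empty-key sentinel.
// Assumption for illustration: a slot is "filled" iff it is not the sentinel.
std::size_t count_filled_slots(std::vector<int> const& slots, int empty_key_sentinel)
{
  return static_cast<std::size_t>(
    std::count_if(slots.begin(), slots.end(), [empty_key_sentinel](int slot) {
      return slot != empty_key_sentinel;
    }));
}
```

Under this reading, the cost of a size query scales with the table capacity rather than with the number of stored keys, which would explain why the benchmark sweeps over occupancy.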
2 changes: 2 additions & 0 deletions examples/CMakeLists.txt
@@ -33,6 +33,8 @@ endfunction(ConfigureExample)
### Example sources ###############################################################################
###################################################################################################

ConfigureExample(STATIC_SET_HOST_BULK_EXAMPLE "${CMAKE_CURRENT_SOURCE_DIR}/static_set/host_bulk_example.cu")
ConfigureExample(STATIC_SET_DEVICE_REF_EXAMPLE "${CMAKE_CURRENT_SOURCE_DIR}/static_set/device_ref_example.cu")
ConfigureExample(STATIC_MAP_HOST_BULK_EXAMPLE "${CMAKE_CURRENT_SOURCE_DIR}/static_map/host_bulk_example.cu")
ConfigureExample(STATIC_MAP_DEVICE_SIDE_EXAMPLE "${CMAKE_CURRENT_SOURCE_DIR}/static_map/device_view_example.cu")
ConfigureExample(STATIC_MAP_CUSTOM_TYPE_EXAMPLE "${CMAKE_CURRENT_SOURCE_DIR}/static_map/custom_type_example.cu")
116 changes: 116 additions & 0 deletions examples/static_set/device_ref_example.cu
@@ -0,0 +1,116 @@
/*
* Copyright (c) 2022-2023, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#include <cuco/static_set.cuh>

#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/logical.h>
#include <thrust/sequence.h>

#include <cooperative_groups.h>

#include <cmath>
#include <cstddef>
#include <iostream>

// insert a set of keys into a hash set using one cooperative group for each task
template <typename SetRef, typename InputIterator>
__global__ void custom_cooperative_insert(SetRef set, InputIterator keys, std::size_t n)
{
  namespace cg = cooperative_groups;

  constexpr auto cg_size = SetRef::cg_size;

  auto tile = cg::tiled_partition<cg_size>(cg::this_thread_block());

  int64_t const loop_stride = gridDim.x * blockDim.x / cg_size;
  int64_t idx = (blockDim.x * blockIdx.x + threadIdx.x) / cg_size;

  while (idx < n) {
    set.insert(tile, *(keys + idx));
    idx += loop_stride;
  }
}

template <typename SetRef, typename InputIterator, typename OutputIterator>
__global__ void custom_contains(SetRef set, InputIterator keys, std::size_t n, OutputIterator found)
{
  int64_t const loop_stride = gridDim.x * blockDim.x;
  int64_t idx = blockDim.x * blockIdx.x + threadIdx.x;

  while (idx < n) {
    found[idx] = set.contains(*(keys + idx));
    idx += loop_stride;
  }
}

/**
 * @file device_ref_example.cu
 * @brief Demonstrates usage of the static_set device-side APIs.
 *
 * static_set provides a non-owning reference which can be used to interact with
 * the container from within device code.
 *
 */
int main(void)
{
  using Key = int;

  // Empty slots are represented by reserved "sentinel" values. These values should be selected such
  // that they never occur in your input data.
  Key constexpr empty_key_sentinel = -1;

  // Number of keys to be inserted
  std::size_t constexpr num_keys = 50'000;

  // Compute capacity based on a 50% load factor
  auto constexpr load_factor = 0.5;
  std::size_t const capacity = std::ceil(num_keys / load_factor);

  using set_type = cuco::experimental::static_set<Key>;

  // Constructs a hash set with at least "capacity" slots using -1 as the empty key sentinel.
  set_type set{capacity, cuco::empty_key{empty_key_sentinel}};

  // Create a sequence of keys {0, 1, 2, ..., num_keys - 1}
  thrust::device_vector<Key> keys(num_keys);
  thrust::sequence(keys.begin(), keys.end(), 0);

  // Insert the first half of the keys into the set
  set.insert(keys.begin(), keys.begin() + num_keys / 2);

  // Insert the second half of keys using a custom CUDA kernel.
  custom_cooperative_insert<<<128, 128>>>(
    set.ref(cuco::experimental::insert), keys.begin() + num_keys / 2, num_keys / 2);

  // Storage for result
  thrust::device_vector<bool> found(num_keys);

  // Check if all keys are now contained in the set. Note that we pass a reference that already has
  // the `contains` operator.
  // In general, using two or more reference objects to the same container but with
  // a different set of operators concurrently is undefined behavior.
  // This does not apply here since the two kernels do not overlap.
  custom_contains<<<128, 128>>>(
    set.ref(cuco::experimental::contains), keys.begin(), num_keys, found.begin());

  // Verify that all keys have been found
  bool const all_keys_found = thrust::all_of(found.begin(), found.end(), thrust::identity<bool>());

  if (all_keys_found) { std::cout << "Success! Found all keys.\n"; }

  return 0;
}
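
The index arithmetic in `custom_cooperative_insert` can be hard to picture: every thread of a tile computes the same starting index (its global thread id divided by `cg_size`), and the loop advances by the number of tiles in the grid, so each key is handled by exactly one tile. The host-only sketch below replays that arithmetic for a single thread. The function `keys_for_thread` and its parameters are hypothetical names introduced for this illustration, and the sketch assumes the total thread count is a multiple of `cg_size`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the key indices visited by the tile that contains the thread with
// the given global id, following the loop in custom_cooperative_insert.
std::vector<std::int64_t> keys_for_thread(std::int64_t global_thread_id,
                                          std::int64_t total_threads,
                                          std::int64_t cg_size,
                                          std::int64_t num_keys)
{
  assert(cg_size > 0 && total_threads % cg_size == 0);

  // Number of tiles in the grid; this is the stride between two keys that
  // the same tile processes.
  std::int64_t const loop_stride = total_threads / cg_size;

  std::vector<std::int64_t> visited;
  for (std::int64_t idx = global_thread_id / cg_size; idx < num_keys; idx += loop_stride) {
    visited.push_back(idx);
  }
  return visited;
}
```

With 8 threads, a tile size of 4, and 5 keys, threads 0 to 3 form tile 0 and visit keys 0, 2, and 4, while threads 4 to 7 form tile 1 and visit keys 1 and 3. All threads of one tile share the same key and cooperate on its insertion.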