-
Notifications
You must be signed in to change notification settings - Fork 917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add specialized dispatch to improve occupancy for hash table operations in distinct join #16321
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is a draft PR but I wanted to understand this work since it has come up in a few conversations. I left a few comments, and hope they are helpful.
probing_scheme_type, | ||
cudf::detail::cuco_allocator, | ||
cuco_storage_type>; | ||
using hash_table_type = std::variant< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we simplify this with something like:
template<typename Comparator>
using static_set_with_comparator = cuco::static_set<cuco::pair<hash_value_type, rhs_index_type>,
cuco::extent<size_type>,
cuda::thread_scope_device,
comparator_adapter<cudf::experimental::row::equality::
strong_index_comparator_adapter<Comparator>>,
probing_scheme_type,
cudf::detail::cuco_allocator,
cuco_storage_type>;
using hash_table_type = std::variant<
static_set_with_comparator<row_comparator>,
static_set_with_comparator<row_comparator_no_nested>,
static_set_with_comparator<row_comparator_no_compound>>;
/** | ||
* @brief Compares the specified elements for equality. | ||
* | ||
* is_equality_comparable differs from implementation for std::equality_comparable and considers | ||
* void as and equality comparable type. Thus we need to disable this for when type is void. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* void as and equality comparable type. Thus we need to disable this for when type is void. | |
* void as an equality comparable type. Thus we need to disable this for when type is void. |
cpp/src/join/distinct_hash_join.cu
Outdated
return std::visit( | ||
[&](auto& comparator) { | ||
return ret_type{ | ||
std::in_place_type< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this use in_place_type_t
and drop the ::type
?
cpp/src/join/distinct_hash_join.cu
Outdated
cuco::static_set<cuco::pair<hash_value_type, rhs_index_type>, | ||
cuco::extent<size_type>, | ||
cuda::thread_scope_device, | ||
typename std::remove_reference<decltype(comparator_adapter)>::type, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typename std::remove_reference<decltype(comparator_adapter)>::type, | |
typename std::remove_reference_t<decltype(comparator_adapter)>, |
cpp/src/join/distinct_hash_join.cu
Outdated
[&](auto&& hasher, auto&& hash_table) { | ||
auto const iter = cudf::detail::make_counting_transform_iterator( | ||
0, | ||
build_keys_fn<typename std::remove_reference<decltype(hasher)>::type, rhs_index_type>{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
build_keys_fn<typename std::remove_reference<decltype(hasher)>::type, rhs_index_type>{ | |
build_keys_fn<typename std::remove_reference_t<decltype(hasher)>, rhs_index_type>{ |
cpp/src/join/distinct_hash_join.cu
Outdated
[&](auto&& hasher, auto&& hash_table) { | ||
auto const iter = cudf::detail::make_counting_transform_iterator( | ||
0, | ||
build_keys_fn<typename std::remove_reference<decltype(hasher)>::type, lhs_index_type>{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
build_keys_fn<typename std::remove_reference<decltype(hasher)>::type, lhs_index_type>{ | |
build_keys_fn<typename std::remove_reference_t<decltype(hasher)>, lhs_index_type>{ |
// used to circumvent conflicts between arrays of different types between | ||
// different template instantiations due to the extern specifier. | ||
extern __shared__ char raw_intermediate_storage[]; | ||
cudf::ast::detail::IntermediateDataType<has_nulls>* intermediate_storage = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be auto
since you're reinterpret-casting it to the appropriate type?
cudf::size_type const right_num_rows = right_table.num_rows(); | ||
auto const outer_num_rows = left_num_rows; | ||
|
||
cudf::size_type outer_row_index = threadIdx.x + blockIdx.x * block_size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We avoid raw threadIdx
/ blockIdx
math. We have utility functions for abstracting this logic and ensuring safe types are used:
static constexpr thread_index_type global_thread_id(thread_index_type thread_id, |
See examples:
auto const start_idx = cudf::detail::grid_1d::global_thread_id<block_size>(); for (cudf::thread_index_type outer_row_index = start_idx; outer_row_index < outer_num_rows;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only separated out the instantiation of the kernels in the MR, and copied the previous implementation. Unsure if I should address it in this MR, maybe a followup would be better?
#include <cudf/utilities/span.hpp> | ||
|
||
#include <cub/cub.cuh> | ||
#include "mixed_join_kernel_semi_impl.cuh" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be consistent on including the directory name. This file is next to mixed_join_common_utils.cuh
. See the developer guide on includes for more info.
#include "mixed_join_kernel_semi_impl.cuh" | |
#include "join/mixed_join_kernel_semi_impl.cuh" |
*/ | ||
|
||
#include "join/mixed_join_common_utils.cuh" | ||
#include "mixed_join_kernel_semi_impl.cuh" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#include "mixed_join_kernel_semi_impl.cuh" | |
#include "join/mixed_join_kernel_semi_impl.cuh" |
19944cf
to
448b14b
Compare
Thanks for the review! Closing this PR, this work is merged into #15700. Please take a look there for further review |
Description
Adds specialized dispatch for distinct hash join similar to that implemented for hash joins in #15700
Related issue: #15502
Checklist
Benchmark results on A100