[FEA] row_comparators should use strongly typed index types to ensure commutativity #10508
Comments
I suggest alongside implementing this change we update an existing algorithm to use the new strongly typed logic.
Lines 143 to 180 in 57ff6f5
Using strong types for the input indices would be a good way to simplify the logic.
Thanks for writing this issue up! I think I've left inline comments about this in a couple of places and we've discussed possible solutions before, but I hadn't gotten around to writing up a full proposal. It may also be worthwhile to test with at least one algorithm using cuco. The changes associated with #10401 will likely run into these kinds of noncommutativity bugs.
Hey, turns out we had an issue for this a long time ago! #3257 I knew it sounded familiar :)
This PR resolves #10508. It introduces two-table lexicographic row comparators with strongly typed index types. Given tables `lhs` and `rhs`, the `two_table_comparator` can create a device comparator whose strongly typed call operator can compare bidirectionally: `lhs[i] < rhs[j]` and `rhs[i] < lhs[j]`. The strong typing indicates which index belongs to which table. This PR also contains a sample implementation in `search_ordered.cu`, which implements `lower_bound` and `upper_bound` algorithms.
Authors:
- Bradley Dice (https://github.com/bdice)
Approvers:
- Nghia Truong (https://github.com/ttnghia)
- Jake Hemstad (https://github.com/jrhemstad)
URL: #10730
Is your feature request related to a problem? Please describe.
As @vyasr pointed out in #9666 (comment), `row_equality_comparator` and `row_lexicographic_comparator` are not commutative, i.e., `op(i,j) !==> op(j,i)`. (Edit: Obviously `<` is not commutative, but it still has the same kind of problem because there is implicit meaning to the values that will cause failures if violated.)
Both of these operators work on two tables, a `lhs` and `rhs`. For two ids `i` and `j`, `op(i,j)` assumes `i` references into row `i` of `lhs` and `j` into `rhs`. Thus, `op(i,j)` is not the same as `op(j,i)`.
For example, imagine doing a binary search for the rows of table `B` into table `A`. This uses counting iterators in combination with the `row_lexicographic_comparator` to perform a binary search of row `i` from `B` into `A`.
Invoking `less_compare(i,j)` implicitly assumes that `i` references table `A` (the `lhs`) and `j` references table `B` (the `rhs`). But there is absolutely zero guarantee from the `binary_search` algorithm that it would only use values from the first input range for the first argument and values from the second range for the second argument. In other words, `less_compare(i,j)` could be invoked where `i` indexes into `B` and `j` indexes into `A`. This would perform an incorrect comparison and silently produce incorrect results.
Concrete example of this in action with `std::merge`: https://godbolt.org/z/j5z5oG87j
Describe the solution you'd like
We should leverage strong typing to solve this problem.
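As a rough illustration of the idea, here is a minimal host-only C++ sketch of an equality comparator over two single-column "tables". The names `lhs_index_t` and `rhs_index_t` come from this issue; the class `two_table_equal`, the use of `std::vector<int>` as a stand-in for a table, and the choice of `enum class` for the strong types are assumptions made for the example, not the library's actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Strong index types (definitions assumed for illustration). A scoped enum
// has no implicit conversion from integers, so plain indices are rejected.
enum class lhs_index_t : std::size_t {};
enum class rhs_index_t : std::size_t {};

// Hypothetical stand-in for a two-table row equality comparator.
class two_table_equal {
 public:
  two_table_equal(std::vector<int> const& lhs, std::vector<int> const& rhs)
    : lhs_(lhs), rhs_(rhs)
  {
  }

  // Both argument orders unwrap the strong types and forward to the same
  // internal overload, always passing the lhs row first.
  bool operator()(lhs_index_t i, rhs_index_t j) const
  {
    return equal(static_cast<std::size_t>(i), static_cast<std::size_t>(j));
  }

  bool operator()(rhs_index_t j, lhs_index_t i) const
  {
    return equal(static_cast<std::size_t>(i), static_cast<std::size_t>(j));
  }

 private:
  // Internal overload on plain indices; the parameter order is fixed here.
  bool equal(std::size_t lhs_row, std::size_t rhs_row) const
  {
    assert(lhs_row < lhs_.size() && rhs_row < rhs_.size());
    return lhs_[lhs_row] == rhs_[rhs_row];
  }

  std::vector<int> const& lhs_;
  std::vector<int> const& rhs_;
};
```

With this shape, a call that passes two plain integers, or two indices of the same strong type, does not compile, so an algorithm that swaps the argument order either picks the matching overload or fails at build time instead of silently comparing the wrong rows.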
In short, define a strong type for a `lhs_index_t` and `rhs_index_t` and provide two overloads of the row operators' binary `operator()` that accept arguments of these types in swapped ordering. These strongly typed overloads can simply unwrap the strong type wrapper and call an internal overload that works on just `size_t` indices.
I don't believe a `thrust::counting_iterator` will work as-is with the `lhs_index_t`/`rhs_index_t` as defined above. We may need to provide a convenience wrapper for making counting iterators for these strong types, or make them actual structs with an appropriate `operator+` so that they work with `counting_iterator` by default.
Describe alternatives you've considered
There really isn't an alternative. We've only survived this far by relying on the implementation-defined behavior that algorithms aren't switching the order of arguments on us.
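To make the fragility concrete, the following host-only C++ sketch records which argument order two standard algorithms use. The tag types `haystack_row` and `needle_row` and the `order_recorder` functor are hypothetical names introduced for the example. It relies on the documented behavior that `std::lower_bound` evaluates `comp(element, value)` while `std::upper_bound` evaluates `comp(value, element)`.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical tag types so the two argument roles are distinguishable.
struct haystack_row { int value; };
struct needle_row { int value; };

// Appends 'H' when called as (haystack, needle) and 'N' when called as
// (needle, haystack), so the log shows the order the algorithm chose.
struct order_recorder {
  std::string* log;

  bool operator()(haystack_row h, needle_row n) const
  {
    log->push_back('H');
    return h.value < n.value;
  }

  bool operator()(needle_row n, haystack_row h) const
  {
    log->push_back('N');
    return n.value < h.value;
  }
};

std::string calls_made_by_lower_bound()
{
  std::vector<haystack_row> const haystack{{1}, {3}, {5}, {7}};
  std::string log;
  auto const it =
    std::lower_bound(haystack.begin(), haystack.end(), needle_row{5}, order_recorder{&log});
  static_cast<void>(it);  // only the recorded call order matters here
  return log;
}

std::string calls_made_by_upper_bound()
{
  std::vector<haystack_row> const haystack{{1}, {3}, {5}, {7}};
  std::string log;
  auto const it =
    std::upper_bound(haystack.begin(), haystack.end(), needle_row{5}, order_recorder{&log});
  static_cast<void>(it);  // only the recorded call order matters here
  return log;
}
```

The first log contains only `H` entries and the second only `N` entries. A comparator that takes two untyped indices cannot tell these two situations apart, which is the failure mode described in this issue.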
Additional Context
Using the strongly typed indices will require modifying algorithms to use these new strong types. This change should be made in conjunction with updating to the new comparators added in #10164.
This issue will block updating an algorithm that expects to perform comparisons between two different tables, e.g., #9452.
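As a sketch of what such a modified algorithm call might look like, the host-only C++ example below searches a sorted `lhs` column for a row of `rhs` using strongly typed indices. Everything except the names `lhs_index_t` and `rhs_index_t` is assumed for illustration: `two_table_less`, `make_indices`, `bounds`, and `search_row` are hypothetical, and the indices are materialized into a `std::vector` as a simple stand-in for a counting iterator over a strong type.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

enum class lhs_index_t : std::size_t {};
enum class rhs_index_t : std::size_t {};

// Hypothetical two-table "less than" with one overload per argument order.
struct two_table_less {
  std::vector<int> const& lhs;
  std::vector<int> const& rhs;

  // lhs[i] < rhs[j]
  bool operator()(lhs_index_t i, rhs_index_t j) const
  {
    return lhs[static_cast<std::size_t>(i)] < rhs[static_cast<std::size_t>(j)];
  }

  // rhs[j] < lhs[i]
  bool operator()(rhs_index_t j, lhs_index_t i) const
  {
    return rhs[static_cast<std::size_t>(j)] < lhs[static_cast<std::size_t>(i)];
  }
};

// Materializes 0..n-1 as strongly typed indices.
template <typename Index>
std::vector<Index> make_indices(std::size_t n)
{
  std::vector<Index> out;
  out.reserve(n);
  for (std::size_t k = 0; k < n; ++k) { out.push_back(static_cast<Index>(k)); }
  return out;
}

struct bounds {
  std::size_t lower;
  std::size_t upper;
};

// Finds where row `rhs_row` of `rhs` falls in `lhs`, which must be sorted.
bounds search_row(std::vector<int> const& lhs, std::vector<int> const& rhs, std::size_t rhs_row)
{
  auto const haystack = make_indices<lhs_index_t>(lhs.size());
  auto const needle   = static_cast<rhs_index_t>(rhs_row);
  two_table_less const less{lhs, rhs};

  // lower_bound and upper_bound pass their arguments in opposite orders;
  // overload resolution picks the correct comparison for each.
  auto const lo = std::lower_bound(haystack.begin(), haystack.end(), needle, less);
  auto const hi = std::upper_bound(haystack.begin(), haystack.end(), needle, less);
  return {static_cast<std::size_t>(lo - haystack.begin()),
          static_cast<std::size_t>(hi - haystack.begin())};
}
```

Materializing the indices costs memory proportional to the table size, so a real implementation would more likely use a lazy iterator that produces the strong type on dereference; the vector is used here only to keep the sketch short.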