Fix integer overflow in partition scatter_map construction #13272
Additionally, ensure that the partition_map has valid entries; otherwise, one can end up with out-of-bounds memory reads in libcudf. Closes the part of rapidsai#7513 related to invalid inputs.
Although the input partition map may be any valid integral type, the intermediate scatter map should have the same type as a valid row index (and the offsets histogram), namely `size_type`. Previously, the scatter map was created with the same integer type as the partition map, which can result in integer overflow, and therefore incorrect results, when the partition map is a narrow integral type and the input table has more rows than that type can represent. Closes rapidsai#7513.
Approving for C++
Merged trunk, so hopefully the unrelated Python test failures will be gone.
I think the failed Python tests are relevant:
Ah, I didn't think about nulls in my error checking. Will look tomorrow.
OK, should be fixed...
/merge
Description
Although the input partition map may be any valid integral type, the
intermediate scatter map should have the same type as a valid row
index (and the offsets histogram), namely `size_type`. Previously, the
scatter map was created with the same integer type as the partition
map, which can result in integer overflow, and therefore incorrect
results, when the partition map is a narrow integral type and the input
table has more rows than that type can represent.
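The overflow can be reproduced on the host with a small sketch. This is an illustration only, not libcudf's device implementation: `make_scatter_map` is a hypothetical helper, `size_type` here is a local alias assumed to match the 32-bit signed row index type, and the partition map is assumed to hold valid entries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Local stand-in for the row index type (assumption: 32-bit signed).
using size_type = std::int32_t;

// Hypothetical helper: for each row, compute its destination position in the
// partitioned output. IndexT is the type used to store those positions.
template <typename IndexT, typename MapT>
std::vector<IndexT> make_scatter_map(std::vector<MapT> const& partition_map,
                                     size_type num_partitions)
{
  assert(num_partitions > 0);

  // Histogram of partition sizes, always counted in size_type.
  std::vector<size_type> offsets(static_cast<std::size_t>(num_partitions), 0);
  for (auto p : partition_map) { ++offsets[static_cast<std::size_t>(p)]; }

  // Exclusive scan turns counts into the starting offset of each partition.
  size_type running = 0;
  for (auto& o : offsets) {
    size_type const count = o;
    o = running;
    running += count;
  }

  std::vector<IndexT> scatter_map(partition_map.size());
  for (std::size_t row = 0; row < partition_map.size(); ++row) {
    auto const p = static_cast<std::size_t>(partition_map[row]);
    // If IndexT is narrower than size_type, this conversion cannot hold
    // destinations beyond IndexT's maximum value.
    scatter_map[row] = static_cast<IndexT>(offsets[p]++);
  }
  return scatter_map;
}
```

With 300 rows of `std::int8_t` all mapped to partition 0, `make_scatter_map<std::int8_t>` cannot store destination 200 for row 200, whereas `make_scatter_map<size_type>` stores it correctly.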
On the Python side, this adds validation of the user-supplied
`partition_map` input to ensure that all entries are in range,
avoiding potential out-of-bounds memory accesses.
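The kind of range check described above can be sketched as follows. The PR performs its validation in Python; this is a host-side C++ illustration of the same idea, and `partition_map_is_valid` is a hypothetical name. Null handling, which came up in the review discussion, is not modeled here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Local stand-in for the row index type (assumption: 32-bit signed).
using size_type = std::int32_t;

// Hypothetical helper: returns true when every entry lies in
// [0, num_partitions), so it is safe to use as a partition index.
template <typename MapT>
bool partition_map_is_valid(std::vector<MapT> const& partition_map,
                            size_type num_partitions)
{
  assert(num_partitions >= 0);
  return std::all_of(
    partition_map.begin(), partition_map.end(), [num_partitions](MapT p) {
      // Convert to 64-bit signed before comparing so that narrow and
      // unsigned map types are checked against the same bounds.
      auto const wide = static_cast<std::int64_t>(p);
      return wide >= 0 && wide < num_partitions;
    });
}
```

A caller would reject the input (for example, by raising an error) whenever this check returns `false`, rather than passing the map on to the partitioning routine.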
Closes #7513.