Commit
Avoid overflow in fused_concatenate_kernel output_index (#10344)
Fixes #10333.

The repro case in the issue showed an illegal access error: the `output_index` of the strided loop in `fused_concatenate_kernel` can overflow for a large number of rows.

For example, given 5 tables of exactly 250M rows each, we would expect a result with 1,250,000,000 rows. The kernel is launched with 4,882,813 blocks (number of rows / 256 threads per block, rounded up), which gives a stride of 1,250,000,128 (256 * 4,882,813). For any thread whose `output_index` starts at 897,483,520 or higher, adding the stride at the end of the first iteration exceeds 2,147,483,647, the maximum value of a 32-bit signed integer, so `output_index` overflows.

The change below prevents the overflow by making `output_index` an `int64_t`, and adds a test showing that we can now concatenate up to `size_type::max - 1` rows.

Authors:
- Alessandro Bellina (https://github.com/abellina)

Approvers:
- Nghia Truong (https://github.com/ttnghia)
- Jake Hemstad (https://github.com/jrhemstad)
- MithunR (https://github.com/mythrocks)

URL: #10344
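The arithmetic in the example can be checked on the host. The following is a minimal sketch, not the kernel code from this commit: the helper names (`num_blocks`, `grid_stride`, `first_overflowing_index`) are hypothetical, and it assumes the pre-fix `output_index` was a 32-bit signed integer. All arithmetic is done in `int64_t` so the sketch itself never overflows.

```cpp
#include <cstdint>
#include <limits>

// Threads per block, as stated in the commit message.
constexpr std::int64_t block_size = 256;

// Number of blocks needed to cover `rows`, rounded up (hypothetical helper).
constexpr std::int64_t num_blocks(std::int64_t rows) {
  return (rows + block_size - 1) / block_size;
}

// Stride of the strided loop: the total number of threads launched.
constexpr std::int64_t grid_stride(std::int64_t rows) {
  return block_size * num_blocks(rows);
}

// Smallest starting index for which `index + stride` no longer fits in a
// 32-bit signed integer (assumed to be the pre-fix type of output_index).
constexpr std::int64_t first_overflowing_index(std::int64_t rows) {
  return std::numeric_limits<std::int32_t>::max() - grid_stride(rows) + 1;
}

// 5 tables of 250M rows each.
constexpr std::int64_t total_rows = 5 * std::int64_t{250'000'000};

static_assert(num_blocks(total_rows) == 4'882'813, "block count");
static_assert(grid_stride(total_rows) == 1'250'000'128, "stride");
static_assert(first_overflowing_index(total_rows) == 897'483'520,
              "first index whose next step overflows int32");
```

Because the stride alone is more than half of the 32-bit signed range, a single increment is enough to overflow for every thread at or above that index, which is why widening the loop variable to `int64_t` resolves the issue.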