Commit
Fix out-of-bounds memory read in orc gpuEncodeOrcColumnData (#9196)
Device memory read error found in `gpuEncodeOrcColumnData` when running `ORC_TEST` with `compute-sanitizer`.

```
[ RUN ] OrcChunkedWriterTest.LargeTables
========= Invalid __global__ read of size 4 bytes
=========     at 0x8b0 in void cudf::io::orc::gpu::gpuEncodeOrcColumnData<int=512>(cudf::detail::base_2dspan<cudf::io::orc::gpu::EncChunk const ,cudf::device_span>,cudf::detail<cudf::io::orc::gpu::encoder_chunk_streams,cudf::io::orc::gpu::EncChunk const >)
=========     by thread (60,0,0) in block (255,0,0)
=========     Address 0x7fcd7a000000 is out of bounds
=========     Saved host backtrace up to driver entry point at kernel launch time
...
```

The error was in the `cudf::detail::get_mask_offset_word` utility, which may need to read multiple `bitmask_type` values (4 bytes == 32 bits each) to satisfy the begin/end bit parameters. The `source_end_bit` is intended to be exclusive, but the logic inadvertently reads the next `bitmask_type` from the input `source` null-mask on boundary cases like the one found in the gtest above. Here `source_begin_bit==480` and `source_end_bit==512`, and because `word_index(512) > word_index(480)` the utility performs an additional read, which is out of bounds.

This PR fixes the logic in the utility so that only the inclusive bits are checked when deciding whether an extra read from `source` is required.

The logic in `cudf::io::orc::gpu::gpuEncodeOrcColumnData` that calls this utility also required a fix: it always requested at least 32 bits, regardless of whether that range was out of bounds for `source`. This PR fixes the math logic to specify the correct end-bit value.

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Vukasin Milovanovic (https://github.com/vuule)
  - Nghia Truong (https://github.com/ttnghia)

URL: #9196
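The boundary case described in the commit message can be sketched on the host as follows. This is a minimal illustration and not the libcudf source: `needs_next_word_buggy`, `needs_next_word_fixed`, `get_offset_word`, and `clamped_end_bit` are hypothetical names, a `std::vector` stands in for the device null-mask, and only `bitmask_type`, `word_index`, and the 480/512 values come from the text above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using bitmask_type = std::uint32_t;        // one null-mask word: 4 bytes == 32 bits
constexpr std::size_t bits_per_word = 32;

constexpr std::size_t word_index(std::size_t bit) { return bit / bits_per_word; }
constexpr std::size_t intra_word_index(std::size_t bit) { return bit % bits_per_word; }

// Hypothetical "before" check: treats the exclusive end bit as if it were read.
constexpr bool needs_next_word_buggy(std::size_t begin_bit, std::size_t end_bit)
{
  return word_index(end_bit) > word_index(begin_bit);
}

// Hypothetical "after" check: only the last inclusive bit (end_bit - 1) matters.
// Precondition: end_bit > begin_bit.
constexpr bool needs_next_word_fixed(std::size_t begin_bit, std::size_t end_bit)
{
  return word_index(end_bit - 1) > word_index(begin_bit);
}

// The case from the failing test: bits [480, 512) live entirely in word 15.
static_assert(needs_next_word_buggy(480, 512), "old check asks for word 16");
static_assert(!needs_next_word_fixed(480, 512), "new check stays in word 15");
static_assert(needs_next_word_fixed(40, 72), "a range spanning two words still reads both");

// Assumed caller-side counterpart: request at most 32 bits, never past the mask.
constexpr std::size_t clamped_end_bit(std::size_t begin_bit, std::size_t total_bits)
{
  return std::min(begin_bit + bits_per_word, total_bits);
}

// Host-side stand-in that returns the bits starting at begin_bit, shifted down
// to bit 0. Bits past end_bit are not masked off here. vector::at() throws on
// an out-of-range index, which plays the role of the sanitizer error.
inline bitmask_type get_offset_word(std::vector<bitmask_type> const& source,
                                    std::size_t begin_bit,
                                    std::size_t end_bit)
{
  assert(begin_bit < end_bit && end_bit - begin_bit <= bits_per_word);
  std::size_t const w     = word_index(begin_bit);
  std::size_t const shift = intra_word_index(begin_bit);
  std::uint64_t const lo  = source.at(w);
  std::uint64_t const hi =
    needs_next_word_fixed(begin_bit, end_bit) ? source.at(w + 1) : 0;
  return static_cast<bitmask_type>(((hi << bits_per_word) | lo) >> shift);
}
```

With a 16-word (512-bit) mask, `get_offset_word(mask, 480, clamped_end_bit(480, 512))` touches only `mask[15]`. Swapping in `needs_next_word_buggy` would make the same call read `mask.at(16)` and throw, which mirrors the out-of-bounds read reported by `compute-sanitizer`.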