Skip to content

Commit

Permalink
Fix out-of-bounds memory read in orc gpuEncodeOrcColumnData (#9196)
Browse files Browse the repository at this point in the history
Device memory read error found in `gpuEncodeOrcColumnData` when running `ORC_TEST` with `compute-sanitizer`.

```
[ RUN      ] OrcChunkedWriterTest.LargeTables
========= Invalid __global__ read of size 4 bytes
=========     at 0x8b0 in void cudf::io::orc::gpu::gpuEncodeOrcColumnData<int=512>(cudf::detail::base_2dspan<cudf::io::orc::gpu::EncChunk const ,cudf::device_span>,cudf::detail<cudf::io::orc::gpu::encoder_chunk_streams,cudf::io::orc::gpu::EncChunk const >)
=========     by thread (60,0,0) in block (255,0,0)
=========     Address 0x7fcd7a000000 is out of bounds
=========     Saved host backtrace up to driver entry point at kernel launch time
...
```

The was in the `cudf::detail::get_mask_offset_word` utility which may need to read multiple `bitmask_type` values (4-bytes == 32-bits) to satisfy the begin/end bit parameters. The `source_end_bit` is intended to be exclusive but the logic inadvertently reads the next `bytemask_type` from the input `source` null-mask on boundary cases like the one found in the gtest above. Here the `source_begin_bit==480` and the `source_end_bit==512` and because `word_index(512) > word_index(480)` the next read access is out of bounds. This PR fixed the logic in the utility by ensuring only the inclusive bits are verified to require and extra read from `source`. 

The logic in `cudf::io::orc::gpu::gpuEncodeOrcColumnData` that calls this utility also required a fix where it always requested at least 32-bits regardless if it was out of bounds for `source`. This PR fixes the math logic to specify the correct end-bit value.

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Vukasin Milovanovic (https://github.com/vuule)
  - Nghia Truong (https://github.com/ttnghia)

URL: #9196
  • Loading branch information
davidwendt authored Sep 15, 2021
1 parent 0cf2c5f commit 015f15c
Show file tree
Hide file tree
Showing 2 changed files with 7 additions and 7 deletions.
2 changes: 1 addition & 1 deletion cpp/include/cudf/column/column_device_view.cuh
Original file line number Diff line number Diff line change
Expand Up @@ -1175,7 +1175,7 @@ __device__ inline bitmask_type get_mask_offset_word(bitmask_type const* __restri
size_type source_word_index = destination_word_index + word_index(source_begin_bit);
bitmask_type curr_word = source[source_word_index];
bitmask_type next_word = 0;
if (word_index(source_end_bit) >
if (word_index(source_end_bit - 1) >
word_index(source_begin_bit +
destination_word_index * detail::size_in_bits<bitmask_type>())) {
next_word = source[source_word_index + 1];
Expand Down
12 changes: 6 additions & 6 deletions cpp/src/io/orc/stripe_enc.cu
Original file line number Diff line number Diff line change
Expand Up @@ -687,12 +687,12 @@ __global__ void __launch_bounds__(block_size)
uint8_t valid = 0;
if (row < s->chunk.leaf_column->size()) {
if (s->chunk.leaf_column->nullable()) {
size_type current_valid_offset = row + s->chunk.leaf_column->offset();
size_type next_valid_offset =
current_valid_offset + min(32, s->chunk.leaf_column->size());

bitmask_type mask = cudf::detail::get_mask_offset_word(
s->chunk.leaf_column->null_mask(), 0, current_valid_offset, next_valid_offset);
auto const current_valid_offset = row + s->chunk.leaf_column->offset();
auto const last_offset =
min(current_valid_offset + 8,
s->chunk.leaf_column->offset() + s->chunk.leaf_column->size());
auto const mask = cudf::detail::get_mask_offset_word(
s->chunk.leaf_column->null_mask(), 0, current_valid_offset, last_offset);
valid = 0xff & mask;
} else {
valid = 0xff;
Expand Down

0 comments on commit 015f15c

Please sign in to comment.