[BUG] Memcheck error found in PARQUET_TEST ParquetChunkedReaderInputLimitTest.Mixed #14883

Closed
davidwendt opened this issue Jan 25, 2024 · 0 comments · Fixed by #14889
Labels
bug (Something isn't working), cuIO (cuIO issue), libcudf (Affects libcudf (C++/CUDA) code)

Comments

@davidwendt
Contributor

Describe the bug
The nightly build's memcheck action reports an error in the PARQUET_TEST test ParquetChunkedReaderInputLimitTest.Mixed.
The error reports an 8-byte read past the end of a device buffer.
It appears that #14360 introduced this error.

Steps/Code to reproduce bug
Run the PARQUET_TEST using compute-sanitizer as follows:

compute-sanitizer --tool memcheck gtests/PARQUET_TEST --gtest_filter=ParquetChunkedReaderInputLimitTest.Mixed --rmm_mode=cuda
[ RUN      ] ParquetChunkedReaderInputLimitTest.Mixed
========= Invalid __global__ read of size 8 bytes
=========     at 0x440 in void thrust::cuda_cub::core::_kernel_agent<thrust::cuda_cub::__parallel_for::ParallelForAgent<thrust::cuda_cub::__transform::unary_transform_f<cudf::io::parquet::detail::<unnamed>::cumulative_page_info *, cudf::io::parquet::detail::<unnamed>::cumulative_page_info *, thrust::cuda_cub::__transform::no_stencil_tag, cudf::io::parquet::detail::<unnamed>::page_total_size, thrust::cuda_cub::__transform::always_true_predicate>, long>, thrust::cuda_cub::__transform::unary_transform_f<cudf::io::parquet::detail::<unnamed>::cumulative_page_info *, cudf::io::parquet::detail::<unnamed>::cumulative_page_info *, thrust::cuda_cub::__transform::no_stencil_tag, cudf::io::parquet::detail::<unnamed>::page_total_size, thrust::cuda_cub::__transform::always_true_predicate>, long>(T2, T3)
=========     by thread (1,0,0) in block (4,0,0)
=========     Address 0x7fe0ffd9ec38 is out of bounds
=========     and is 9 bytes after the nearest allocation at 0x7fe0ffd92c00 of size 49,200 bytes
=========     Saved host backtrace up to driver entry point at kernel launch time
=========     Host Frame: [0x3344e0]
=========                in /usr/lib/x86_64-linux-gnu/libcuda.so.1
=========     Host Frame: [0x1488c]
=========                in /conda/envs/rapids/lib/libcudart.so.11.0
=========     Host Frame:cudaLaunchKernel [0x6c318]
=========                in /conda/envs/rapids/lib/libcudart.so.11.0
=========     Host Frame:cudaError thrust::cuda_cub::__parallel_for::parallel_for<thrust::cuda_cub::__transform::unary_transform_f<cudf::io::parquet::detail::(anonymous namespace)::cumulative_page_info*, cudf::io::parquet::detail::(anonymous namespace)::cumulative_page_info*, thrust::cuda_cub::__transform::no_stencil_tag, cudf::io::parquet::detail::(anonymous namespace)::page_total_size, thrust::cuda_cub::__transform::always_true_predicate>, long>(long, thrust::cuda_cub::__transform::unary_transform_f<cudf::io::parquet::detail::(anonymous namespace)::cumulative_page_info*, cudf::io::parquet::detail::(anonymous namespace)::cumulative_page_info*, thrust::cuda_cub::__transform::no_stencil_tag, cudf::io::parquet::detail::(anonymous namespace)::page_total_size, thrust::cuda_cub::__transform::always_true_predicate>, CUstream_st*) [0x1375a5a]
=========                in /conda/envs/rapids/lib/libcudf.so
=========     Host Frame:cudf::io::parquet::detail::(anonymous namespace)::adjust_cumulative_sizes(cudf::device_span<cudf::io::parquet::detail::(anonymous namespace)::cumulative_page_info const, 18446744073709551615ul>, cudf::device_span<cudf::io::parquet::detail::PageInfo const, 18446744073709551615ul>, rmm::cuda_stream_view) [0x13613f9]
=========                in /conda/envs/rapids/lib/libcudf.so
=========     Host Frame:cudf::io::parquet::detail::(anonymous namespace)::compute_page_splits_by_row(cudf::device_span<cudf::io::parquet::detail::(anonymous namespace)::cumulative_page_info const, 18446744073709551615ul>, cudf::device_span<cudf::io::parquet::detail::PageInfo const, 18446744073709551615ul>, unsigned long, unsigned long, unsigned long, rmm::cuda_stream_view) [0x1364a24]
=========                in /conda/envs/rapids/lib/libcudf.so
=========     Host Frame:cudf::io::parquet::detail::reader::impl::compute_output_chunks_for_subpass() [0x1367b6c]
=========                in /conda/envs/rapids/lib/libcudf.so
=========     Host Frame:cudf::io::parquet::detail::reader::impl::preprocess_subpass_pages(bool, unsigned long) [0x1394e48]
=========                in /conda/envs/rapids/lib/libcudf.so
=========     Host Frame:cudf::io::parquet::detail::reader::impl::setup_next_subpass(bool) [0x136d0b8]
=========                in /conda/envs/rapids/lib/libcudf.so
=========     Host Frame:cudf::io::parquet::detail::reader::impl::read_chunk() [0x135138e]
=========                in /conda/envs/rapids/lib/libcudf.so
=========     Host Frame:cudf::io::parquet::detail::chunked_reader::read_chunk() const [0x1343653]
=========                in /conda/envs/rapids/lib/libcudf.so
=========     Host Frame:cudf::io::chunked_parquet_reader::read_chunk() const [0x1200266]
=========                in /conda/envs/rapids/lib/libcudf.so
=========     Host Frame:(anonymous namespace)::chunked_read(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long, unsigned long) [0xebd1c]
@davidwendt added the bug, Needs Triage, libcudf, and cuIO labels on Jan 25, 2024
rapids-bot bot pushed a commit that referenced this issue Jan 25, 2024
…hunking. (#14889)

Fixes  #14883

The core issue was that the output chunking code expected every column's terminating page to end at the same row count. Previously this held because we always processed entire row groups. With the subrowgroup reader, however, we can split on page boundaries, which leaves a jagged max row index across columns. Example:

```
          0          100             200     300
Col A     [-----------][--------------]
Col B     [-----------][----------------------]
```

The input chunking would have computed a max row index of 200 for the subpass.  But when computing the _output_ chunks, there was code that would have tried finding where row 300 was in column A, resulting in an out-of-bounds read.

The fix is simply to cap the max row seen for column B to be the max expected row for the subpass.
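
To illustrate the idea outside of libcudf, here is a minimal host-side sketch using the two-column example above. It is not the code from #14889: `column_pages`, `find_page_for_row`, and `capped_row` are hypothetical names, and the subpass max row is passed in directly rather than computed by the input chunking. Looking up row 300 in column A runs past the last page, while the capped row stays within bounds.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one column's pages in a subpass:
// the cumulative end row of each page, sorted ascending.
struct column_pages {
  std::vector<std::size_t> page_end_rows;
};

// Returns the index of the first page whose end row is >= `row`.
// If `row` lies beyond the last page, the result equals the page count,
// which is not a valid page index (the analogue of the out-of-bounds read).
std::size_t find_page_for_row(column_pages const& col, std::size_t row)
{
  auto const it =
    std::lower_bound(col.page_end_rows.begin(), col.page_end_rows.end(), row);
  return static_cast<std::size_t>(it - col.page_end_rows.begin());
}

// The cap described above: never search for a row beyond the max row
// expected for the subpass (assumed here to be supplied by the caller).
std::size_t capped_row(std::size_t row, std::size_t subpass_max_row)
{
  return std::min(row, subpass_max_row);
}
```

With column A ending at rows {100, 200} and column B at {100, 300}, an uncapped lookup of row 300 in column A returns index 2, one past its last page. Capping the row to the subpass max of 200 returns index 1, the last valid page.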

Authors:
  - https://github.com/nvdbaranec

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - Vukasin Milovanovic (https://github.com/vuule)
  - Mike Wilson (https://github.com/hyperbolic2346)

URL: #14889
@bdice removed the Needs Triage label on Mar 4, 2024