[BUG] Memcheck error in ParquetChunkedReaderInputLimitTest.Mixed #15690

Closed
davidwendt opened this issue May 7, 2024 · 1 comment · Fixed by #15735
Labels
bug Something isn't working libcudf Affects libcudf (C++/CUDA) code.

Comments

@davidwendt
Contributor

Changes to this test in PR #15672 report an out-of-bounds read error with compute-sanitizer.

With the changes from #15672 applied, run the following command to reproduce the error:

compute-sanitizer --tool memcheck gtests/PARQUET_TEST --gtest_filter=ParquetChunkedReaderInputLimitTest.Mixed --rmm_mode=cuda

Partial result

========= COMPUTE-SANITIZER
Note: Google Test filter = ParquetChunkedReaderInputLimitTest.Mixed
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from ParquetChunkedReaderInputLimitTest
[ RUN      ] ParquetChunkedReaderInputLimitTest.Mixed
========= Invalid __global__ read of size 8 bytes
=========     at 0x7f0 in void thrust::cuda_cub::core::_kernel_agent<thrust::cuda_cub::__parallel_for::ParallelForAgent<thrust::cuda_cub::__transform::unary_transform_f<cudf::io::parquet::detail::<unnamed>::cumulative_page_info *, cudf::io::parquet::detail::<unnamed>::cumulative_page_info *, thrust::cuda_cub::__transform::no_stencil_tag, cudf::io::parquet::detail::<unnamed>::page_total_size, thrust::cuda_cub::__transform::always_true_predicate>, long>, thrust::cuda_cub::__transform::unary_transform_f<cudf::io::parquet::detail::<unnamed>::cumulative_page_info *, cudf::io::parquet::detail::<unnamed>::cumulative_page_info *, thrust::cuda_cub::__transform::no_stencil_tag, cudf::io::parquet::detail::<unnamed>::page_total_size, thrust::cuda_cub::__transform::always_true_predicate>, long>(T2, T3)
=========     by thread (219,0,0) in block (2,0,0)
=========     Address 0x7f59441e96a8 is out of bounds
=========     and is 9 bytes after the nearest allocation at 0x7f59441e0a00 of size 36,000 bytes
=========     Saved host backtrace up to driver entry point at kernel launch time
=========     Host Frame: [0x2ef370]
=========                in /usr/lib/x86_64-linux-gnu/libcuda.so.1
=========     Host Frame: [0x1488c]
=========                in /conda/envs/rapids/lib/libcudart.so.11.0
=========     Host Frame:cudaLaunchKernel [0x6c318]
=========                in /conda/envs/rapids/lib/libcudart.so.11.0
=========     Host Frame:cudaError thrust::cuda_cub::__parallel_for::parallel_for<thrust::cuda_cub::__transform::unary_transform_f<cudf::io::parquet::detail::(anonymous namespace)::cumulative_page_info*, cudf::io::parquet::detail::(anonymous namespace)::cumulative_page_info*, thrust::cuda_cub::__transform::no_stencil_tag, cudf::io::parquet::detail::(anonymous namespace)::page_total_size, thrust::cuda_cub::__transform::always_true_predicate>, long>(long, thrust::cuda_cub::__transform::unary_transform_f<cudf::io::parquet::detail::(anonymous namespace)::cumulative_page_info*, cudf::io::parquet::detail::(anonymous namespace)::cumulative_page_info*, thrust::cuda_cub::__transform::no_stencil_tag, cudf::io::parquet::detail::(anonymous namespace)::page_total_size, thrust::cuda_cub::__transform::always_true_predicate>, CUstream_st*) [0x149a79a]
=========                in /conda/envs/rapids/lib/libcudf.so
=========     Host Frame:cudf::io::parquet::detail::(anonymous namespace)::adjust_cumulative_sizes(cudf::device_span<cudf::io::parquet::detail::(anonymous namespace)::cumulative_page_info const, 18446744073709551615ul>, cudf::device_span<cudf::io::parquet::detail::PageInfo const, 18446744073709551615ul>, rmm::cuda_stream_view) [0x1483e3b]
=========                in /conda/envs/rapids/lib/libcudf.so
=========     Host Frame:cudf::io::parquet::detail::(anonymous namespace)::compute_next_subpass(cudf::device_span<cudf::io::parquet::detail::(anonymous namespace)::cumulative_page_info const, 18446744073709551615ul>, cudf::device_span<cudf::io::parquet::detail::PageInfo const, 18446744073709551615ul>, cudf::device_span<cudf::io::parquet::detail::ColumnChunkDesc const, 18446744073709551615ul>, cudf::device_span<int const, 18446744073709551615ul>, unsigned long, unsigned long, unsigned long, rmm::cuda_stream_view) [0x148762a]
=========                in /conda/envs/rapids/lib/libcudf.so
=========     Host Frame:cudf::io::parquet::detail::reader::impl::setup_next_subpass(bool)::{lambda()#1}::operator()() const [0x148a7db]
=========                in /conda/envs/rapids/lib/libcudf.so
=========     Host Frame:cudf::io::parquet::detail::reader::impl::setup_next_subpass(bool) [0x1491b75]
=========                in /conda/envs/rapids/lib/libcudf.so
=========     Host Frame:cudf::io::parquet::detail::reader::impl::has_next() [0x146c362]
=========                in /conda/envs/rapids/lib/libcudf.so
=========     Host Frame:cudf::io::chunked_parquet_reader::has_next() const [0x131a37f]
=========                in /conda/envs/rapids/lib/libcudf.so
=========     Host Frame:(anonymous namespace)::chunked_read(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, unsigned long, unsigned long) [0xe1786]
=========                in /cudf/cpp/build/gtests/PARQUET_TEST
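
To make the report above easier to read, the faulting address can be converted to a byte offset from the start of the nearest allocation. The sketch below only redoes the arithmetic on the values printed in the log; `byte_offset` and the constant names are hypothetical and not part of libcudf or compute-sanitizer.

```cpp
#include <cstdint>

// Hypothetical helper (not a libcudf or compute-sanitizer API):
// distance in bytes from the start of an allocation to an address.
constexpr std::uint64_t byte_offset(std::uint64_t address, std::uint64_t base)
{
  return address - base;
}

// Values copied from the compute-sanitizer report.
constexpr std::uint64_t fault_address   = 0x7f59441e96a8ULL;  // "Address ... is out of bounds"
constexpr std::uint64_t allocation_base = 0x7f59441e0a00ULL;  // "nearest allocation at ..."
constexpr std::uint64_t allocation_size = 36000;              // "of size 36,000 bytes"

// The 8-byte read starts at offset 36,008, past the 36,000-byte allocation.
static_assert(byte_offset(fault_address, allocation_base) == 36008,
              "read starts at offset 36,008");
static_assert(byte_offset(fault_address, allocation_base) - allocation_size == 8,
              "8 bytes past the one-past-the-end address");
```

The read therefore begins 8 bytes beyond the one-past-the-end address of the buffer. The report's "9 bytes after" figure is presumably counted from the last valid byte (offset 35,999); that reading is an assumption, not something stated in the log.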

@davidwendt davidwendt added bug Something isn't working libcudf Affects libcudf (C++/CUDA) code. labels May 7, 2024
@davidwendt
Contributor Author

@nvdbaranec

rapids-bot bot pushed a commit that referenced this issue May 16, 2024
Fixes #15690

There was an issue when computing page row counts/indices at the pass level in the chunked reader. Because we estimate list row counts for pages we have not yet decompressed, the estimated row counts can sometimes be larger than the actual (known) number of rows for a pass. This caused an out-of-bounds read down the line. We were already handling this at the subpass level, just not at the pass level.

Also includes some fixes in debug logging code that is #ifdef'd out.

Authors:
  - https://github.com/nvdbaranec
  - David Wendt (https://github.com/davidwendt)
  - Vukasin Milovanovic (https://github.com/vuule)

Approvers:
  - David Wendt (https://github.com/davidwendt)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #15735
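
The commit message describes estimated row counts exceeding the known number of rows in a pass. One way to picture a guard against that is to cap each estimate at the known pass row count. This is a minimal host-side sketch under that assumption, not the code from #15735; `clamp_row_estimate`, `clamp_page_end_rows`, and their parameters are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: cap an estimated cumulative end-row index at the
// known number of rows in the pass.
constexpr std::size_t clamp_row_estimate(std::size_t estimated_end_row,
                                         std::size_t pass_num_rows)
{
  return std::min(estimated_end_row, pass_num_rows);
}

// Hypothetical helper: apply the cap to every page's estimated end row.
// Takes the vector by value and returns the clamped copy.
inline std::vector<std::size_t> clamp_page_end_rows(
  std::vector<std::size_t> estimated_end_rows, std::size_t pass_num_rows)
{
  for (auto& end_row : estimated_end_rows) {
    end_row = clamp_row_estimate(end_row, pass_num_rows);
  }
  return estimated_end_rows;
}
```

For example, with a pass known to hold 1,000 rows, estimated end rows of 400, 900, and 1,200 would become 400, 900, and 1,000, so later lookups keyed on those values stay within the pass.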