Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] off-by-one in parquet writer fragment offset calc #13431

Closed
wence- opened this issue May 24, 2023 · 3 comments · Fixed by #13434
Closed

[BUG] off-by-one in parquet writer fragment offset calc #13431

wence- opened this issue May 24, 2023 · 3 comments · Fixed by #13434
Labels
2 - In Progress Currently a work in progress bug Something isn't working cuIO cuIO issue libcudf Affects libcudf (C++/CUDA) code.

Comments

@wence-
Copy link
Contributor

wence- commented May 24, 2023

Describe the bug

In src/io/parquet/writer_impl.cu, there is this bit of code that fills up a fragment_offsets vector (introduced in #12685)

  std::vector<size_type> frag_offsets;
  size_type const total_frags = [&]() {
    if (frags_per_column.size() > 0) {
      std::exclusive_scan(frags_per_column.data(),
                          frags_per_column.data() + num_columns + 1,
                          std::back_inserter(frag_offsets),
                          0);
      return frag_offsets[num_columns];
    } else {
      return 0;
    }
  }();

However, AFAICT, frags_per_column only has num_columns entries, so this exscan is UB because frags_per_column.data() + num_columns + 1 is two-past-the-end.

Steps/Code to reproduce bug

#include <cudf/column/column.hpp>
#include <cudf/io/parquet.hpp>
#include <cudf/table/table.hpp>
#include <memory>
#include <utility>
#include <vector>

int main() {
  std::vector<int> x{1, 2, 3, 4, 5, 6};

  auto column = std::make_unique(cudf::column(
      {cudf::type_id::INT32}, x.size(),
      {x.data(), x.size() * sizeof(int), cudf::get_default_stream()}, {}, 0));
  auto table = cudf::table{column};
  auto opts = cudf::io::parquet_writer_options()
                  .builder({"cudf.pq"}, table.view())
                  .build();
  auto result = cudf::io::write_parquet(opts);
  return 0;
}

Compile and run under valgrind:

==72255== Invalid read of size 4
==72255==    at 0x63548DD: decltype (((forward<int&>)({parm#1}))+((forward<int&>)({parm#2}))) std::plus<void>::operator()<int&, int&>(int&, int&) const (stl_function.h:252)
==72255==    by 0x63565A0: std::back_insert_iterator<std::vector<int, std::allocator<int> > > std::exclusive_scan<int*, std::back_insert_iterator<std::vector<int, std::allocator<int> > >, int, std::plus<void> >(int*, int*, std::back_insert_iterator<std::vector<int, std::allocator<int> > >, int, std::plus<void>) (numeric:480)
==72255==    by 0x634AD64: std::back_insert_iterator<std::vector<int, std::allocator<int> > > std::exclusive_scan<int*, std::back_insert_iterator<std::vector<int, std::allocator<int> > >, int>(int*, int*, std::back_insert_iterator<std::vector<int, std::allocator<int> > >, int) (numeric:510)
==72255==    by 0x633108A: cudf::io::detail::parquet::(anonymous namespace)::convert_table_to_parquet_data(cudf::io::table_input_metadata&, cudf::table_view const&, cudf::host_span<cudf::io::partition_info const, 18446744073709551615ul>, cudf::host_span<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const, 18446744073709551615ul>, std::unique_ptr<cudf::io::detail::parquet::aggregate_writer_metadata, std::default_delete<cudf::io::detail::parquet::aggregate_writer_metadata> > const&, std::optional<int>, unsigned long, unsigned long, int, int, int, cudf::io::statistics_freq, cudf::io::parquet::Compression, bool, cudf::io::dictionary_policy, unsigned long, cudf::io::detail::single_write_mode, bool, cudf::host_span<std::unique_ptr<cudf::io::data_sink, std::default_delete<cudf::io::data_sink> > const, 18446744073709551615ul>, rmm::cuda_stream_view)::{lambda()#7}::operator()() const (writer_impl.cu:1712)
==72255==    by 0x633271E: cudf::io::detail::parquet::(anonymous namespace)::convert_table_to_parquet_data(cudf::io::table_input_metadata&, cudf::table_view const&, cudf::host_span<cudf::io::partition_info const, 18446744073709551615ul>, cudf::host_span<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const, 18446744073709551615ul>, std::unique_ptr<cudf::io::detail::parquet::aggregate_writer_metadata, std::default_delete<cudf::io::detail::parquet::aggregate_writer_metadata> > const&, std::optional<int>, unsigned long, unsigned long, int, int, int, cudf::io::statistics_freq, cudf::io::parquet::Compression, bool, cudf::io::dictionary_policy, unsigned long, cudf::io::detail::single_write_mode, bool, cudf::host_span<std::unique_ptr<cudf::io::data_sink, std::default_delete<cudf::io::data_sink> > const, 18446744073709551615ul>, rmm::cuda_stream_view) (writer_impl.cu:1720)
==72255==    by 0x6334AFB: cudf::io::detail::parquet::writer::impl::write(cudf::table_view const&, std::vector<cudf::io::partition_info, std::allocator<cudf::io::partition_info> > const&)::{lambda()#1}::operator()() const (writer_impl.cu:2067)
==72255==    by 0x6334CF4: cudf::io::detail::parquet::writer::impl::write(cudf::table_view const&, std::vector<cudf::io::partition_info, std::allocator<cudf::io::partition_info> > const&) (writer_impl.cu:2093)
==72255==    by 0x63364F2: cudf::io::detail::parquet::writer::write(cudf::table_view const&, std::vector<cudf::io::partition_info, std::allocator<cudf::io::partition_info> > const&) (writer_impl.cu:2311)
==72255==    by 0x621594A: cudf::io::write_parquet(cudf::io::parquet_writer_options const&) (in /home/wence/Documents/src/rapids/cudf/cpp/build/cuda-11.8.0/wence__fix__13265-13270/release/libcudf.so)
==72255==    by 0x10B81C: main (in /home/wence/Documents/src/rapids/doodles/c++/cudf-parquet)
==72255==  Address 0x82026d64 is 0 bytes after a block of size 4 alloc'd
==72255==    at 0x4849013: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==72255==    by 0x63774A1: __gnu_cxx::new_allocator<int>::allocate(unsigned long, void const*) (new_allocator.h:127)
==72255==    by 0x636BD81: std::allocator_traits<std::allocator<int> >::allocate(std::allocator<int>&, unsigned long) (alloc_traits.h:464)
==72255==    by 0x636197B: std::_Vector_base<int, std::allocator<int> >::_M_allocate(unsigned long) (stl_vector.h:346)
==72255==    by 0x635FA90: std::_Vector_base<int, std::allocator<int> >::_M_create_storage(unsigned long) (stl_vector.h:361)
==72255==    by 0x63544CE: std::_Vector_base<int, std::allocator<int> >::_Vector_base(unsigned long, std::allocator<int> const&) (stl_vector.h:305)
==72255==    by 0x63495CA: std::vector<int, std::allocator<int> >::vector(unsigned long, int const&, std::allocator<int> const&) (stl_vector.h:523)
==72255==    by 0x6331E82: cudf::io::detail::parquet::(anonymous namespace)::convert_table_to_parquet_data(cudf::io::table_input_metadata&, cudf::table_view const&, cudf::host_span<cudf::io::partition_info const, 18446744073709551615ul>, cudf::host_span<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const, 18446744073709551615ul>, std::unique_ptr<cudf::io::detail::parquet::aggregate_writer_metadata, std::default_delete<cudf::io::detail::parquet::aggregate_writer_metadata> > const&, std::optional<int>, unsigned long, unsigned long, int, int, int, cudf::io::statistics_freq, cudf::io::parquet::Compression, bool, cudf::io::dictionary_policy, unsigned long, cudf::io::detail::single_write_mode, bool, cudf::host_span<std::unique_ptr<cudf::io::data_sink, std::default_delete<cudf::io::data_sink> > const, 18446744073709551615ul>, rmm::cuda_stream_view) (writer_impl.cu:1638)
==72255==    by 0x6334AFB: cudf::io::detail::parquet::writer::impl::write(cudf::table_view const&, std::vector<cudf::io::partition_info, std::allocator<cudf::io::partition_info> > const&)::{lambda()#1}::operator()() const (writer_impl.cu:2067)
==72255==    by 0x6334CF4: cudf::io::detail::parquet::writer::impl::write(cudf::table_view const&, std::vector<cudf::io::partition_info, std::allocator<cudf::io::partition_info> > const&) (writer_impl.cu:2093)
==72255==    by 0x63364F2: cudf::io::detail::parquet::writer::write(cudf::table_view const&, std::vector<cudf::io::partition_info, std::allocator<cudf::io::partition_info> > const&) (writer_impl.cu:2311)
==72255==    by 0x621594A: cudf::io::write_parquet(cudf::io::parquet_writer_options const&) (in /home/wence/Documents/src/rapids/cudf/cpp/build/cuda-11.8.0/wence__fix__13265-13270/release/libcudf.so)

Expected behavior

No out-of-bounds access.

If I understand this bit of code correctly, it is trying to produce an offsets array of length num_columns + 1 from a size array of length num_columns where the final entry should be the sum of all sizes (i.e. an inclusive_scan of the size array preceded by 0).

So I think this patch DTRT:

diff --git a/cpp/src/io/parquet/writer_impl.cu b/cpp/src/io/parquet/writer_impl.cu
index 05d42cd9e2..ea0535a514 100644
--- a/cpp/src/io/parquet/writer_impl.cu
+++ b/cpp/src/io/parquet/writer_impl.cu
@@ -1706,13 +1706,12 @@ auto convert_table_to_parquet_data(table_input_metadata& table_meta,
   // fragments with a (potentially) varying number of fragments per column.
 
   // first figure out the total number of fragments and calculate the start offset for each column
-  std::vector<size_type> frag_offsets;
+  std::vector<size_type> frag_offsets(num_columns + 1, 0);
   size_type const total_frags = [&]() {
     if (frags_per_column.size() > 0) {
-      std::exclusive_scan(frags_per_column.data(),
-                          frags_per_column.data() + num_columns + 1,
-                          std::back_inserter(frag_offsets),
-                          0);
+      std::inclusive_scan(frags_per_column.data(),
+                          frags_per_column.data() + num_columns,
+                          frag_offsets.data() + 1);
       return frag_offsets[num_columns];
     } else {
       return 0;

But there may be a more idiomatic way.

cc @etseidl, @ttnghia

@wence- wence- added bug Something isn't working Needs Triage Need team to review and classify libcudf Affects libcudf (C++/CUDA) code. labels May 24, 2023
@etseidl
Copy link
Contributor

etseidl commented May 24, 2023

We can't switch to an inclusive scan because the frag_offsets array is used later on, and is expected to start at offset 0. I remember wondering if this was valid at the time I wrote it, but thought I had seen the +1 used elsewhere. Probably better to just return frag_offsets[num_columns-1] + frags_per_column[num_columns-1] or resize frags_per_column to be of length num_columns + 1

@wence-
Copy link
Contributor Author

wence- commented May 24, 2023

Note that I changed the scan to initialise frag_offsets to be a vector of length num_columns + 1 of all zeros, and then pass a pointer to the first (not zeroth) element, so this does do, I think, what you want. Which, IIUC, is turn [3, 5, 7] (say) into [0, 3, 8, 15].

@etseidl
Copy link
Contributor

etseidl commented May 24, 2023

Note that I changed the scan to initialise frag_offsets to be a vector of length num_columns + 1 of all zeros, and then pass a pointer to the first (not zeroth) element, so this does do, I think, what you want. Which, IIUC, is turn [3, 5, 7] (say) into [0, 3, 8, 15].

ahh...missed that.

@GregoryKimball GregoryKimball added 2 - In Progress Currently a work in progress cuIO cuIO issue and removed Needs Triage Need team to review and classify labels May 24, 2023
@GregoryKimball GregoryKimball moved this to Burndown PRs in libcudf May 24, 2023
@GregoryKimball GregoryKimball removed the status in libcudf May 24, 2023
@GregoryKimball GregoryKimball removed this from libcudf Jun 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2 - In Progress Currently a work in progress bug Something isn't working cuIO cuIO issue libcudf Affects libcudf (C++/CUDA) code.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants