[BUG] Segfault in libcudf debug build running PARQUET_TEST #9935

Closed
davidwendt opened this issue Dec 18, 2021 · 1 comment · Fixed by #9938
Labels
bug (Something isn't working), cuIO (cuIO issue), libcudf (Affects libcudf (C++/CUDA) code)

Comments

@davidwendt
Contributor

A change made to 22.02 in the last week or so is causing this failure. The segfault occurs only in a libcudf DEBUG build.

# gtests/PARQUET_TEST 
[==========] Running 114 tests from 35 test suites.
[----------] Global test environment set-up.
[----------] 2 tests from ParquetWriterNumericTypeTest/0, where TypeParam = signed char
[ RUN      ] ParquetWriterNumericTypeTest/0.SingleColumn
Segmentation fault (core dumped)

@davidwendt added the bug, Needs Triage, libcudf, and cuIO labels on Dec 18, 2021
@davidwendt
Contributor Author

Here is a cuda-gdb session in case it is helpful.

# cuda-gdb gtests/PARQUET_TEST 
NVIDIA (R) CUDA Debugger
11.5 release
Portions Copyright (C) 2007-2021 NVIDIA Corporation
GNU gdb (GDB) 10.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from gtests/PARQUET_TEST...
(cuda-gdb) run
Starting program: /cudf/cpp/build/gtests/PARQUET_TEST 
warning: Error disabling address space randomization: Operation not permitted
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fc2c77ff000 (LWP 34560)]
[Detaching after fork from child process 34561]
[New Thread 0x7fc2c0ffe000 (LWP 34565)]
[New Thread 0x7fc2affff000 (LWP 34566)]
[New Thread 0x7fc2af7fe000 (LWP 34567)]
[==========] Running 114 tests from 35 test suites.
[----------] Global test environment set-up.
[----------] 2 tests from ParquetWriterNumericTypeTest/0, where TypeParam = signed char
[ RUN      ] ParquetWriterNumericTypeTest/0.SingleColumn

Thread 1 "PARQUET_TEST" received signal SIGSEGV, Segmentation fault.
0x000055844fd6fd73 in std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >::_S_right (__x=0x67770)
    at /usr/include/c++/9/bits/stl_tree.h:798
798	      { return static_cast<_Link_type>(__x->_M_right); }
(cuda-gdb) bt
#0  0x000055844fd6fd73 in std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >::_S_right (__x=0x67770)
    at /usr/include/c++/9/bits/stl_tree.h:798
#1  0x000055844fd4c955 in std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >::_M_erase (this=0x558498e25b28, 
    __x=0x67770) at /usr/include/c++/9/bits/stl_tree.h:1913
#2  0x000055844fd4c967 in std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >::_M_erase (this=0x558498e25b28, 
    __x=0x558498e2a7f8) at /usr/include/c++/9/bits/stl_tree.h:1913
#3  0x000055844fd2dbba in std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >::~_Rb_tree (this=0x558498e25b28, 
    __in_chrg=<optimized out>) at /usr/include/c++/9/bits/stl_tree.h:995
#4  0x000055844fd186a8 in std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >::~map (
    this=0x558498e25b28, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/stl_map.h:300
#5  0x00007fc2d4b07ea8 in cudf::io::detail::parquet::aggregate_metadata::~aggregate_metadata (
    this=0x558498e25b10, __in_chrg=<optimized out>) at /cudf/cpp/src/io/parquet/reader_impl.cu:304
#6  0x00007fc2d4b07ede in std::default_delete<cudf::io::detail::parquet::aggregate_metadata>::operator() (this=0x558498e2ca88, __ptr=0x558498e25b10) at /usr/include/c++/9/bits/unique_ptr.h:81
#7  0x00007fc2d4b03a72 in std::unique_ptr<cudf::io::detail::parquet::aggregate_metadata, std::default_delete<cudf::io::detail::parquet::aggregate_metadata> >::~unique_ptr (this=0x558498e2ca88, 
    __in_chrg=<optimized out>) at /usr/include/c++/9/bits/unique_ptr.h:292
#8  0x00007fc2d4b3c3a4 in cudf::io::detail::parquet::writer::impl::~impl (this=0x558498e2ca60, 
    __in_chrg=<optimized out>) at /cudf/cpp/src/io/parquet/writer_impl.cu:1175
#9  0x00007fc2d4b5044c in std::default_delete<cudf::io::detail::parquet::writer::impl>::operator() (
    this=0x558498e2b790, __ptr=0x558498e2ca60) at /usr/include/c++/9/bits/unique_ptr.h:81
#10 0x00007fc2d4b4a2b2 in std::unique_ptr<cudf::io::detail::parquet::writer::impl, std::default_delete<cudf::io::detail::parquet::writer::impl> >::~unique_ptr (this=0x558498e2b790, 
    __in_chrg=<optimized out>) at /usr/include/c++/9/bits/unique_ptr.h:292
#11 0x00007fc2d4b3fcba in cudf::io::detail::parquet::writer::~writer (this=0x558498e2b790, 
    __in_chrg=<optimized out>) at /cudf/cpp/include/cudf/io/detail/parquet.hpp:83
#12 0x000055844fd52abc in std::default_delete<cudf::io::detail::parquet::writer>::operator() (
    this=0x7ffc27276558, __ptr=0x558498e2b790) at /usr/include/c++/9/bits/unique_ptr.h:81
#13 0x000055844fd313e8 in std::unique_ptr<cudf::io::detail::parquet::writer, std::default_delete<cudf::io::detail::parquet::writer> >::~unique_ptr (this=0x7ffc27276558, __in_chrg=<optimized out>)
    at /usr/include/c++/9/bits/unique_ptr.h:292
#14 0x00007fc2d49b837e in cudf::io::write_parquet (options=..., mr=0x558450cfc860)
    at /cudf/cpp/src/io/functions.cpp:457
#15 0x000055844fe6b6e3 in ParquetWriterNumericTypeTest_SingleColumn_Test<signed char>::TestBody (
    this=0x5584512a4880) at /cudf/cpp/tests/io/parquet_test.cpp:285
#16 0x00007fc2cc59ca99 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (object=0x5584512a4880, method=&virtual testing::Test::TestBody(), 
    location=0x7fc2cc5b1ddb "the test body")
    at /cudf/cpp/build/_deps/gtest-src/googletest/src/gtest.cc:2433
#17 0x00007fc2cc5951b1 in testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>
    (object=0x5584512a4880, method=&virtual testing::Test::TestBody(), 
    location=0x7fc2cc5b1ddb "the test body")
    at /cudf/cpp/build/_deps/gtest-src/googletest/src/gtest.cc:2469
#18 0x00007fc2cc56f556 in testing::Test::Run (this=0x5584512a4880)
    at /cudf/cpp/build/_deps/gtest-src/googletest/src/gtest.cc:2508
#19 0x00007fc2cc56ff41 in testing::TestInfo::Run (this=0x558450af19e0)
    at /cudf/cpp/build/_deps/gtest-src/googletest/src/gtest.cc:2684
#20 0x00007fc2cc570699 in testing::TestSuite::Run (this=0x558450ce87d0)
    at /cudf/cpp/build/_deps/gtest-src/googletest/src/gtest.cc:2816
#21 0x00007fc2cc57c843 in testing::internal::UnitTestImpl::RunAllTests (this=0x558450a84ab0)
    at /cudf/cpp/build/_deps/gtest-src/googletest/src/gtest.cc:5338
#22 0x00007fc2cc59dfc2 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x558450a84ab0, 
    method=(bool (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const)) 0x7fc2cc57c426 <testing::internal::UnitTestImpl::RunAllTests()>, 
    location=0x7fc2cc5b2818 "auxiliary test code (environments or event listeners)")
    at /cudf/cpp/build/_deps/gtest-src/googletest/src/gtest.cc:2433
#23 0x00007fc2cc5963ef in testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x558450a84ab0, 
    method=(bool (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const)) 0x7fc2cc57c426 <testing::internal::UnitTestImpl::RunAllTests()>, 
    location=0x7fc2cc5b2818 "auxiliary test code (environments or event listeners)")
    at /cudf/cpp/build/_deps/gtest-src/googletest/src/gtest.cc:2469
#24 0x00007fc2cc57b04d in testing::UnitTest::Run (
    this=0x7fc2cc5dc5e0 <testing::UnitTest::GetInstance()::instance>)
    at /cudf/cpp/build/_deps/gtest-src/googletest/src/gtest.cc:4925
#25 0x000055844fd0ec41 in RUN_ALL_TESTS () at /conda/envs/rapids/include/gtest/gtest.h:2473
#26 0x000055844fcd9ed0 in main (argc=1, argv=0x7ffc27277098)
    at /cudf/cpp/tests/io/parquet_test.cpp:3202
(cuda-gdb) quit

Also, running this against the latest code gives a different failure:

# gtests/PARQUET_TEST 
[==========] Running 114 tests from 35 test suites.
[----------] Global test environment set-up.
[----------] 2 tests from ParquetWriterNumericTypeTest/0, where TypeParam = signed char
[ RUN      ] ParquetWriterNumericTypeTest/0.SingleColumn
double free or corruption (!prev)
Aborted (core dumped)
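
For anyone reproducing this, the following is a minimal sketch of re-running only the failing test under the debugger without the interactive pager prompts that break up the backtrace. It assumes the same working directory as the sessions shown here and that `cuda-gdb` accepts the standard GDB options `-batch`, `-ex`, and `--args`. The variables `TEST_BIN`, `DEBUGGER`, and `RUN` are illustrative names and are not part of cudf.

```shell
#!/bin/sh
# Sketch: re-run only the failing test under the debugger, non-interactively.
# TEST_BIN, DEBUGGER and RUN are hypothetical variables, not part of cudf.
TEST_BIN="${TEST_BIN:-gtests/PARQUET_TEST}"
DEBUGGER="${DEBUGGER:-cuda-gdb}"
FILTER="ParquetWriterNumericTypeTest/0.SingleColumn"

# -batch exits once the listed commands have run; "set pagination off"
# suppresses the "--Type <RET> for more--" prompts in the backtrace.
set -- "$DEBUGGER" -batch \
  -ex "set pagination off" -ex run -ex bt \
  --args "$TEST_BIN" "--gtest_filter=$FILTER"

if [ "${RUN:-0}" = "1" ]; then
  "$@"          # actually launch the debugger
else
  echo "$@"     # dry run: only print the command that would be executed
fi
```

By default the script only prints the command. Setting `RUN=1` in the environment launches it, and `DEBUGGER=gdb` may be substituted where `cuda-gdb` is unavailable.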
