[BUG] INTEROP_TEST fails on a libcudf debug build at ToArrowTest.NestedList #17153

Closed
davidwendt opened this issue Oct 23, 2024 · 4 comments · Fixed by #17405
Labels
bug Something isn't working libcudf Affects libcudf (C++/CUDA) code.

Comments

@davidwendt
Contributor

The following assertion failure occurs in the ToArrowTest.NestedList test in INTEROP_TEST:

$ gtests/INTEROP_TEST --gtest_filter=ToArrowTest.NestedList
Note: Google Test filter = ToArrowTest.NestedList
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from ToArrowTest
[ RUN      ] ToArrowTest.NestedList
/cudf/cpp/build/_deps/arrow-src/cpp/src/arrow/array/array_nested.cc:468:  Check failed: self->list_type_->value_type()->Equals(data->child_data[0]->type) 
Aborted (core dumped)

This only occurs in a debug build, likely because the assertion is compiled out in a release build.
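
To illustrate why a check like this can fire only in a debug build, here is a minimal sketch of a debug-only assertion macro. `SKETCH_DCHECK` and `list_count` are hypothetical names made up for this example, and this is not Arrow's actual macro definition; it only assumes the common convention that such checks are disabled when `NDEBUG` is defined.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// SKETCH_DCHECK is a hypothetical name for illustration; it is not Arrow's macro.
#ifdef NDEBUG
// Release build: the check expands to nothing, so the condition is never evaluated.
#define SKETCH_DCHECK(cond) ((void)0)
#else
// Debug build: report the failed condition and abort, similar to the log above.
#define SKETCH_DCHECK(cond)                                       \
  do {                                                            \
    if (!(cond)) {                                                \
      std::fprintf(stderr, "%s:%d: Check failed: %s\n", __FILE__, \
                   __LINE__, #cond);                              \
      std::abort();                                               \
    }                                                             \
  } while (0)
#endif

// Number of lists described by an offsets buffer with `num_offsets` entries.
inline long list_count(long num_offsets) {
  SKETCH_DCHECK(num_offsets >= 1);  // only enforced when NDEBUG is not defined
  return num_offsets - 1;
}
```

With a macro of this shape, invalid input aborts in a debug build but passes silently in a release build, which would match the behavior reported here.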

@davidwendt davidwendt added bug Something isn't working libcudf Affects libcudf (C++/CUDA) code. labels Oct 23, 2024
@davidwendt
Contributor Author

@Vyas @zeroshade
Any idea why this DCHECK is failing in the ToArrowTest.NestedList test?
https://github.com/apache/arrow/blob/7d5a8186a514c57c3db2118677c01f61956396fe/cpp/src/arrow/array/array_nested.cc#L468

Code fails here:

auto nested_list_arr = std::make_shared<arrow::ListArray>(
  arrow::list(arrow::field("a", arrow::list(arrow::int64()), false)),
  offset.size() - 1,
  arrow::Buffer::Wrap(offset),
  list_arr,
  mask_buffer);

The call stack looks like this:

#0  0x00007fd1cb9619fc in pthread_kill () from /usr/lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fd1cb90d476 in raise () from /usr/lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fd1cb8f37f3 in abort () from /usr/lib/x86_64-linux-gnu/libc.so.6
#3  0x000055afad658734 in arrow::util::CerrLog::~CerrLog (this=0x55afb0ef4780, __in_chrg=<optimized out>)
    at /cudf/cpp/build/_deps/arrow-src/cpp/src/arrow/util/logging.cc:74
#4  0x000055afad658750 in arrow::util::CerrLog::~CerrLog (this=0x55afb0ef4780, __in_chrg=<optimized out>)
    at /cudf/cpp/build/_deps/arrow-src/cpp/src/arrow/util/logging.cc:76
#5  0x000055afad65891b in arrow::util::ArrowLog::~ArrowLog (this=0x7ffd2573e640, __in_chrg=<optimized out>)
    at /cudf/cpp/build/_deps/arrow-src/cpp/src/arrow/util/logging.cc:253
#6  0x000055afad2a590f in arrow::internal::SetListData<arrow::ListType> (self=0x55afb1b0a9f0, data=..., 
    expected_type_id=arrow::Type::LIST) at /cudf/cpp/build/_deps/arrow-src/cpp/src/arrow/array/array_nested.cc:468
#7  0x000055afad2a01aa in arrow::ListArray::SetData (this=0x55afb1b0a9f0, data=...)
    at /cudf/cpp/build/_deps/arrow-src/cpp/src/arrow/array/array_nested.cc:494
#8  0x000055afad2a00a8 in arrow::ListArray::ListArray (this=0x55afb1b0a9f0, type=..., length=2, value_offsets=..., values=..., 
    null_bitmap=..., null_count=-1, offset=0) at /cudf/cpp/build/_deps/arrow-src/cpp/src/arrow/array/array_nested.cc:490
#9  0x000055afacc82967 in __gnu_cxx::new_allocator<arrow::ListArray>::construct<arrow::ListArray, std::shared_ptr<arrow::DataType>, unsigned long, std::shared_ptr<arrow::Buffer>, std::shared_ptr<arrow::Array>&, std::shared_ptr<arrow::Buffer>&> (this=0x7ffd2573e9ff, 
    __p=0x55afb1b0a9f0) at /conda/envs/rapids/lib/gcc/x86_64-conda-linux-gnu/11.4.0/include/c++/ext/new_allocator.h:162
#10 0x000055afacc785b9 in std::allocator_traits<std::allocator<arrow::ListArray> >::construct<arrow::ListArray, std::shared_ptr<arrow::DataType>, unsigned long, std::shared_ptr<arrow::Buffer>, std::shared_ptr<arrow::Array>&, std::shared_ptr<arrow::Buffer>&> (__a=..., 
    __p=0x55afb1b0a9f0) at /conda/envs/rapids/lib/gcc/x86_64-conda-linux-gnu/11.4.0/include/c++/bits/alloc_traits.h:516
#11 0x000055afacc6ebad in std::_Sp_counted_ptr_inplace<arrow::ListArray, std::allocator<arrow::ListArray>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<std::shared_ptr<arrow::DataType>, unsigned long, std::shared_ptr<arrow::Buffer>, std::shared_ptr<arrow::Array>&, std::shared_ptr<arrow::Buffer>&> (this=0x55afb1b0a9e0, __a=...)
    at /conda/envs/rapids/lib/gcc/x86_64-conda-linux-gnu/11.4.0/include/c++/bits/shared_ptr_base.h:519
#12 0x000055afacc64463 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<arrow::ListArray, std::allocator<arrow::ListArray>, std::shared_ptr<arrow::DataType>, unsigned long, std::shared_ptr<arrow::Buffer>, std::shared_ptr<arrow::Array>&, std::shared_ptr<arrow::Buffer>&> (this=0x7ffd2573edc8, __p=@0x7ffd2573edc0: 0x0, __a=...)
    at /conda/envs/rapids/lib/gcc/x86_64-conda-linux-gnu/11.4.0/include/c++/bits/shared_ptr_base.h:650
#13 0x000055afacc58ce0 in std::__shared_ptr<arrow::ListArray, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<arrow::ListArray>, std::shared_ptr<arrow::DataType>, unsigned long, std::shared_ptr<arrow::Buffer>, std::shared_ptr<arrow::Array>&, std::shared_ptr<arrow::Buffer>&> (this=0x7ffd2573edc0, __tag=...) at /conda/envs/rapids/lib/gcc/x86_64-conda-linux-gnu/11.4.0/include/c++/bits/shared_ptr_base.h:1342
#14 0x000055afacc4cd68 in std::shared_ptr<arrow::ListArray>::shared_ptr<std::allocator<arrow::ListArray>, std::shared_ptr<arrow::DataType>, unsigned long, std::shared_ptr<arrow::Buffer>, std::shared_ptr<arrow::Array>&, std::shared_ptr<arrow::Buffer>&> (this=0x7ffd2573edc0, 
    __tag=...) at /conda/envs/rapids/lib/gcc/x86_64-conda-linux-gnu/11.4.0/include/c++/bits/shared_ptr.h:409
#15 0x000055afacc410ea in std::allocate_shared<arrow::ListArray, std::allocator<arrow::ListArray>, std::shared_ptr<arrow::DataType>, unsigned long, std::shared_ptr<arrow::Buffer>, std::shared_ptr<arrow::Array>&, std::shared_ptr<arrow::Buffer>&> (__a=...)
    at /conda/envs/rapids/lib/gcc/x86_64-conda-linux-gnu/11.4.0/include/c++/bits/shared_ptr.h:863
#16 0x000055afacc32aff in std::make_shared<arrow::ListArray, std::shared_ptr<arrow::DataType>, unsigned long, std::shared_ptr<arrow::Buffer>, std::shared_ptr<arrow::Array>&, std::shared_ptr<arrow::Buffer>&> ()
    at /conda/envs/rapids/lib/gcc/x86_64-conda-linux-gnu/11.4.0/include/c++/bits/shared_ptr.h:879
#17 0x000055afacc1975a in ToArrowTest_NestedList_Test::TestBody (this=0x55afb0c4f690) at /cudf/cpp/tests/interop/to_arrow_test.cpp:270

@zeroshade
Contributor

That particular DCHECK is validating that the value_type of the list is equal to the data type of the child of the list. Most likely it indicates an issue in properly propagating types.

Given the code snippet you provided, it is checking that list_arr->data_type() should be arrow::list(arrow::int64(), false). So I'm guessing something is borked with that type propagation for the nested lists. I can try to dig into this next week.
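
The invariant described above can be sketched with a toy type model. `ToyType`, `primitive`, `list_of`, and `child_type_matches` are hypothetical names, not Arrow classes, and the equality rule here (purely structural, ignoring field names and nullability) is an assumption made for the sketch rather than a statement of how Arrow compares types. The mismatching case is an arbitrary example, not the diagnosed cause of this failure.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy stand-in for a data type; NOT arrow::DataType.
struct ToyType {
  std::string name;                     // e.g. "int64" or "list"
  std::shared_ptr<ToyType> value_type;  // set only for list types

  // Structural equality: same name and, for lists, equal value types.
  bool Equals(const ToyType& other) const {
    if (name != other.name) return false;
    if (!value_type || !other.value_type) return !value_type && !other.value_type;
    return value_type->Equals(*other.value_type);
  }
};

inline std::shared_ptr<ToyType> primitive(std::string name) {
  return std::make_shared<ToyType>(ToyType{std::move(name), nullptr});
}

inline std::shared_ptr<ToyType> list_of(std::shared_ptr<ToyType> value) {
  return std::make_shared<ToyType>(ToyType{"list", std::move(value)});
}

// Same shape as the failing check: the list type's declared value type
// must equal the type of the child array handed to the constructor.
inline bool child_type_matches(const ToyType& list_type, const ToyType& child_type) {
  return list_type.value_type && list_type.value_type->Equals(child_type);
}
```

In this model, a `list<list<int64>>` built over a `list<int64>` child passes, while the same outer type built over a child of any other type fails the check.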

@wence-
Contributor

wence- commented Nov 20, 2024

@zeroshade Did you get a chance to look at this issue?

@wence- wence- added this to libcudf Nov 20, 2024
@wence- wence- moved this to Needs owner in libcudf Nov 20, 2024
@zeroshade
Contributor

Sorry for the delay here. I'm looking into this right now and will hopefully have something this week.
