Fix logic in to_arrow for empty list column (#16279)
An empty list column need not have empty children; it just needs to have zero length. In this case, the offsets array will have zero length, and we need to create a temporary buffer.
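As a rough illustration of why a temporary buffer is needed: the code this commit removes notes that an empty Arrow list still carries one 4-byte offset, while the zero-length column has no offsets at all. The sketch below models that with plain vectors. `arrow_style_offsets` and `num_lists` are hypothetical helpers for illustration only, not cudf or Arrow functions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of cudf or Arrow): given the offsets of a
// list column, return offsets in the layout the removed code assumed for
// Arrow, where even an empty list array carries a single offset of 0.
std::vector<std::int32_t> arrow_style_offsets(std::vector<std::int32_t> const& offsets)
{
  if (offsets.empty()) {
    // Zero rows: one offset with value 0, i.e. sizeof(int32_t) bytes.
    return {0};
  }
  return offsets;
}

// Number of list rows described by an offsets vector holding rows + 1 entries.
std::size_t num_lists(std::vector<std::int32_t> const& arrow_offsets)
{
  return arrow_offsets.size() - 1;
}
```

The commit itself avoids building this buffer by hand and delegates to `arrow::MakeEmptyArray` instead.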

Now that this branch runs, fix two errors in the construction of the arrow array:

1. The element type, if there are children, should be taken from the child array;
2. If the child arrays are empty, we must make an empty null array, rather than passing a null pointer as the values array, otherwise we hit a segfault inside arrow.

The previous fix in #16201 correctly handled the empty children case (except for point two), but not the first case, which we handle here.

Since we weren't previously going down this code path (child_arrays was never empty), we never hit the latent segfault from point two.
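The branch condition and the element-type choice can be sketched without Arrow. This is a minimal model, assuming (as in the diff) that index 0 of the child arrays holds the offsets and index 1 holds the values. `child_info`, `takes_empty_path` and `empty_list_element_type` are hypothetical names used only for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for an arrow::Array: just a type name and a length.
struct child_info {
  std::string type;
  std::int64_t length;
};

// Mirrors the new condition: take the empty-list path when there are no child
// arrays at all, or when the offsets child (index 0) has zero length.
bool takes_empty_path(std::vector<child_info> const& children)
{
  return children.empty() || children[0].length == 0;
}

// Mirrors the element-type choice: fall back to a null type only when there
// are no children; otherwise use the type of the values child (index 1).
std::string empty_list_element_type(std::vector<child_info> const& children)
{
  return children.empty() ? std::string{"null"} : children[1].type;
}
```

Under this model, a zero-length list column whose values child is non-empty still takes the empty path, and the resulting empty list keeps the values child's type rather than degrading to a null element type.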

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - David Wendt (https://github.com/davidwendt)
  - MithunR (https://github.com/mythrocks)

URL: #16279
wence- authored Jul 16, 2024
1 parent beda22e commit 669db3e
Showing 1 changed file with 5 additions and 7 deletions.
12 changes: 5 additions & 7 deletions cpp/src/interop/to_arrow.cu
@@ -378,13 +378,11 @@ std::shared_ptr<arrow::Array> dispatch_to_arrow::operator()<cudf::list_view>(
   auto children_meta =
     metadata.children_meta.empty() ? std::vector<column_metadata>{{}, {}} : metadata.children_meta;
   auto child_arrays = fetch_child_array(input_view, children_meta, ar_mr, stream);
-  if (child_arrays.empty()) {
-    // Empty list will have only one value in offset of 4 bytes
-    auto tmp_offset_buffer = allocate_arrow_buffer(sizeof(int32_t), ar_mr);
-    memset(tmp_offset_buffer->mutable_data(), 0, sizeof(int32_t));
-
-    return std::make_shared<arrow::ListArray>(
-      arrow::list(arrow::null()), 0, std::move(tmp_offset_buffer), nullptr);
+  if (child_arrays.empty() || child_arrays[0]->data()->length == 0) {
+    auto element_type = child_arrays.empty() ? arrow::null() : child_arrays[1]->type();
+    auto result = arrow::MakeEmptyArray(arrow::list(element_type), ar_mr);
+    CUDF_EXPECTS(result.ok(), "Failed to construct empty arrow list array\n");
+    return result.ValueUnsafe();
   }

   auto offset_buffer = child_arrays[0]->data()->buffers[1];