Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++] ListArray's Flatten function should support recursive flattening of nested lists #41055

Closed
ZhangHuiGui opened this issue Apr 7, 2024 · 3 comments

Comments

@ZhangHuiGui
Copy link
Collaborator

ZhangHuiGui commented Apr 7, 2024

Describe the enhancement requested

The function FlattenListArray seems only support flat the list array like list(int32),

std::vector<std::shared_ptr<Array>> non_null_fragments;
int64_t valid_begin = 0;
while (valid_begin < list_array_length) {
int64_t valid_end = valid_begin;
while (valid_end < list_array_length &&
(list_array.IsValid(valid_end) || list_array.value_length(valid_end) == 0)) {
++valid_end;
}
if (valid_begin < valid_end) {
non_null_fragments.push_back(
SliceArrayWithOffsets(*value_array, list_array.value_offset(valid_begin),
list_array.value_offset(valid_end)));
}
valid_begin = valid_end + 1; // skip null entry
}

Users need to flatten nested list-arrays( list(list(int32)) ), and then the upper layer needs to implement recursive flatten, which seems inconvenient for them.

@felipecrv do you think we should implement this internally(and also for ListViewArray)?

Component(s)

C++

@felipecrv
Copy link
Contributor

felipecrv commented Apr 8, 2024

Users need to flatten nested list-arrays( list(list(int32)) ), and then the upper layer needs to implement recursive flatten, which seems inconvenient for them.

Right. But we can't change the behavior of existing Flatten to go recursive, but we can add a version that works recursively (and is not naively doing successive concatenations). And yes, it would treat all of (Large)?List(View)? as logical lists that need to be flattened until a non-list-like type is reached.

It would have to be a single implementation that dispatches based on the type->id() of the array because all combinations of list and list-view nesting should be supported.

@felipecrv
Copy link
Contributor

And we shouldn't forget fixed_size_list either. It's logically a list as well.

felipecrv pushed a commit that referenced this issue Apr 15, 2024
…es (#41092)

### Rationale for this change
Support flatten for combining nested list related types.

### What changes are included in this PR?
Add the recursively flatten function for auto detect and flatten the combining nested list types.

### Are these changes tested?
Yes

### Are there any user-facing changes?
Yes, user can flatten a combining nested-list or related array by use `Flatten` API.

* GitHub Issue: #41055

Authored-by: ZhangHuiGui <[email protected]>
Signed-off-by: Felipe Oliveira Carvalho <[email protected]>
@felipecrv
Copy link
Contributor

Issue resolved by pull request 41092
#41092

@felipecrv felipecrv added this to the 17.0.0 milestone Apr 15, 2024
tolleybot pushed a commit to tmct/arrow that referenced this issue May 2, 2024
…ed types (apache#41092)

### Rationale for this change
Support flatten for combining nested list related types.

### What changes are included in this PR?
Add the recursively flatten function for auto detect and flatten the combining nested list types.

### Are these changes tested?
Yes

### Are there any user-facing changes?
Yes, user can flatten a combining nested-list or related array by use `Flatten` API.

* GitHub Issue: apache#41055

Authored-by: ZhangHuiGui <[email protected]>
Signed-off-by: Felipe Oliveira Carvalho <[email protected]>
vibhatha pushed a commit to vibhatha/arrow that referenced this issue May 25, 2024
…ed types (apache#41092)

### Rationale for this change
Support flatten for combining nested list related types.

### What changes are included in this PR?
Add the recursively flatten function for auto detect and flatten the combining nested list types.

### Are these changes tested?
Yes

### Are there any user-facing changes?
Yes, user can flatten a combining nested-list or related array by use `Flatten` API.

* GitHub Issue: apache#41055

Authored-by: ZhangHuiGui <[email protected]>
Signed-off-by: Felipe Oliveira Carvalho <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants