Support `get_element` from `LIST` column #8071
Codecov Report
@@ Coverage Diff @@
## branch-0.20 #8071 +/- ##
===============================================
- Coverage 82.88% 82.88% -0.01%
===============================================
Files 103 104 +1
Lines 17668 17899 +231
===============================================
+ Hits 14645 14836 +191
- Misses 3023 3063 +40
Continue to review full report at Codecov.
Looks fine, just suggest refactoring the single-element is_valid to a utility.
The behavior doesn't seem right to me. You have a lists column. You access a row in that lists column. That row is always a list. Thus, the returned list scalar should always have the
@ttnghia currently For example, in a non-nested list column:
For a nested list column:
See discussion here: #5887 (comment)
I should probably add the definition of
Co-authored-by: Mark Harris <[email protected]>
Co-authored-by: Mark Harris <[email protected]>
My comment here still stands. #8071 (comment) We should be using gather.
Added scalar factory methods and updated the developer guide accordingly.
Discussed offline and confirmed that the current implementation supports structs.
It seems that some tests are failing.
Co-authored-by: nvdbaranec <[email protected]>
@gpucibot merge
…8206) This PR fixes a bug introduced in #8071. When `get_element` retrieves a NULL row in a nested column, the returned scalar should not only report `is_valid() == false` but also preserve the column hierarchy of the row data, even though that data is invalid, because dependent libraries may use the column hierarchy to deduce the nested type of the column.

This PR also reverts the `make_default_constructed_scalar` API for the `LIST` type. A `LIST` type scalar should carry the complete column hierarchy as part of its type information, and the API is not given enough information to construct that.

Another small addition: rather than hard-coding the position of the child column, use `list_column_view::child_column_index`.

Authors:
- Michael Wang (https://github.com/isVoid)

Approvers:
- Conor Hoekstra (https://github.com/codereport)
- Paul Taylor (https://github.com/trxcllnt)

URL: #8206
Part 1 of #8032

This PR adds retrieval of row data from a `LIST` type column by adding support for the `list_view` specialization of `get_element`. The row data is stored in a scalar object.

Use example:
Implementation note:
Depends on `lists::detail::copy_slice` under the hood. Also adds a new `list_scalar` constructor that supports moving external row data to construct a new scalar.

Also included in this PR:
- `is_element_valid_sync(column, i)`, a helper function that returns true if the `i`th row of `column` is valid.
- `list_scalar` factory functions
- `list_scalar`
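
To make the row-retrieval semantics concrete, here is a minimal host-side sketch. It is not the libcudf API: `ListColumnModel`, `ListScalarModel`, `make_example_column`, and the free functions below are hypothetical stand-ins, and the sketch assumes an offsets-plus-child layout in which row `i` owns the child elements in `[offsets[i], offsets[i + 1])`. It only illustrates the idea of slicing one row out of a list column and reporting a null row as an invalid scalar.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-in for a LIST<int> column (not a cudf type).
// Row i owns child[offsets[i] .. offsets[i + 1]); validity[i] marks non-null rows.
struct ListColumnModel {
  std::vector<int> child;
  std::vector<std::size_t> offsets;  // size() == number of rows + 1
  std::vector<bool> validity;        // size() == number of rows
};

// Hypothetical stand-in for a list scalar: a copy of the row data plus a validity flag.
struct ListScalarModel {
  std::vector<int> data;
  bool valid;
};

// Plays the role described for is_element_valid_sync(column, i):
// returns true if the i-th row of the column is valid.
bool is_element_valid(ListColumnModel const& col, std::size_t i) {
  return col.validity.at(i);
}

// Copies row i out of the column. A null row yields an invalid scalar with no
// data; the element type (int here) stays known because it is part of the type.
ListScalarModel get_element(ListColumnModel const& col, std::size_t i) {
  if (!is_element_valid(col, i)) { return ListScalarModel{{}, false}; }
  auto const first = col.child.begin() + static_cast<std::ptrdiff_t>(col.offsets.at(i));
  auto const last  = col.child.begin() + static_cast<std::ptrdiff_t>(col.offsets.at(i + 1));
  return ListScalarModel{std::vector<int>(first, last), true};
}

// Builds the column [[1, 2], [], null, [3, 4, 5]].
ListColumnModel make_example_column() {
  return ListColumnModel{{1, 2, 3, 4, 5}, {0, 2, 2, 2, 5}, {true, true, false, true}};
}
```

In this model an empty list (row 1) and a null list (row 2) both carry no data, so the validity flag is the only thing that tells them apart, which is why the scalar has to carry it.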