Support structs of lists in row lexicographic comparator #13005
No, if Note that the structs column may be "wrongly" constructed such that such property is not enforced. In such cases, such struct column must be corrected by Edit: I was wrong. The lex. comparator preprocessor doesn't call
For the example you posted, you may get the correct structs column because the
In my example, it's not a top-level NULL though, is it? i and S2 are both children of S1. I'm not saying S1 is NULL, I'm saying i is NULL. Let me try to write an MRE for this, maybe I am wrong entirely.
For the example above, you will have this table row:
But depth is already returned before that, isn't it? Regardless, it looks like I am wrong. This test passes:
I just don't understand how since according to the written example When you find a NULL in Anyway, thank you very much for walking me through this. Tomorrow I'll print the
Initially, the depth array is
depth=1 returned.
The returned depth is the depth value at which null is detected, not the original depth value of the top level.
Couple of questions to make sure I follow the change.
This PR looks very close, and I really like the test coverage. @ttnghia is there anything else that you are looking to finish?
I have one open request to clarify a comment, otherwise LGTM!
cpp/src/table/row_operators.cu
    temp_col.size(),
    temp_col.head(),
    temp_col.null_mask(),
    UNKNOWN_NULL_COUNT,
Can the null count be known here?
It can be queried by temp_col.null_count(), which may trigger a kernel launch (now). Good point, I think with our plan to remove UNKNOWN_NULL_COUNT, using temp_col.null_count() here would avoid modifying it again in the future.
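To illustrate why querying the null count is not free when it has not been cached, here is a minimal host-side sketch of counting nulls from a validity bitmask. It assumes 32-bit mask words where a set bit marks a valid row and rows are numbered from the least significant bit; the function name count_nulls is hypothetical and this is not the libcudf implementation, which performs the equivalent work on the device.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: counts rows whose validity bit is unset.
// Assumption: bit (row % 32) of word (row / 32) is 1 when the row is valid.
std::size_t count_nulls(std::vector<std::uint32_t> const& mask, std::size_t num_rows)
{
  std::size_t nulls = 0;
  for (std::size_t row = 0; row < num_rows; ++row) {
    std::uint32_t const word = mask[row / 32];
    bool const valid         = ((word >> (row % 32)) & 1u) != 0u;
    if (!valid) { ++nulls; }
  }
  return nulls;
}
```

The cost is linear in the number of rows, which is why passing a sentinel such as UNKNOWN_NULL_COUNT defers the work until someone actually asks for the count.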
Oh this is also addressed in #13102.
Co-authored-by: Bradley Dice <[email protected]>
/merge
This implements support for lexicographic comparison for lists-of-structs, following the proposed idea in #11222:
* The child column of the lists-of-structs column is replaced by an integer column of its rank values.
* When comparing two tables, such child columns from both tables are concatenated, ranked, then split back into new child columns to replace the original child columns for each table.

Depends on:
* #13005

Closes #11222.

Authors:
- Nghia Truong (https://github.com/ttnghia)
- Karthikeyan (https://github.com/karthikeyann)
- Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
- Bradley Dice (https://github.com/bdice)
- Vyas Ramasubramani (https://github.com/vyasr)

URL: #12953
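The concatenate, rank, and split idea described in that commit message can be sketched on the host with plain standard-library containers. This is only an illustration under stated assumptions: Elem is a hypothetical stand-in for one struct element, dense ranking is assumed (equal elements share a rank), and the names dense_rank and rank_children are invented here rather than taken from libcudf.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for one struct element of the lists' child column.
using Elem = std::pair<int, int>;

// Dense ranks starting at 0: equal elements receive the same rank.
std::vector<int> dense_rank(std::vector<Elem> const& values)
{
  std::vector<std::size_t> order(values.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::stable_sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
    return values[a] < values[b];
  });

  std::vector<int> ranks(values.size());
  int rank = 0;
  for (std::size_t i = 0; i < order.size(); ++i) {
    // Advance the rank only when the sorted value strictly increases.
    if (i > 0 && values[order[i - 1]] < values[order[i]]) { ++rank; }
    ranks[order[i]] = rank;
  }
  return ranks;
}

// Concatenate both child columns, rank them together, then split the ranks
// back so each table gets an integer column comparable with the other's.
std::pair<std::vector<int>, std::vector<int>> rank_children(std::vector<Elem> const& lhs,
                                                            std::vector<Elem> const& rhs)
{
  std::vector<Elem> all(lhs);
  all.insert(all.end(), rhs.begin(), rhs.end());

  auto const ranks = dense_rank(all);
  assert(ranks.size() == lhs.size() + rhs.size());

  auto const mid = ranks.begin() + static_cast<std::ptrdiff_t>(lhs.size());
  return {std::vector<int>(ranks.begin(), mid), std::vector<int>(mid, ranks.end())};
}
```

Ranking the two children together is what makes the resulting integers comparable across tables; ranking each table separately would assign unrelated numbers to the same struct value.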
This fixes the lexicographic comparator, which cannot handle input containing structs of lists. The new implementation mainly changes the helper function decompose_structs. In particular:
Struct<Struct<...<List<SomeType>...> (i.e., nested structs ultimately having one child).
Struct<...Struct<>>...>. The innermost child column List<SomeType> is output as the second column in the result table.
Depends on:
Closes #11672.