Reconstruct dtypes correctly for list aggs of struct columns #12290
Codecov Report
Base: 86.58% // Head: 85.68% // Project coverage decreases (see the diff).
Additional details and impacted files
@@            Coverage Diff             @@
##   branch-23.02   #12290     +/-    ##
================================================
- Coverage      86.58%   85.68%   -0.90%
================================================
  Files            155      155
  Lines          24368    24868    +500
================================================
+ Hits           21098    21309    +211
- Misses          3270     3559    +289
As usual when returning from libcudf, we need to reconstruct a struct dtype with appropriate labels. For groupby.agg(list) this can be done by matching on the element_type of the result column and reconstructing with a new list dtype with a leaf from the original column. Closes rapidsai#11765.
We can't transfer the categorical dtype into an element of the output list dtype right now, so the reconstruction with appropriate dtype metadata is not possible. To avoid confusion, just remove support.
One cibuildwheel run failed (apparently due to network timeouts). Is there a way to restart this, or do I just push a(nother) merge commit and cross my fingers?
If you click on the "Details" link next to a failed check, it will take you to the relevant "Actions" section. There you'll see a "Re-run jobs" button in the top right, which you can use to rerun only the failed jobs.
FWIW the latest failure appears to be the same one that @madsbk observed in #12554 (comment), and merging again seems to have fixed it. I didn't look into the underlying cause upstream: perhaps the recent pandas release is causing the xfailed test to succeed, or maybe there is some transient inconsistency in an upstream being pulled. It's very weird that it's an xfail-related failure, though, so I'm not sure without digging further.
/merge
Description
As usual when returning from libcudf, we need to reconstruct a struct
dtype with appropriate labels. For groupby.agg(list) this can be done
by matching on the element_type of the result column and rebuilding
a new list dtype whose leaf is taken from the original column.
Closes #11765
Closes #11907
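To illustrate the idea in the description, here is a minimal, library-free sketch. It is not the code in this PR and does not use cudf's API: `ScalarType`, `StructType`, `ListType`, `same_shape`, and `reconstruct_list_agg_dtype` are hypothetical stand-ins. It assumes the result of a list aggregation comes back as a list whose struct element has lost its field labels, and that the original column's dtype is still available to supply them.

```python
from dataclasses import dataclass
from typing import Tuple, Union


# Hypothetical stand-ins for column dtypes; these are NOT cudf classes.
@dataclass(frozen=True)
class ScalarType:
    name: str


@dataclass(frozen=True)
class StructType:
    # Ordered (label, dtype) pairs.
    fields: Tuple[Tuple[str, "DType"], ...]


@dataclass(frozen=True)
class ListType:
    element_type: "DType"


DType = Union[ScalarType, StructType, ListType]


def same_shape(a: DType, b: DType) -> bool:
    """Structural equality that ignores struct field labels."""
    if isinstance(a, ScalarType) and isinstance(b, ScalarType):
        return a.name == b.name
    if isinstance(a, ListType) and isinstance(b, ListType):
        return same_shape(a.element_type, b.element_type)
    if isinstance(a, StructType) and isinstance(b, StructType):
        return len(a.fields) == len(b.fields) and all(
            same_shape(x[1], y[1]) for x, y in zip(a.fields, b.fields)
        )
    return False


def reconstruct_list_agg_dtype(result: DType, original: DType) -> DType:
    """Match on the result's element_type; if it is the original column's
    dtype minus labels, rebuild a list dtype with the original as its leaf."""
    if isinstance(result, ListType) and same_shape(result.element_type, original):
        return ListType(original)
    # Anything else is returned unchanged.
    return result


# Original struct column dtype, with user-facing labels.
original = StructType((("a", ScalarType("int64")), ("b", ScalarType("float64"))))
# What a list aggregation might hand back: same layout, placeholder labels.
result = ListType(StructType((("0", ScalarType("int64")), ("1", ScalarType("float64")))))

fixed = reconstruct_list_agg_dtype(result, original)
print(fixed.element_type.fields[0][0])  # label of the first struct field
```

The match on `element_type` is what keeps the rewrite safe in this sketch: a result whose element layout differs from the original column is left as-is rather than being relabelled incorrectly.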