-
Notifications
You must be signed in to change notification settings - Fork 915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[REVIEW] Fix list
and struct
meta generation issue in dask-cudf
#10434
Conversation
Codecov Report
@@ Coverage Diff @@
## branch-22.04 #10434 +/- ##
================================================
+ Coverage 86.13% 86.17% +0.03%
================================================
Files 139 139
Lines 22438 22466 +28
================================================
+ Hits 19328 19361 +33
+ Misses 3110 3105 -5
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great - Thanks @galipremsagar !
Just a few questions/suggestions related to comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestions attached. 👍
Co-authored-by: Bradley Dice <[email protected]>
Co-authored-by: Bradley Dice <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I have one comment to double-check. I'm not sure if I made an error in my previous suggestion with respect to number of nesting levels.
data = ["cat", "dog"] | ||
else: | ||
data = np.array([0, 1], dtype=leaf_type).tolist() | ||
data = _nest_list_data(data, s.dtype) * 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does s.dtype
need to be s.dtype.leaf_type
? I'm not sure if this is nesting an extra level or if this is the right number of levels. It might have been changed with my previous round of suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leaf_type
corresponds to the terminal type of ListDtype
, i.e., like int32
, int64
..
by passing s.dtype
(ListDtype
) we traverse through each inner type by accessing .element_type
in _nest_list_data
such that we get the true nesting levels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example:
>>> s.dtype
ListDtype(ListDtype(int64))
>>> s.leaf_type
int64
>>> s.element_type
ListDtype(int64)
@gpucibot merge |
Fixes: #8913
This PR fixes multiple code-paths that incorrectly handle
list
andstruct
types for metadata generation indask-cudf
.