Add bindings for index_of with column search key #10696
"Chrono" is a C++ / libcudf term and doesn't appear in the cuDF Python docs. The Python docs discuss datetimes and timedeltas.
To clarify my own understanding, what types are not supported here? It looks like this is exhaustive of the scalar types we support unless there's some catch for categorical or bool?
Yep, upon further investigation it looks like bools are supported. But when using a list scalar as a search key on a multi-level nested list series, `index_of` throws: `List search operations are only supported on numeric types, decimals, chrono types, and strings.`
Do you know how I might go about testing categorical types?
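For reference while reviewing, the semantics under discussion can be modelled in plain Python. This is only a sketch, not the libcudf implementation: `index_of_scalar` and `index_of_column` are hypothetical names, the -1 "not found" value is an assumption, and null handling is left out entirely.

```python
def index_of_scalar(rows, key):
    """Position of the first occurrence of `key` in each row, or -1 if absent."""
    return [row.index(key) if key in row else -1 for row in rows]


def index_of_column(rows, keys):
    """Like index_of_scalar, but row i is searched for keys[i]."""
    if len(rows) != len(keys):
        raise ValueError("search key column must have one key per row")
    return [row.index(k) if k in row else -1 for row, k in zip(rows, keys)]


rows = [[1, 2, 3], [4, 5], [6, 2]]
print(index_of_scalar(rows, 2))          # [1, -1, 1]
print(index_of_column(rows, [3, 4, 7]))  # [2, 0, -1]
```

A model like this may be handy as an oracle in tests: build the expected result from `to_pylist`-style data and compare it with what the binding returns.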
Thank you very much for your thorough testing! I'm not sure if "list of categorical" is a type that cuDF can represent. Given what you found, I think it would be okay to remove this note about limited type support. Essentially all scalar types that can be used in a list type are supported, from what I can tell.
Even though multi-level data is not supported, I am glad to see that multi-index searches fail with a real error message and not a complicated and cryptic traceback. 😄
Can we add a test case for multi-level nested data? (if that is supported)
Still curious about multi-level nesting here. If multi-level nesting is supported, we'll need to revise a few other items as well. For example, `is_scalar` might not be the appropriate check if "list scalars" are provided to check against a list of lists: scalar-like input would have one fewer dimension / nested level than the input column, while column-like input would have an equal number of nested levels.
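To make the nesting-level idea concrete, here is a minimal sketch in plain Python. `nesting_depth` and `classify_search_key` are hypothetical helpers written for illustration, not existing cuDF functions, and the column is represented as a plain list of rows rather than a real list column.

```python
def nesting_depth(value):
    """Number of list levels in a nested Python list (0 for a non-list)."""
    if not isinstance(value, list):
        return 0
    # An empty list still counts as one level; its element depth is unknown.
    return 1 + max((nesting_depth(v) for v in value), default=0)


def classify_search_key(rows, key):
    """Decide whether `key` is scalar-like or column-like for a list column."""
    # The outermost Python list is the row axis, not a nesting level of the dtype.
    column_levels = nesting_depth(rows) - 1
    key_levels = nesting_depth(key)
    if key_levels == column_levels - 1:
        return "scalar"
    if key_levels == column_levels:
        return "column"
    raise ValueError(
        f"search key has {key_levels} nested level(s); expected "
        f"{column_levels - 1} (scalar) or {column_levels} (column)"
    )


flat = [[1, 2, 3], [4, 5]]          # one nested level
nested = [[[1, 2], [3]], [[4]]]     # two nested levels

print(classify_search_key(flat, 2))                  # scalar
print(classify_search_key(flat, [3, 4]))             # column
print(classify_search_key(nested, [1, 2]))           # scalar
print(classify_search_key(nested, [[1, 2], [4]]))    # column
```

Inferring depth from the data is ambiguous for empty lists, so a real check would more likely compare against the column's dtype than inspect values.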