[FEA] Add nested struct support for cudf::contains #8965
Comments
This PR adds support for `cudf::contains` so we can check whether a structs column contains a scalar struct element. Partially addresses #8965.

This does not support checking if structs given in a structs column exist in another structs column. Such cases will be supported when the new data structure mentioned in #9413 is merged into cudf.

Authors:
- Nghia Truong (https://github.com/ttnghia)

Approvers:
- Mike Wilson (https://github.com/hyperbolic2346)
- MithunR (https://github.com/mythrocks)

URL: #9929
This is still wanted.
Please note that this has already been partially supported in #9929.
This extends the `cudf::contains` API to support nested types (lists + structs) with arbitrarily nested levels. As such, `cudf::contains` will work with literally any type of input data.

In addition, this fixes null handling of `cudf::contains` with structs column + struct scalar input when the structs column contains null rows at the top level while the scalar key is valid but all nulls at children levels.

Closes: #8965

Depends on:
* #10730
* #10883
* #10802
* #10997
* NVIDIA/cuCollections#172
* NVIDIA/cuCollections#173
* #11037
* #11356

Authors:
- Nghia Truong (https://github.com/ttnghia)
- Devavret Makkar (https://github.com/devavret)
- Bradley Dice (https://github.com/bdice)
- Karthikeyan (https://github.com/karthikeyann)

Approvers:
- AJ Schmidt (https://github.com/ajschmidt8)
- Bradley Dice (https://github.com/bdice)
- Yunsong Wang (https://github.com/PointKernel)

URL: #10656
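To make the nested-type and null-handling behaviour described above concrete, here is a minimal host-side sketch in plain C++17. It does not use libcudf: `Value`, `deep_equal`, `contains_key`, and the small factory helpers are hypothetical names invented for illustration. It also assumes that nested nulls compare equal to each other and that a top-level null row never matches a key, which may not match every detail of the real implementation.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of one (possibly nested) value; not a libcudf type.
struct Value {
  enum class Kind { Null, Scalar, List, Struct } kind;
  std::string scalar;           // payload when kind == Scalar
  std::vector<Value> children;  // list elements or struct fields
};

Value null_value() { return {Value::Kind::Null, "", {}}; }
Value scalar(std::string s) { return {Value::Kind::Scalar, std::move(s), {}}; }
Value list(std::vector<Value> elems) { return {Value::Kind::List, "", std::move(elems)}; }
Value strct(std::vector<Value> fields) { return {Value::Kind::Struct, "", std::move(fields)}; }

// Recursive comparison over arbitrarily nested lists and structs.
bool deep_equal(Value const& a, Value const& b) {
  if (a.kind != b.kind) return false;
  switch (a.kind) {
    case Value::Kind::Null: return true;  // assumption: nested nulls are equal
    case Value::Kind::Scalar: return a.scalar == b.scalar;
    case Value::Kind::List:
    case Value::Kind::Struct:
      if (a.children.size() != b.children.size()) return false;
      for (std::size_t i = 0; i < a.children.size(); ++i) {
        if (!deep_equal(a.children[i], b.children[i])) return false;
      }
      return true;
  }
  return false;
}

// Returns true if any non-null row of the haystack equals the key.
bool contains_key(std::vector<Value> const& haystack, Value const& key) {
  for (auto const& row : haystack) {
    // Assumption: a top-level null row never matches, even when the key is a
    // valid struct whose children are all null.
    if (row.kind == Value::Kind::Null) continue;
    if (deep_equal(row, key)) return true;
  }
  return false;
}
```

In this model a null row and a valid struct with all-null children are different values, which is the distinction the null-handling fix above is about.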
Is your feature request related to a problem? Please describe.
For Spark we are pushing to get more support for structs in a number of operators. We already have some support for sorting structs, so we should be able to come up with a way to check whether a struct value from one column exists in another. Note that this does not include lists as children of the structs, only structs that contain basic types, including strings and other structs.
This should follow the same pattern we have supported for sorting, where null child columns are considered equal to other null child columns, as described in #8964.
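The requested null semantics can be sketched in plain C++17 as follows. This is only an illustration, not libcudf code: `StructRow`, `child_equal`, `row_equal`, and `contains_rows` are hypothetical names, and the two-field layout is an assumption chosen to keep the example short.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one row of a structs column with two children.
struct StructRow {
  std::optional<int> id;            // nullopt models a null child value
  std::optional<std::string> name;  // nullopt models a null child value
};

// A null child is equal to another null child and unequal to any valid value.
template <typename T>
bool child_equal(std::optional<T> const& a, std::optional<T> const& b) {
  if (!a.has_value() || !b.has_value()) return a.has_value() == b.has_value();
  return *a == *b;
}

bool row_equal(StructRow const& a, StructRow const& b) {
  return child_equal(a.id, b.id) && child_equal(a.name, b.name);
}

// For each needle row, report whether an equal row exists in the haystack.
// A linear scan keeps the sketch simple; it says nothing about how a GPU
// implementation would do the lookup.
std::vector<bool> contains_rows(std::vector<StructRow> const& haystack,
                                std::vector<StructRow> const& needles) {
  std::vector<bool> result;
  result.reserve(needles.size());
  for (auto const& needle : needles) {
    bool found = false;
    for (auto const& row : haystack) {
      if (row_equal(row, needle)) {
        found = true;
        break;
      }
    }
    result.push_back(found);
  }
  return result;
}
```

With these rules, a needle whose `id` is null matches a haystack row whose `id` is also null, provided the remaining fields are equal.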
Describe the solution you'd like
I would like to see cudf::contains updated so it can support this.
Describe alternatives you've considered
I don't think there is an alternative that we can do on our own.