[FEA] Add support for get_json_object #6985
Comments
Is this for loading a JSON file? Or is it for treating the contents of a strings column as JSON text? The answer affects whether or not this is a cuIO feature request.
Sorry, this was not described very well. Full warning: this is not simple in the least, but we have a lot of customers who really want this.
The example from the Spark documentation is
produces the string The I'll try to post all of the operations that the path supports and some more examples. The main thing is that I don't know how generic this type of operator is. I don't see much in pandas that would provide similar functionality. The closest I could come up with is
OK, here is a bit more information. This function is trying to be compatible with the Hive In the case of Hive (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-get_json_object)
From reading through the Spark code, it appears to be the same. If a path does not match, a null is returned.
You can access array elements with
If no index is given, you get a null back, but
This even works with nesting
So, like I said, this is not trivial, but it is somewhat standards-based and is supported by other SQL implementations, which is why we think it belongs in cudf.
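To make the semantics described in this comment concrete, here is a minimal pure-Python sketch of a `get_json_object`-style lookup. It is not the cudf, Spark, or Hive implementation, and the function name `get_json_object_sketch` is hypothetical. It covers only the behaviours mentioned in this thread: child access with `.name`, array access with `[n]`, nesting, and a null (`None`) result when the path does not match, when no index is given, or when the input is not valid JSON. Wildcards are deliberately left out. Returning non-string values as compact JSON text and mapping a JSON `null` to `None` are assumptions, not something the thread confirms.

```python
import json
import re
from typing import Optional

# One path step: either ".name" or "[index]". Anything else is treated as unsupported.
_STEP = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")


def get_json_object_sketch(json_text: Optional[str], path: str) -> Optional[str]:
    """Hypothetical helper: evaluate a small JSONPath subset against one JSON string."""
    if json_text is None or not path.startswith("$"):
        return None
    try:
        node = json.loads(json_text)
    except ValueError:
        return None  # invalid JSON input yields null

    pos = 1  # skip the leading "$"
    while pos < len(path):
        step = _STEP.match(path, pos)
        if step is None:
            return None  # malformed or unsupported step, e.g. "[]" with no index
        key, index = step.group(1), step.group(2)
        if key is not None:
            if not isinstance(node, dict) or key not in node:
                return None  # path does not match
            node = node[key]
        else:
            i = int(index)
            if not isinstance(node, list) or i >= len(node):
                return None  # not an array, or index out of range
            node = node[i]
        pos = step.end()

    if node is None:
        return None  # assumption: a JSON null is reported as null
    if isinstance(node, str):
        return node  # strings are returned without surrounding quotes
    # Assumption: objects, arrays, numbers and booleans come back as compact JSON text.
    return json.dumps(node, separators=(",", ":"))
```

For instance, `get_json_object_sketch('{"a":[1,{"b":"x"}]}', '$.a[1].b')` walks into the array element and then into the nested object, while `'$.a[]'` and `'$.c'` both return `None`.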
I would like to know when this feature is likely to be supported.
@chenrui17 we will target support in RAPIDS 0.19, with the spark-rapids work in 0.5. Related PR #7286
This issue has been labeled
We are benchmarking the draft PR #7286. This is still active.
Done
Is your feature request related to a problem? Please describe.
Allow cudf support for https://spark.apache.org/docs/2.4.5/api/sql/index.html#get_json_object
Describe the solution you'd like
Provide C++ support for get_json_object, which extracts a JSON object from a JSON string based on the specified JSON path and returns the extracted object as a JSON string. It will return null if the input JSON string is invalid.
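As a rough illustration of the requested contract at the column level, the sketch below maps a list of JSON strings (standing in for a strings column) to a list of results, one per row, with `None` wherever the row is null, is not valid JSON, or lacks the requested key. The function name `extract_top_level_key` is hypothetical, and the lookup is limited to a single top-level key for brevity. It is not the cudf API.

```python
import json
from typing import List, Optional


def extract_top_level_key(column: List[Optional[str]], key: str) -> List[Optional[str]]:
    """Hypothetical helper: per-row lookup of one top-level key, nulls preserved."""
    out: List[Optional[str]] = []
    for row in column:
        try:
            doc = json.loads(row) if row is not None else None
        except ValueError:
            doc = None  # invalid JSON in this row becomes a null in the output
        if isinstance(doc, dict) and key in doc and doc[key] is not None:
            value = doc[key]
            # Strings are returned as-is; other values are returned as JSON text.
            out.append(value if isinstance(value, str) else json.dumps(value))
        else:
            out.append(None)
    return out
```

The output always has the same length as the input, so each row's result (or null) stays aligned with the row it came from.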
Describe alternatives you've considered
N/A