Resolve path parsing issues in get_json_object
#15082
Signed-off-by: Suraj Aralihalli <[email protected]>
However, when encountering an invalid JSONPath, cuDF throws an error, whereas Spark silently returns null. For instance, consider the example
We can update the
If this is an invalid argument, could the Spark logic just catch the exception and then return a null on its own?
Yes, I think that can be done. We can have Spark catch
Please clarify: in such situations, will Spark return a column of all nulls, or only nulls corresponding to the invalid input rows (JSON objects)?
I believe the column would consist entirely of null values, since an invalid JSONPath (an argument) would affect all the input rows uniformly.
This is dangerous. We cannot know whether such an exception is due to an invalid JSON path or something else unless we have a new exception type dedicated to it. So it would be better to handle this case explicitly in the code.
I believe the exception occurs because of an unexpected parameter value, which is checked in host code.
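To make the concern about catching the wrong exception concrete, here is a minimal sketch of the "dedicated exception type" idea. Everything in it is hypothetical: `invalid_json_path`, `string_column`, `extract` and `extract_or_nulls` are illustrative names, not cuDF or Spark APIs, and the path check is a placeholder rather than real JSONPath validation.

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical dedicated exception type (an assumption, not an existing cuDF type).
struct invalid_json_path : std::invalid_argument {
  using std::invalid_argument::invalid_argument;
};

// Stand-in for a nullable string column.
using string_column = std::vector<std::optional<std::string>>;

// Placeholder for the real extraction: it only validates that the path starts with '$'.
string_column extract(const string_column& rows, const std::string& path)
{
  if (path.empty() || path[0] != '$') {
    throw invalid_json_path("JSONPath must start with '$'");
  }
  return rows;  // a real implementation would evaluate the path per row
}

// Caller-side wrapper: only the dedicated exception is turned into an all-null column.
// Any other exception still propagates to the caller.
string_column extract_or_nulls(const string_column& rows, const std::string& path)
{
  try {
    return extract(rows, path);
  } catch (const invalid_json_path&) {
    return string_column(rows.size(), std::nullopt);
  }
}
```

Because the `catch` clause names only the dedicated type, an unrelated failure is not silently converted into nulls, which is the risk raised above.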
But I am just looking at the results, not the code, as I am no C++ expert.
Let's refactor the bool inside_brackets to an enum class. It would help eliminate the comments that are currently needed to clarify the intent of the argument.
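A rough sketch of what such a refactor could look like. The names `path_context` and `ends_name` are hypothetical and the termination rules are simplified for illustration; this is not the code in the PR.

```cpp
// Hypothetical replacement for a `bool inside_brackets` parameter.
enum class path_context { OUTSIDE_BRACKETS, INSIDE_BRACKETS };

// Returns true if `c` ends a name in the given context (simplified rules):
// outside brackets a name ends at '.' or '[', inside brackets it ends at ']'.
bool ends_name(char c, path_context ctx)
{
  switch (ctx) {
    case path_context::INSIDE_BRACKETS: return c == ']';
    case path_context::OUTSIDE_BRACKETS: return c == '.' || c == '[';
  }
  return false;  // unreachable for valid enum values
}
```

A call such as `ends_name(c, path_context::INSIDE_BRACKETS)` documents itself, whereas `ends_name(c, true)` would need a comment like `/* inside_brackets */` to be readable.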
/ok to test
This looks fine for the java side of things.
/merge
This PR addresses a parsing issue related to JSONPath by implementing distinct parsing rules for values inside and outside brackets. For instance, in { "A.B": 2, "'A": { "B'": 3 } }, $.'A.B' differs from $['A.B']. (See Assertible JSON Path Documentation.)
The fix ensures accurate parsing of JSONPath values containing quotes. For example in
{ "A.B": 2, "'A": { "B'": 3 } }
Resolves 12483.
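The inside/outside-bracket distinction described above can be illustrated with a small host-side tokenizer. This is a sketch under stated assumptions, not the cuDF implementation: `split_json_path` is a hypothetical helper, it supports only `.name` and `['name']` steps (no array indexes or wildcards), and it assumes that outside brackets a quote is an ordinary character of the name while a dot separates names, whereas inside brackets the quotes delimit the name and dots are ordinary characters.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical tokenizer: splits a JSONPath into the object keys it names.
std::vector<std::string> split_json_path(const std::string& path)
{
  if (path.empty() || path[0] != '$') {
    throw std::invalid_argument("path must start with '$'");
  }
  std::vector<std::string> names;
  std::size_t i = 1;
  while (i < path.size()) {
    if (path[i] == '.') {
      // Outside brackets: the name runs to the next '.' or '['; quotes are kept verbatim.
      std::size_t end = path.find_first_of(".[", i + 1);
      if (end == std::string::npos) { end = path.size(); }
      if (end == i + 1) { throw std::invalid_argument("empty name after '.'"); }
      names.push_back(path.substr(i + 1, end - i - 1));
      i = end;
    } else if (path[i] == '[') {
      // Inside brackets: expect a single-quoted name; dots inside it are not separators.
      if (i + 1 >= path.size() || path[i + 1] != '\'') {
        throw std::invalid_argument("expected quoted name after '['");
      }
      std::size_t close = path.find("']", i + 2);
      if (close == std::string::npos) { throw std::invalid_argument("unterminated ['...']"); }
      names.push_back(path.substr(i + 2, close - (i + 2)));
      i = close + 2;
    } else {
      throw std::invalid_argument("unexpected character in path");
    }
  }
  return names;
}
```

Under these assumptions, `$.'A.B'` splits into the two keys `'A` and `B'`, which would select 3 in the sample document, while `$['A.B']` yields the single key `A.B`, which would select 2.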