Describe the bug
get_json_object and json_tuple may output results in one of two ways. They may output the data with all escaped characters processed into their unescaped equivalents, e.g. "\u0000" becomes the NUL character. This happens when the only thing matched is a single quoted string value.
It can also output the escaped/normalized data. This happens if a string is a part of a nested object that matches. In these cases Spark is outputting normalized JSON for the section that matched.
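To make the two output modes concrete, here is a minimal Python sketch of the behavior described above. It is not Spark's or our implementation: `extract` is a hypothetical helper, the path is a plain list of keys rather than a JSONPath string, and Python's `json` module stands in for the real parser.

```python
import json

def extract(doc, path):
    """Hypothetical helper: walk a list of keys through a parsed JSON document."""
    value = json.loads(doc)
    for key in path:
        value = value[key]
    if isinstance(value, str):
        # Only a single quoted string matched: return it unescaped.
        return value
    # A nested object/array matched: return normalized JSON with escapes kept.
    return json.dumps(value, separators=(",", ":"))

# The JSON text contains the six-character escape \u0000, not a raw NUL.
doc = '{"a": {"b": "x\\u0000y"}}'

unescaped = extract(doc, ["a", "b"])   # string match: contains a real NUL
normalized = extract(doc, ["a"])       # object match: NUL stays \u-escaped
```

With this input, `unescaped` holds the three characters `x`, NUL, `y`, while `normalized` is the JSON text `{"b":"x\u0000y"}`.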
Our implementation is not outputting the normalized escaped strings correctly in all cases.
Currently, \u-encoded characters that are control characters (< decimal 32) are not escaped at all; they are output as the unescaped value. Spark will output the value as a \u escape sequence. Where a special escape sequence exists (\b, \f, \r, \n, etc.), it should be favored over the \u version, but I would be happy to live with the \u versions as they are still technically valid JSON.
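The escaping rule asked for here could be sketched roughly as follows. This is an illustration only, not the actual fix: `escape_json_string` and `SHORT_ESCAPES` are hypothetical names, and the lower-case hex digits in the `\u` form are an arbitrary choice that may not match Spark's output byte for byte.

```python
# Characters that have a dedicated short escape in JSON.
SHORT_ESCAPES = {
    '"': '\\"',
    '\\': '\\\\',
    '\b': '\\b',
    '\f': '\\f',
    '\n': '\\n',
    '\r': '\\r',
    '\t': '\\t',
}

def escape_json_string(s):
    """Hypothetical helper: emit s as a quoted, normalized JSON string."""
    out = []
    for ch in s:
        if ch in SHORT_ESCAPES:
            # Prefer the short form (\n, \b, ...) when one exists.
            out.append(SHORT_ESCAPES[ch])
        elif ord(ch) < 0x20:
            # Remaining control characters must not be emitted raw.
            out.append("\\u%04x" % ord(ch))
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'
```

Under these assumptions a NUL inside a nested string is written as `\u0000` and a newline as `\n`, so the output never contains a raw control character.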