Changes JSON reader's recovery option's behaviour to ignore all characters after a valid JSON record (#14279)

Closes #14226.

The new behavior of `JSON_LINES_RECOVER` is to ignore excess characters after the first valid JSON record on each JSON line.

```
{ "number": 1 }
{ "number": 1 } xyz
{ "number": 1 } {}
{ "number": 1 } { "number": 4 }
```

**Implementation details:**

The JSON parser pushdown automaton was changed for the `JSON_LINES_RECOVER` format such that, when in state `PD_PVL` (`post-value`, "I have just finished parsing a value") and when the stack context is `ROOT` ("I'm not somewhere within a list or struct"), we treat all characters as "white space" until encountering a newline character. `post-value` in stack context `ROOT` is exactly the condition we are in after having parsed the first valid record of a JSON line. _Thanks to @karthikeyann for suggesting to use `PD_PVL` as the capturing state._

As the stack context is generated upfront, we have to fix up the stack context so that it is `ROOT` for all of these excess characters. I.e. (`_` means `ROOT` stack context, `{` means within a `STRUCT` stack context):

```
in:    {"a":1}{"this is supposed to be ignored"}
stack: _{{{{{{_{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{
```

Needs to be fixed up to become:

```
in:    {"a":1}{"this is supposed to be ignored"}
stack: _{{{{{{__________________________________
```

Authors:
- Elias Stehle (https://github.com/elstehle)
- Karthikeyan (https://github.com/karthikeyann)

Approvers:
- Bradley Dice (https://github.com/bdice)
- Nghia Truong (https://github.com/ttnghia)
- Karthikeyan (https://github.com/karthikeyann)

URL: #14279
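The stack-context fix-up described in the commit message can be illustrated with a small, sequential Python sketch. This is not the cuDF implementation (which runs on the GPU); the function names `naive_stack_context` and `fix_up_stack_context` are hypothetical, and the sketch assumes that a newline always resets the context to `ROOT` and that each record is a struct or list (scalar root values are not handled).

```python
ROOT, STRUCT, LIST = "_", "{", "["


def naive_stack_context(text: str) -> str:
    """Per-character stack context: the symbol on top of the stack
    *before* the character is processed (hypothetical helper)."""
    stack = [ROOT]
    out = []
    in_string = False
    escaped = False
    for ch in text:
        out.append(stack[-1])
        if ch == "\n":
            # Assumption: in recovery mode a newline resets everything.
            stack, in_string, escaped = [ROOT], False, False
        elif in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            stack.append(STRUCT)
        elif ch == "[":
            stack.append(LIST)
        elif ch in "}]" and len(stack) > 1:
            stack.pop()
    return "".join(out)


def fix_up_stack_context(text: str, stack_ctx: str) -> str:
    """Force ROOT context for every character that follows the first
    completed root-level struct or list on a line (hypothetical helper)."""
    fixed = []
    record_done = False  # True once the first record on this line has closed
    n = len(text)
    for i, (ch, ctx) in enumerate(zip(text, stack_ctx)):
        if ch == "\n":
            fixed.append(ROOT if record_done else ctx)
            record_done = False  # the next line starts fresh
            continue
        if record_done:
            fixed.append(ROOT)  # excess characters are treated as root-level
            continue
        fixed.append(ctx)
        # A closing bracket after which the context is ROOT again (or the
        # input ends) marks the end of the line's first record.
        closes_to_root = i + 1 == n or stack_ctx[i + 1] == ROOT
        if ch in "}]" and ctx != ROOT and closes_to_root:
            record_done = True
    return "".join(fixed)


if __name__ == "__main__":
    line = '{"a":1}{"this is supposed to be ignored"}'
    naive = naive_stack_context(line)
    print("in:   ", line)
    print("stack:", naive)
    print("fixed:", fix_up_stack_context(line, naive))
```

Running the script on the commit message's example input prints the upfront stack context and the fixed-up one, in which every character after the first closing `}` carries the `ROOT` marker.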
Showing 4 changed files with 218 additions and 24 deletions.