-
Notifications
You must be signed in to change notification settings - Fork 915
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Refactors JSON reader's pushdown automaton (#13716)
This PR simplifies and cleans up the JSON reader's pushdown automaton. The pushdown automaton takes as input two arrays: 1. The JSON's input characters 2. The stack context for each character (`{` - `JSON object`, `[` - `JSON array`, `_` - `Root of JSON`) Previously, we were fusing the two arrays and materializing them straight to the symbol group id for each combination. A symbol group id serves as the column of the transition table. The symbol group ids array was then used as input to the finite state transducer (FST). After the [recent refactor of the FST](#13344) lookup tables, the FST has become more flexible. It now supports arbitrary iterators and the symbol group id lookup table (that maps a symbol to a symbol group id) can now be implemented by a simple function object. This PR takes advantage of the FST's ability to take fancy iterators. We now zip the `json_input` and `stack_context` symbols and pass that `zip_iterator` to the FST. Authors: - Elias Stehle (https://github.com/elstehle) - Vukasin Milovanovic (https://github.com/vuule) - Karthikeyan (https://github.com/karthikeyann) Approvers: - Karthikeyan (https://github.com/karthikeyann) - Vukasin Milovanovic (https://github.com/vuule) URL: #13716
- Loading branch information
Showing
2 changed files
with
85 additions
and
28 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters