-
Notifications
You must be signed in to change notification settings - Fork 2
Generated File Contents
This page attempts to provide detail about the contents that you’ll find in a goll1e
-generated parser. Each section below describes, in order, the pieces that you’ll find in the output.
The yystype
block contains the declaration for the value n-tuple of which any one piece a terminal or nonterminal can adopt. Because Go does not have union language structures, the parser uses a struct to describe yystype
.
The yytable
variable contains the parser routing table. Assuming that len(t)
describes the number of terminals in your grammar and len(nt)
describes the number of nonterminals in your grammar, then the size of yytable
will equal [len(t) + 1] * len(nt)
.
The rows in yytable
correspond to the nonterminals in your grammar and the columns correspond to terminals.
Entries in the yytable
greater than -1 refer to the number of a production defined in the Rules Section of your grammar. These correspond to entries in the yyprods
table. Values of -1 in the yytable
signify an error in the input to the parser.
The yycharmap
provides mappings from character literals listed in the Rules Section of the grammar file to input symbols matched by the parser.
-
yyMINTOKEN
provides a value that separates character literals and terminals in the input grammar -
yyMAXTOKEN
provides a value that separates terminals and nonterminals in the input grammar -
yyUSER
provides the a value that separates nonterminals and user-provided input values - Constants with a value between
yyMINTOKEN
andyyMAXTOKEN
provide the terminals a specific integral value
The yyprods
table describes the input literals, terminals, and nonterminals. Entries in the yyprods
table have the following meanings.
- If the entry is less than
yyMINTOKEN
, then that value represents an entry in theyycharmap
- If the entry exists in the exclusive interval
yyMINTOKEN@, @yyMAXTOKEN
, then that value represents a terminal - If the entry exceeds the value of
yyMAXTOKEN
, then that value represents a nonterminal the value of which you can determine byentry value - yyMAXTOKEN
The yyrunrule
function executes the syntax-directed translations contained in the Rules Section of your grammar definition file. Each case in the switch statement corresponds with the order of each production contained in the Rules Section.
The yytranslate
function converts provided token values into zero-based values of terminals, nonterminals, and character literals, respectively.
The yyparse
function does just that, parses the input that it gets from the supplied nextWord
function. This differs from the way that YACC works, which will call a global int yylex()
function. I like this design better because it allows the use of fewer global variables and allows, at runtime, to switch out implementations of the scanner.
yyparse
also expects a value that represents the EOF value. This removes conflicts that can occur between different scanner implementations and their respective EOF value.
yyparse
returns a flag that indicates success. If it does succeed, then the second return value will contain the value assigned to the start node after all rule running has occurred.
The following sections dissect the yyparse
function and detail the inner workings of it.
A stack of *yystype*s that contains the resolution stack during the rule execution at the end of yyparse
.
The inputs
variable contains the recognized inputs as they get recognized and resolved through the parser.
For terminals, which are the values contained in the input and valued by the *yystype
provided the nextWord
function, the values
variable acts as a stack containing the results. Each time the parser calls newWord
it checks the result; if the result lies between yyMINTOKEN
and yyMAXTOKEN
, the parser puts the *yystype
into values
.
word
contains the most recently read int from the nextWord
function.
stack
provides the LL(1) its parsing workspace for resolving rules.
tos
represents the value on the top of the stack at any time and exists only for readability.
In pseudocode, the parsing routine follows the following logic.
loop forever
if the top of the stack equals the end-of-file value and the value of current word equals the end-of-file value
set the value of the function to *true*
exit the loop
if the top of the stack is either the end-of-file value or a terminal or a character literal
if the value of the top of the stack equals the value of the current word
push the current word onto the inputs stack
pop the resolution stack twice
get the next word of the input
push the value of the scan onto the values stack if the word represents a terminal
set the new top of stack value
else the top of the stack does not equal the current word
exit the loop because the parser doesn't consider the input a valid word
else the top of the stack represents a nonterminal
translate the top of stack and the word into row and column values used by the parsing routing table
look up the rule number to apply in the parsing routing table
if the rule number is -1
exit the loop because the input cannot match a production rule
push the translated rule number onto the inputs stack
pop the resolution stack twice to remove the just used nonterminal
for each entry in the production rules table for the rule number from last to first
push the entry onto the resolution stack
set the new top of stack value
The previous parsing populates the inputs
and values
stacks with the values needed to run the rules contained in the Rules Section of the grammar definition file. If the parser recognized the provided input, then it runs the rules by adhering to the algorithm as detailed in the following pseudocode.
while the inputs stack has entries
pop the top entry in the inputs stack
if the value is a character literal
push nil onto the resolution stack
else if the value is a terminal
pop the top entry from the values stack
push that value onto the resolution stack
else if the value is a rule number
create a new resolution value
run the code associated with the rule number with the new resolution value
pop the number of entries in the rule from the resolution stack
push the new resolution value onto the resolution stack
When the rule running finishes, a single value that represents the completed computation of the grammar will exist as the only entry in the resolution stack. yyparse
then uses that value to return a result.