-
Notifications
You must be signed in to change notification settings - Fork 2
Yadriggy Syntax
Syntax checker.
This DSL is used for checking whether an abstract syntax tree
obtained by calling Yadriggy::reify
fits the given syntax.
The syntax is specified in this DSL.
A typical hemi-parasitic DSL implemented with Yadriggy will use only limited syntax of Ruby. Hence the DSL compiler/interpreter has to report a syntax error when it receives unsupported Ruby code, or invalid code in that DSL. Valid Ruby code might be invalid in that DSL. The syntax checker helps DSL developers implement a syntax checker for their DSL.
The following code snippet makes a syntax checker syn
that
accepts only nil
.
syn = Yadriggy::define_syntax do
Reserved <= { name: 'nil' }
end
The syntax rules are given between do
and end
.
In the case above, only one rule is given.
The check
method on the syntax checker returns true
when
the given abstract syntax tree is syntactically valid.
Otherwise, it returns false
.
syn.check(Yadriggy::reify { nil }.tree.body) # true
syn.check(Yadriggy::reify { self }.tree.body) # false
syn.check(Yadriggy::reify { foo }.tree.body) # false
For example, Yadriggy::reify { nil }
returns an ASTree
object
containing an abstract syntax tree. ASTree#tree
returns the
abstract syntax tree. It is a Block
object representing
the block { nil }
. Block#body
returns the body of that block,
which is a Reserved
object representing an abstract syntax tree
for the reserved word nil
. The check
method receives such an
abstract syntax tree and determines whether it is syntactically
valid or not.
The following is a somewhat larger example:
syn = Yadriggy::define_syntax do
Binary <= { op: :+ | :-, left: expr, right: expr }
expr <= Binary | Number
Block <= { body: expr }
end
expr
is not an AST-node type but a user type described later.
This accepts only addition and subtraction expressions with
numbers as their operands. It is compatible to the BNF definition
below:
binary : expr '+' expr | expr '-' expr
expr : binary | number
block : expr
Note that the syntax rules above are not used for parsing. They are used for checking the validity. Hence their ambiguity does not matter. See the results of the syntax checking:
syn.check(Yadriggy::reify { 1 }.tree) # true
syn.check(Yadriggy::reify { 1 + 2 }.tree) # true
syn.check(Yadriggy::reify { 1 + 2 - 3 }.tree) # true
syn.check(Yadriggy::reify { 1 * 2 - 3 }.tree) # false
syn.check(Yadriggy::reify { 1 + a }.tree) # false
A syntax rule is written in a hemi-parasitic DSL that is also implemented with Yadriggy. It takes the following form:
type_name <= constraint
The left operand of <=
is a type name of abstract syntax tree (AST)
node such as Block
and Reserved
. All the node types are
subclasses of ASTnode
. The operator can be not only <=
but
also =
.
The syntax rule specifies that an AST node has to
satisfy the constraint in the right operand of <=
when the
type of that AST node matches the left operand of <=
.
The syntax checker first attempts to find a rule for the class type
of that AST node. If the rule is not found, then it attempts
to find a rule for the super class.
If no syntax rule for the given AST node is found at all,
a syntax error is reported.
The syntax checker supports several kinds of constraints.
constraint | specification |
---|---|
type name | the AST node is a value of that type and it satisfies the constraint for that type. |
hash literal | the attributes of the AST node satisfies the given constraints. |
nil |
the AST node does never exist. |
c1 + c2
|
Both constraints c1 and c2 are satisfied. |
c1 | c2
|
Ordered choice. Either c1 or c2 is satisfied. c1 is tested first. |
For example,
Identifier <= Name
Identifier
is a class for ASTnodes. It is a subclass of Name
.
The rule above specifies that an Identifier
node satisfies the
constraints for Name
. The syntax checker looks for a rule
whose left operand is Name
. If it finds the rule, it checks whether
the constraint of that rule is satisfied or not. If the constraint is
not satisfied, a syntax error will be reported.
If the rule for the type name in the right operand is not found,
no further syntax checking is performed. For the example above,
if the syntax checker cannot find a rule whose left operand is Name
,
then the Identifier
node satisfies the constraint; no syntax error
is reported.
A hash literal specifies a constraint that the attributes of an AST node have to satisfy. For example,
Reserved <= { name: 'nil' }
This specifies that the name
attribute of a Reserved
AST
node has to be a character string 'nil'
. In other words,
Reserved#name
has to return 'nil'
.
The attributes that are not included in the hash literal can be any value. So,
Reserved <= {}
This specifies that no constraint is given to Reserved
;
any AST node of type Reserved
is acceptable.
The hash value for each key is the constraint for the
corresponding attribute.
It can be either a string literal,
a symbol literal, nil
, an array literal, or a type name.
It can be surrounded with parentheses.
The constraints can be combined
with an ordered choice operator |
.
hash value | specification |
---|---|
string literal | the value of the attribute matches that string literal. |
symbol literal | the value of the attribute matches that symbol literal. For example, Binary#op (operator) is a symbol. |
nil |
the attribute is nil or an empty array. |
array literal | the attribute is an array and the elements match the constraint. |
type name | the attribute is a value of that type. |
( c )
|
the constraint c is optional. The attribute is nil or satisfies c. |
c1 | c2
|
ordered choice. The attribute satisfies either c1 or c2. |
The constraint surrounded with parentheses specifies that
it is optional; the attribute is nil
or it satisfies that
constraint.
An array literal specifies that the attribute is an array. It takes the following forms:
array literal | specification |
---|---|
[c] | all the array elements match the constraint c. |
[c0, c1] | the first element matches c0 and the rest of the elements match c1. |
[c0, c1, c2] | the first two elements match c0 and c1, respectively. The rest of the elements match c2. |
[c1 * c2] | Each array element is an array with two AST nodes. The first element of each array element matches c1 and the second element matches c2. |
c | the array contains a single element. It matches c. |
The followings are examples.
Exprs <= { expressions: [ Binary ] }
HashLiteral <= { pairs: [ Label * Number ] }
Return <= { values: Number }
An Exprs
node has the attribute expressions
and this attribute
has to be an array of Binary
nodes.
The pairs
attribute of a HashLiteral
node has to be an array.
Each element of this array has to be an array whose first element
is a Label
node and whose second element is a Number
object.
The values
attribute of a Return
node is an array but the
constraint above specifies that the size of the array is one
and its element is a Number
node.
In Ruby, a type name starts with a upper case. In the syntax rules, a type name starting with a lower case is also available. It is called a user type.
For example,
expr <= Number | Identifier
Exprs <= { expressions: [ expr ] }
The first rule specifies the user type expr
.
It is either Number
or Identifier
.
The second rule specifies that an Exprs
node
has the attribute expressions
.
The value of expressions
is an array of the expr
type.
Each array element is either
a Number
node or an Identifier
node.
A user type can be used as a subtype of AST-node type:
add_expr <= Binary + { op: :+ }
A value of the user type add_expr
is an AST node that is
an instance of Binary
and whose op
attribute is :+
.
When the syntax checker finds an AST node of a user type,
or in other words, an AST node that satisfies the constraints
given by the user type,
then the user type is attached to that AST node.
The attached user type can be obtained
by calling ASTnode#usertype
on the AST node
after syntax checking.
For the example above,
expr <= Number | Identifier
Exprs <= { expressions: [ expr ] }
When the syntax checker finds an Exprs
node matching the
rules above, then the expr
type is attached
to all the array elements of the expressions
attribute.
When an AST node matches multiple user types, the attached type to that AST node is the innermost user type matching the last. For example,
expr <= foo_name | VariableCall
foo_name <= VariableCall + { name: "foo" }
An AST for foo
is a VariableCall
node.
It has the user type foo_name
although
it first matches the user type expr
.
Note that |
is an ordered choice operator.
The left operand of |
is checked before the right operand.
An AST for bar
has the user type expr
since its
name
attribute is not "foo"
but "bar"
.
A hash literal in a syntax rule specifies the constraint that the attributes of an AST node have to satisfy. If the hash value in the hash literal is a (not user-)type name, the value of the corresponding attribute has to be of that type. It also have to satisfy the constraint for that type.
For example,
Unary <= { op: Symbol, operand: Name }
VariableCall <= Name
Reserved <= nil
This syntax rules specifies that the operand of
a Unary
node is an instance of Name
or its subclass
except Reserved
. Note that subclassing is considered.
The first rule specifies that the operand
attribute of
an Unary
node has to be a value of Name
, which is a subclass
of ASTnode
.
If the attribute value is not Name
, then a syntax error will be
reported.
If the attribute value is Name
, then the syntax checker
recursively checks whether that attribute value satisfies the
syntax rules. This syntax checking considers subclasses.
For example, the AST for -x
is a Unary
node with
a VariableCall
node for the operand
attribute.
This AST satisfies the first syntax rule above since VariableCall
is a subclass of Name
.
Then the syntax checker checks whether this VariableCall
node satisfies
the syntax rules.
It looks for a rule for VariableCall
and, if it finds
the rule, it checks whether the VariableCall
node satisfies
that rule. Note that it looks for a rule for
Name
only when it does not find the rule for VariableCall
.
The AST for -Size
is a Unary
node with a Const
node
for the operand
attribute. This AST satisfies the first syntax
rule above. Then the syntax checker looks for a rule
for Const
. Since there is no rule for Const
,
that operand
attribute of class Const
passes the
syntax checking.
The AST for -self
is a Unary
node with a Reserved
node
for the operand
attribute since self
is a reserved word.
This AST satisfies the first syntax rule above but the syntax
checker reports a syntax error when it checks the third syntax
rule. Since the operand
attribute is a Reserved
node,
the syntax checker selects the third rule for checking
the attribute value. However, the right operand of this rule is nil
.
This specifies any AST containing a Reserved
node causes a syntax error.