TUTO~2d: parser, version 2

famished-tiger edited this page Feb 6, 2022 · 1 revision

Unchanged parser!

That's right: the parser for this iteration is identical to the previous one.
How is this even possible?
Well, given a grammar and an appropriate tokenizer, Rley can parse the input and convert it into a parse tree.

For the skeptics, here again is the code of the parser's parse method:

  def parse(source)
    tokenizer =
    result = engine.parse(tokenizer.tokens)

    unless result.success?
      # Stop if the parse failed...
      line1 = "Parsing failed\n"
      line2 = "Reason: #{result.failure_reason.message}"
      raise SyntaxError, line1 + line2
    end
    # ...
  end


The structure of the snippet does not depend on the language being parsed. The real language dependency is isolated in the grammar and the tokenizer class.
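
To make that separation concrete, here is a minimal, self-contained sketch. It does not use Rley: `GenericParser`, `ToyEngine`, `WordTokenizer`, `ParseResult` and `Failure` are hypothetical stand-ins invented for this illustration. They only mimic the calls visible in the snippet above (`tokens`, `parse`, `success?`, `failure_reason.message`).

```ruby
# Hypothetical stand-ins, NOT part of Rley: they mimic only the calls
# that appear in the parse method shown above.
Failure = Struct.new(:message)

ParseResult = Struct.new(:failure_reason) do
  def success?
    failure_reason.nil?
  end
end

# Toy tokenizer: splits the source on whitespace.
class WordTokenizer
  def initialize(source)
    @source = source
  end

  def tokens
    @source.split
  end
end

# Toy engine: accepts any non-empty token sequence.
class ToyEngine
  def parse(tokens)
    return ParseResult.new(nil) unless tokens.empty?

    ParseResult.new(Failure.new('no tokens'))
  end
end

# The parser itself knows nothing about the language:
# the engine (grammar) and the tokenizer class are injected.
class GenericParser
  def initialize(engine, tokenizer_class)
    @engine = engine
    @tokenizer_class = tokenizer_class
  end

  def parse(source)
    tokenizer = @tokenizer_class.new(source)
    result = @engine.parse(tokenizer.tokens)

    unless result.success?
      line1 = "Parsing failed\n"
      line2 = "Reason: #{result.failure_reason.message}"
      raise SyntaxError, line1 + line2
    end

    result
  end
end

parser = GenericParser.new(ToyEngine.new, WordTokenizer)
puts parser.parse('title = "TOML Example"').success? # prints true
```

Swapping in another engine or tokenizer class changes the language that is accepted, while `parse` stays untouched. That is the property this iteration relies on.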

Demo time

The iteration 2 source files include a small demo script: toml_demo.rb.

The script parses the following TOML text:

  # This is a TOML document

  title = "TOML Example"

  [owner]
  name = "Thomas O'Malley"

  [database]
  enabled = true
  ports = [ 8000, 8001, 8002 ]
  data = [ ["delta", "phi"], [3.14] ]

and renders the resulting parse tree as an ASCII tree:

+-- expr-list
    +-- expr-list
    |   +-- expr-list
    |   |   +-- expr-list
    |   |   |   +-- expr-list
    |   |   |   |   +-- expr-list
    |   |   |   |   |   +-- expr-list
    |   |   |   |   |   |   +-- expr-list
    |   |   |   |   |   |   +-- expression
    |   |   |   |   |   |       +-- keyval
    |   |   |   |   |   |           +-- key
    |   |   |   |   |   |           |   +-- UNQUOTED-KEY: 'title'
    |   |   |   |   |   |           +-- EQUAL: '='
    |   |   |   |   |   |           +-- val
    |   |   |   |   |   |               +-- STRING: '"TOML Example"'
    |   |   |   |   |   +-- expression
    |   |   |   |   |       +-- table
    |   |   |   |   |           +-- std-table
    |   |   |   |   |               +-- LBRACKET: '['
    |   |   |   |   |               +-- key
    |   |   |   |   |               |   +-- UNQUOTED-KEY: 'owner'
    |   |   |   |   |               +-- RBRACKET: ']'
    |   |   |   |   +-- expression
    |   |   |   |       +-- keyval
    |   |   |   |           +-- key
    |   |   |   |           |   +-- UNQUOTED-KEY: 'name'
    |   |   |   |           +-- EQUAL: '='
    |   |   |   |           +-- val
    |   |   |   |               +-- STRING: '"Thomas O'Malley"'
    |   |   |   +-- expression
    |   |   |       +-- table
    |   |   |           +-- std-table
    |   |   |               +-- LBRACKET: '['
    |   |   |               +-- key
    |   |   |               |   +-- UNQUOTED-KEY: 'database'
    |   |   |               +-- RBRACKET: ']'
    |   |   +-- expression
    |   |       +-- keyval
    |   |           +-- key
    |   |           |   +-- UNQUOTED-KEY: 'enabled'
    |   |           +-- EQUAL: '='
    |   |           +-- val
    |   |               +-- BOOLEAN: 'true'
    |   +-- expression
    |       +-- keyval
    |           +-- key
    |           |   +-- UNQUOTED-KEY: 'ports'
    |           +-- EQUAL: '='
    |           +-- val
    |               +-- array
    |                   +-- LBRACKET: '['
    |                   +-- array-values
    |                   |   +-- array-values
    |                   |   |   +-- array-values
    |                   |   |   |   +-- val
    |                   |   |   |       +-- INTEGER: '8000'
    |                   |   |   +-- COMMA: ','
    |                   |   |   +-- val
    |                   |   |       +-- INTEGER: '8001'
    |                   |   +-- COMMA: ','
    |                   |   +-- val
    |                   |       +-- INTEGER: '8002'
    |                   +-- RBRACKET: ']'
    +-- expression
        +-- keyval
            +-- key
            |   +-- UNQUOTED-KEY: 'data'
            +-- EQUAL: '='
            +-- val
                +-- array
                    +-- LBRACKET: '['
                    +-- array-values
                    |   +-- array-values
                    |   |   +-- val
                    |   |       +-- array
                    |   |           +-- LBRACKET: '['
                    |   |           +-- array-values
                    |   |           |   +-- array-values
                    |   |           |   |   +-- val
                    |   |           |   |       +-- STRING: '"delta"'
                    |   |           |   +-- COMMA: ','
                    |   |           |   +-- val
                    |   |           |       +-- STRING: '"phi"'
                    |   |           +-- RBRACKET: ']'
                    |   +-- COMMA: ','
                    |   +-- val
                    |       +-- array
                    |           +-- LBRACKET: '['
                    |           +-- array-values
                    |           |   +-- val
                    |           |       +-- FLOAT: '3.14'
                    |           +-- RBRACKET: ']'
                    +-- RBRACKET: ']'
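
For readers curious about how such a drawing can be produced, here is a minimal sketch of a recursive renderer that emits the same `+--` and `|` layout. It is not the code used by toml_demo.rb: the `Node` struct and the `ascii_tree` method are hypothetical names made up for this illustration.

```ruby
# Hypothetical tree node for illustration: a label plus child nodes.
Node = Struct.new(:label, :children)

# Returns the lines of the ASCII drawing for the subtree rooted at node.
# prefix is the indentation inherited from the ancestors;
# last tells whether node is the last child of its parent.
def ascii_tree(node, prefix = '', last = true, lines = [])
  lines << "#{prefix}+-- #{node.label}"
  # Below a non-last child, keep a vertical bar down to its next sibling.
  child_prefix = prefix + (last ? '    ' : '|   ')
  kids = node.children || []
  kids.each_with_index do |kid, i|
    ascii_tree(kid, child_prefix, i == kids.size - 1, lines)
  end
  lines
end

keyval = Node.new('keyval', [
  Node.new('key', [Node.new("UNQUOTED-KEY: 'enabled'", [])]),
  Node.new("EQUAL: '='", []),
  Node.new('val', [Node.new("BOOLEAN: 'true'", [])])
])

puts ascii_tree(keyval)
```

Each node contributes one line. The only state carried down the recursion is the prefix, which gains a `|   ` segment under a child that still has siblings to come and four spaces under the last child.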

We have reached the end of this iteration: we created a parser that covers a significant part of the TOML language.
In iteration 3, we'll extend the coverage of the language further.