TUTO~2d: parser, version 2

famished-tiger edited this page Feb 6, 2022 · 1 revision

Unchanged parser!

That's right: the parser for this iteration is identical to the previous one.
How is this even possible?
Well, given a grammar and an appropriate tokenizer, Rley can parse the input and convert it into a parse tree.

For the skeptics, here is the code of the parser's parse method again:

  def parse(source)
    tokenizer = TOMLTokenizer.new(source)
    result = engine.parse(tokenizer.tokens)

    unless result.success?
      # Stop if the parse failed...
      line1 = "Parsing failed\n"
      line2 = "Reason: #{result.failure_reason.message}"
      raise SyntaxError, line1 + line2
    end

    engine.convert(result)
  end

The structure of the snippet is independent of the language being parsed. The language-specific parts are isolated in the grammar and the tokenizer class.
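
To make that separation concrete, here is a minimal sketch in which the engine and the tokenizer class are passed in from outside. It is not code from the tutorial: GenericParser, StubEngine, StubResult and WordTokenizer are hypothetical names, and the stubs only mimic the few methods that the parse method calls (tokens, parse, success?, failure_reason, convert).

```ruby
# Hypothetical generic front end: the engine (which holds the grammar) and
# the tokenizer class are injected, so #parse never mentions TOML.
class GenericParser
  def initialize(engine, tokenizer_class)
    @engine = engine
    @tokenizer_class = tokenizer_class
  end

  def parse(source)
    tokenizer = @tokenizer_class.new(source)
    result = @engine.parse(tokenizer.tokens)

    unless result.success?
      # Stop if the parse failed...
      raise SyntaxError, "Parsing failed\nReason: #{result.failure_reason.message}"
    end

    @engine.convert(result)
  end
end

# Stand-ins for illustration only (assumptions, not Rley classes).
StubResult = Struct.new(:tokens, :failure_reason) do
  def success?
    failure_reason.nil?
  end
end

class StubEngine
  # Pretend that any non-empty token sequence is valid.
  def parse(tokens)
    reason = tokens.empty? ? StandardError.new('no input') : nil
    StubResult.new(tokens, reason)
  end

  # A real engine would build a parse tree here; the stub returns the tokens.
  def convert(result)
    result.tokens
  end
end

class WordTokenizer
  def initialize(source)
    @source = source
  end

  # Split the source on whitespace; a real tokenizer would emit token objects.
  def tokens
    @source.split
  end
end

parser = GenericParser.new(StubEngine.new, WordTokenizer)
p parser.parse('title = "TOML"')
```

Swapping in another tokenizer class and another engine changes the language being parsed while the body of parse stays untouched, which is the point the tutorial makes about the TOML parser.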

Demo time

The source code files for iteration 2 include a small demo script: toml_demo.rb.

The script parses the following TOML text:

  # This is a TOML document

  title = "TOML Example"

  [owner]
  name = "Thomas O'Malley"

  [database]
  enabled = true
  ports = [ 8000, 8001, 8002 ]
  data = [ ["delta", "phi"], [3.14] ]

and renders the resulting parse tree as an ASCII tree:

toml
+-- expr-list
    +-- expr-list
    |   +-- expr-list
    |   |   +-- expr-list
    |   |   |   +-- expr-list
    |   |   |   |   +-- expr-list
    |   |   |   |   |   +-- expr-list
    |   |   |   |   |   |   +-- expr-list
    |   |   |   |   |   |   +-- expression
    |   |   |   |   |   |       +-- keyval
    |   |   |   |   |   |           +-- key
    |   |   |   |   |   |           |   +-- UNQUOTED-KEY: 'title'
    |   |   |   |   |   |           +-- EQUAL: '='
    |   |   |   |   |   |           +-- val
    |   |   |   |   |   |               +-- STRING: '"TOML Example"'
    |   |   |   |   |   +-- expression
    |   |   |   |   |       +-- table
    |   |   |   |   |           +-- std-table
    |   |   |   |   |               +-- LBRACKET: '['
    |   |   |   |   |               +-- key
    |   |   |   |   |               |   +-- UNQUOTED-KEY: 'owner'
    |   |   |   |   |               +-- RBRACKET: ']'
    |   |   |   |   +-- expression
    |   |   |   |       +-- keyval
    |   |   |   |           +-- key
    |   |   |   |           |   +-- UNQUOTED-KEY: 'name'
    |   |   |   |           +-- EQUAL: '='
    |   |   |   |           +-- val
    |   |   |   |               +-- STRING: '"Thomas O'Malley"'
    |   |   |   +-- expression
    |   |   |       +-- table
    |   |   |           +-- std-table
    |   |   |               +-- LBRACKET: '['
    |   |   |               +-- key
    |   |   |               |   +-- UNQUOTED-KEY: 'database'
    |   |   |               +-- RBRACKET: ']'
    |   |   +-- expression
    |   |       +-- keyval
    |   |           +-- key
    |   |           |   +-- UNQUOTED-KEY: 'enabled'
    |   |           +-- EQUAL: '='
    |   |           +-- val
    |   |               +-- BOOLEAN: 'true'
    |   +-- expression
    |       +-- keyval
    |           +-- key
    |           |   +-- UNQUOTED-KEY: 'ports'
    |           +-- EQUAL: '='
    |           +-- val
    |               +-- array
    |                   +-- LBRACKET: '['
    |                   +-- array-values
    |                   |   +-- array-values
    |                   |   |   +-- array-values
    |                   |   |   |   +-- val
    |                   |   |   |       +-- INTEGER: '8000'
    |                   |   |   +-- COMMA: ','
    |                   |   |   +-- val
    |                   |   |       +-- INTEGER: '8001'
    |                   |   +-- COMMA: ','
    |                   |   +-- val
    |                   |       +-- INTEGER: '8002'
    |                   +-- RBRACKET: ']'
    +-- expression
        +-- keyval
            +-- key
            |   +-- UNQUOTED-KEY: 'data'
            +-- EQUAL: '='
            +-- val
                +-- array
                    +-- LBRACKET: '['
                    +-- array-values
                    |   +-- array-values
                    |   |   +-- val
                    |   |       +-- array
                    |   |           +-- LBRACKET: '['
                    |   |           +-- array-values
                    |   |           |   +-- array-values
                    |   |           |   |   +-- val
                    |   |           |   |       +-- STRING: '"delta"'
                    |   |           |   +-- COMMA: ','
                    |   |           |   +-- val
                    |   |           |       +-- STRING: '"phi"'
                    |   |           +-- RBRACKET: ']'
                    |   +-- COMMA: ','
                    |   +-- val
                    |       +-- array
                    |           +-- LBRACKET: '['
                    |           +-- array-values
                    |           |   +-- val
                    |           |       +-- FLOAT: '3.14'
                    |           +-- RBRACKET: ']'
                    +-- RBRACKET: ']'
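
For readers curious about how such a rendering can be produced, the following is a minimal, self-contained sketch of an ASCII tree printer that uses the same layout conventions as the output above. It is an illustration only and not the formatter used by the demo script: TreeNode and ascii_tree are hypothetical names, and the sample tree is built by hand instead of coming from a parse.

```ruby
# Hypothetical stand-in for a parse tree node (not a Rley class).
TreeNode = Struct.new(:label, :children)

# Return the lines of an ASCII rendering of the tree rooted at `node`.
# Children of a last sibling are indented with four spaces; children of
# any other sibling are indented with '|   ' so the branch stays connected.
def ascii_tree(node, prefix = '', is_root = true, is_last = true)
  lines = []
  if is_root
    lines << node.label
    child_prefix = ''
  else
    lines << "#{prefix}+-- #{node.label}"
    child_prefix = prefix + (is_last ? '    ' : '|   ')
  end

  kids = node.children || []
  kids.each_with_index do |child, idx|
    lines.concat(ascii_tree(child, child_prefix, false, idx == kids.size - 1))
  end
  lines
end

# Hand-built subtree mirroring the first key/value pair of the demo input.
tree = TreeNode.new('keyval', [
  TreeNode.new('key', [TreeNode.new("UNQUOTED-KEY: 'title'", [])]),
  TreeNode.new("EQUAL: '='", []),
  TreeNode.new('val', [TreeNode.new("STRING: '\"TOML Example\"'", [])])
])

puts ascii_tree(tree)
```

The recursion carries a prefix string down the tree, so each line can be emitted without knowing anything about the nodes below it.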

We have reached the end of this iteration: we created a parser that covers a significant part of the TOML language.
In iteration 3, we'll extend the coverage of the language further.