TUTO~1b: On key value pairs

famished-tiger edited this page Feb 18, 2022 · 6 revisions

The TOML format is centred on key-value pairs. In fact, a TOML file is basically a hash (aka map or associative array).
Therefore, the following line:

title = "TOML Example"

must not be read as assigning the string "TOML Example" to a variable named title. It is a key-value pair: title is the key and the text "TOML Example" is the associated value.
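
To make the distinction concrete, here is a minimal Ruby sketch of what that line amounts to once it has been parsed. The constant name TOML_DOCUMENT is only illustrative; it is not part of the tutorial's code.

```ruby
# A parsed TOML file behaves like a Hash: keys map to values.
# TOML_DOCUMENT is an illustrative name, not part of the tutorial's code.
TOML_DOCUMENT = { 'title' => 'TOML Example' }.freeze

# "title" is a key looked up in the hash, not a Ruby variable.
puts TOML_DOCUMENT.key?('title')   # prints: true
puts TOML_DOCUMENT['title']        # prints: TOML Example
```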

Keys and values are implemented by two dedicated classes, TOMLKey and TOMLDatatype, respectively.

The TOML key class hierarchy

TOML defines several "flavours" of keys (bare, quoted and dotted), which is why the UnquotedKey class is a subclass of TOMLKey.
[Figure: Key class hierarchy]
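
As a rough illustration of that relationship, the two classes could look like the sketch below. Only the class names come from the text; the constructor and the value attribute are assumptions made for this example, and the real classes may differ.

```ruby
# Hypothetical, simplified sketch: only the class names come from the tutorial.
class TOMLKey
  # The key text (an assumed attribute).
  attr_reader :value

  def initialize(text)
    @value = text
  end
end

# An unquoted (bare) key such as `title`.
class UnquotedKey < TOMLKey
end

key = UnquotedKey.new('title')
puts key.value            # prints: title
puts key.is_a?(TOMLKey)   # prints: true
```

Other key flavours would then be added as further subclasses of TOMLKey.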

The TOML Datatype class hierarchy

Each TOML data type supported in iteration 1 has a dedicated class.
All these classes descend from the TOMLDatatype class; the hierarchy is depicted below.

[Figure: TOMLDatatype hierarchy]

The TOMLDatatype class

The next class diagram provides an overview of the TOMLDatatype class.

[Figure: TOMLDatatype class]

The class defines an attribute, value, which each sub-class initializes with an instance of the core Ruby class that represents the corresponding TOML value.
For instance, the TOMLBoolean class sets value to the Ruby true or false, according to its TOML counterpart. The TOMLDatatype hierarchy can be seen as an implementation of the Adapter design pattern.
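
The adapter idea can be sketched as follows. This is a deliberately stripped-down illustration: the constructor signature is an assumption, and the sketch performs no validation of the input text.

```ruby
# Simplified sketch of the adapter idea; not the tutorial's actual code.
class TOMLDatatype
  # The wrapped core Ruby object.
  attr_reader :value
end

class TOMLBoolean < TOMLDatatype
  # Maps the TOML literal "true" to Ruby true, anything else to false.
  # (No validation here; a real implementation would reject bad input.)
  def initialize(text)
    @value = (text == 'true')
  end
end

flag = TOMLBoolean.new('true')
puts flag.value.inspect   # prints: true
puts flag.value.class     # prints: TrueClass
```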

Let's take a quick look at some methods of TOMLDatatype:

  • The initialize method takes two arguments: a String holding the TOML representation of a literal value, and an optional Symbol that helps cope with different formats (say, an integer in hexadecimal format).
    Internally, that method calls the protected method validated_value, which can be overridden in sub-classes.
  • == method, useful to test the equality between two TOML data values.
  • to_str returns a text representation of the data value.
  • accept method, used to implement the Visitor design pattern. More on this in iteration 4.
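
The four methods listed above might fit together as in the sketch below. It is a simplified approximation, not the tutorial's source: the :hex format symbol, the TOMLInteger conversion logic and the visit_datatype callback name are all assumptions chosen for illustration.

```ruby
# Simplified, hypothetical sketch of the methods described above.
class TOMLDatatype
  attr_reader :value

  # text:   TOML representation of the literal
  # format: optional Symbol hinting at the literal's format
  def initialize(text, format = nil)
    @value = validated_value(text, format)
  end

  # Two data values are equal when their wrapped Ruby values are equal.
  def ==(other)
    other.is_a?(TOMLDatatype) && value == other.value
  end

  # Text representation of the data value.
  def to_str
    value.to_s
  end

  # Visitor pattern hook; `visit_datatype` is an assumed callback name.
  def accept(visitor)
    visitor.visit_datatype(self)
  end

  protected

  # Default conversion: keep the text as is. Sub-classes override this.
  def validated_value(text, _format)
    text
  end
end

class TOMLInteger < TOMLDatatype
  protected

  # Converts the literal to a Ruby Integer; :hex is an assumed format symbol.
  def validated_value(text, format)
    format == :hex ? Integer(text.sub(/\A0x/, ''), 16) : Integer(text, 10)
  end
end

hex = TOMLInteger.new('0xff', :hex)
dec = TOMLInteger.new('255')
puts hex.value    # prints: 255
puts hex == dec   # prints: true
puts hex.to_str   # prints: 255
```

Because the conversion lives in validated_value, a sub-class only has to override that one method; initialize, ==, to_str and accept are inherited unchanged.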

[Design note] Is it worth it?

Isn't it overkill to define a class for every TOML data type when each one is so close to its Ruby counterpart?
In the context of a TOML parser, it is indeed true that the datatype classes bring limited added value.
We could live without them...
But this is less true if your goal is to build a full-fledged programming language interpreter or compiler.
So, what are the pros and cons of using a Datatype class hierarchy?

On the downside, we have:

  • More classes to implement
  • Some classes may add little functionality compared to the Ruby built-in data types.

On the upside:

  • Simpler tokenizer implementation: the details of converting a TOML literal into a Ruby value object are hidden in the datatype classes.
  • Improved modularity, which in turn improves testability.
  • Room for adding specific behavior required by your programming language interpreter / compiler.

This concludes the implementation of the TOML data types.
For the curious, the source file can be reached here.

In the next installment, we'll see how to implement an initial tokenizer.