Skip to content

Latest commit

 

History

History
155 lines (118 loc) · 4 KB

parser.md

File metadata and controls

155 lines (118 loc) · 4 KB

Chapter 3: Parser Layout

Back: Structure&AST | Chapters | Next: Parser Implementation


Let's first define a parser structure, which helps us traverse the input, skipping whitespaces, aswell as generating a helpful error message if we unexpectedly reach the end of input.

Parser
class Parser:
    def __init__(self, text: str):
        self.text = text # the input text expression
        self.index = 0   # the index we are currently at
    
    def next(self):
        self.index += 1
        # let's skip the whitespaces - we do not have any need for those
        # and they just complicate parsing.
        while self.has_current() and self.current().isspace():
            self.index += 1

    def current(self) -> str:
        if not self.has_current():
            raise ParseException('Unexpected end of input', self)
        return self.text[self.index]

    # are we still in range?
    def has_current(self) -> bool:
        return self.index < len(self.text)

We can now iterate over the input like this:

Example
from calculator.parser import Parser

parser = Parser('sin(pi * x + 4)')
out = ''
while parser.has_current():
    out += parser.current()
    parser.next()
print(out)
sin(pi+4)*x

We can also define an exception class to inform the user about syntax errors

ParseException
class ParseException:
    def __init__(self, message: str, parser: 'Parser'):
        self.message = message
        # passing along the source will give us helpful metadata 
        # to format our exception (a task for later)
        self.parser = parser

    def __str__(self):
        return f'ParseException: {self.message}'

Our parser needs a parsing method for each AST node, aswell as a generic parse_expression node to switch between the node types.

Parser
class Parser:
    ...
    def parse_expression(self) -> AST:
        raise ParseException('Unimplemented!', self)

    def parse_value(self) -> Value:
        raise ParseException('Unimplemented!', self)

    def parse_variable(self) -> Variable:
        raise ParseException('Unimplemented!', self)

    def parse_func_call(self) -> FuncCall:
        raise ParseException('Unimplemented!', self)

    def parse_binary_op(self) -> BinaryOp:
        raise ParseException('Unimplemented!', self)

    def parse_unary_op(self) -> UnaryOp:
        raise ParseException('Unimplemented!', self)

And finally with all that we can define the parsing entrypoint

parse
def parse(text: str) -> AST:
    parser = Parser(text)
    ast = parser.parse_expression()
    if parser.has_more():
        raise ParseException('Still more to parse', parser)
    return ast

Now we can finally run the parser.
It, of course, fails immediately:

calculator.parser.ParseException: Unimplemented!

An immediate problem that we can see is that we dont know where we failed. This makes debugging, both when coding and using the parser very difficult.

Luckily, our exception class knows where we are since we give it our Parser object on construction.
This makes it easy to adjust the __str__ method.

ParseException
class ParseException(Exception):
    ...
    def __str__(self):
        message = self.message
        text =  '| ' +self.parser.text
        arrow = '|-' + '-'*self.parser.index + '^'
        return f'{message}\n{text}\n{arrow}'

We now show the source text aswell as an arrow to the specific location where the error occured.

calculator.parser.ParseException: Unimplemented!
| sin(pi * x + 4)
|-^

This is much better! Now we can actually implement the parsing.


Back: Structure&AST | Chapters | Next: Parser Implementation