Significant Whitespace Parsing
John Gietzen edited this page Mar 30, 2019 · 8 revisions
Significant whitespace parsing is possible in Pegasus by making use of state (#{}) code regions.
The principle is that you can make a rule (called INDENTATION below) that matches an exact number of spaces. The number it expects is changed by entering and exiting rules that modify the stored indentation.
The gist of the idea is:

program
  = #{ state["Indentation"] = 0; } otherRules

INDENTATION
  = spaces:" "* &{ spaces.Count == state["Indentation"] }

INDENT
  = #{ state["Indentation"] += 4; }

UNDENT
  = #{ state["Indentation"] -= 4; }
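To make the counter idea concrete outside of a grammar, here is a minimal hand-written sketch in plain C#. It is not code generated by Pegasus, and the names IndentationSketch and MatchesIndentation are hypothetical. It only mirrors what the rules above express: consume the leading spaces, compare their count with the expected indentation, and adjust that expectation by 4 on INDENT and UNDENT.

```csharp
using System;
using System.Linq;

public static class IndentationSketch
{
    // Mirrors the INDENTATION rule: take all leading spaces, then
    // succeed only if their count equals the expected indentation.
    public static bool MatchesIndentation(string line, int expected)
    {
        int spaces = line.TakeWhile(c => c == ' ').Count();
        return spaces == expected;
    }

    public static void Main()
    {
        int indentation = 0;                                              // program: state["Indentation"] = 0
        Console.WriteLine(MatchesIndentation("a = b", indentation));      // True

        indentation += 4;                                                 // INDENT
        Console.WriteLine(MatchesIndentation("    a = b", indentation));  // True
        Console.WriteLine(MatchesIndentation("a = b", indentation));      // False

        indentation -= 4;                                                 // UNDENT
        Console.WriteLine(MatchesIndentation("a = b", indentation));      // True
    }
}
```

A line that is indented too little or too much simply fails the check, which is what lets a block of statements end when the indentation drops.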
It would be feasible to use an immutable Stack<int> rather than a simple int in order to allow variable-sized indentation, but (since the .NET Framework base class library doesn't include an immutable generic stack out of the box) this has been omitted for simplicity.
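As a rough illustration of the stack-based alternative, the sketch below uses ImmutableStack<int> from the separate System.Collections.Immutable package, which is an assumption and not something the grammar on this page depends on. The class name IndentStackSketch and the indentation widths are hypothetical. Each push records the absolute indentation of a newly entered block, and popping returns to the enclosing level without modifying any stack another reference still holds.

```csharp
using System;
using System.Collections.Immutable;

public static class IndentStackSketch
{
    public static void Main()
    {
        // Start at column 0, as the program rule does with the int counter.
        ImmutableStack<int> levels = ImmutableStack<int>.Empty.Push(0);

        // Entering blocks of different widths: push the new absolute indentation.
        ImmutableStack<int> inner = levels.Push(2);   // a block indented by 2
        ImmutableStack<int> deeper = inner.Push(5);   // a nested block indented by 3 more

        Console.WriteLine(deeper.Peek());        // 5
        Console.WriteLine(deeper.Pop().Peek());  // 2
        Console.WriteLine(levels.Peek());        // 0 (the original stack is unchanged)
    }
}
```

Because every Push and Pop returns a new stack, an earlier stack value stays valid if the parser has to return to an earlier position.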
Here is a working prototype of significant whitespace parsing:
Significant.peg:
@namespace PegExamples
@classname SignificantWhitespaceParser
@using Pegasus.Common
@using System.Linq

program <object>
  = #{ state["Indentation"] = 0; } s:statements eof { s }

statements
  = line+

line <object>
  = INDENTATION s:statement { s }

statement <object>
  = s:simpleStatement eol { s }
  / "if" _ n:name _? ":" eol INDENT s:statements UNDENT { new { Condition = n, Statements = s } }
  / "def" _ n:name _? ":" eol INDENT s:statements UNDENT { new { Name = n, Statements = s } }

simpleStatement <object>
  = a:name _? "=" _? b:name { new { LValue = a, Expression = b } }

name
  = n:([a-zA-Z] [a-zA-Z0-9]*) { n }

_ = [ \t]+
eol = _? comment? ("\r\n" / "\n\r" / "\r" / "\n" / eof)
comment = "//" [^\r\n]*
eof = !.

INDENTATION
  = spaces:" "* &{ spaces.Count == state["Indentation"] }

INDENT
  = #{ state["Indentation"] += 4; }

UNDENT
  = #{ state["Indentation"] -= 4; }
Program.cs:
using System;
using System.IO;
using Newtonsoft.Json;
using PegExamples; // namespace declared by @namespace in Significant.peg

namespace PegTest
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            try
            {
                // Parse the sample input and dump the resulting object graph as JSON.
                var result = new SignificantWhitespaceParser().Parse(File.ReadAllText("test.txt"));
                Console.WriteLine(JsonConvert.SerializeObject(result, Formatting.Indented));
            }
            catch (FormatException ex)
            {
                // Parse failures are reported as FormatException.
                Console.WriteLine(ex.Message);
            }

            Console.ReadKey(true);
        }
    }
}
test.txt:
a = b
if a:
    a = b
    if q:
        a = z
    d = f
b = c
def q:
    a = c
c = d