colada-parser
is a parser for Colada, a controlled natural language (CNL) for the Calculus of Inductive Constructions (CIC). It is still experimental and under heavy development.
This work was carried out as part of the Formal Abstracts project.
The project is contained inside src/Colada
. src/Colada/Basic/
(so modules in the Colada.Basic
namespace) contains low-level parser combinators (Core.hs
), utilities for managing the parser state (State.hs
and ParserState.hs
), and low-level parsers for the basic terminals of the grammar (literals, tokens, punctuation, delimiters) (Token.hs
). These are all re-exported by Colada.Basic.Basic
.
The topmost file in the project is Colada.Tests
. This file imports and re-exports every other file in the project, and so should be loaded in a REPL when interactively developing the parser. The highest-level non-testing file is Colada.ProgramText
. This imports all other parts of the grammar. The basic unit of a Colada document is a TextItem
. The entire document must be parsed as a ProgramText
, which wraps a list of TextItem
s.
Colada.PhraseList
contains utilities for dynamically extending the colada-parser
using a simple markup language (tokens postfixed with a question mark can be optionally parsed, and (tk1 | tk2 | ... | tkn)
will parse any of the |
-separated tokens ). Marked-up input is parsed into the ParserMarkUp
datatype, from which colada-parser
generates a parser for later use in the text. The parser which handles the conversion to ParserMarkUp
ignores OCaml-style line and block comments.
To build colada-parser
, you must have the Haskell tool stack
installed (see here). Navigate to the project directory and run stack build
. After building the parser, you can test it on a CNL-compliant text file with stack exec colada path_to/my_file
. If the parse is successful, the parse tree will be written to my_file.parsed
in the same directory as my_file
.
For example, stack exec colada test/test.txt
should succeed.
Note: in the output, error messages may appear twice in a row with the same offset.
From Emacs, enter C-c C-z
from a Haskell-mode buffer with intero-mode
enabled to open the intero
REPL.
Otherwise, run the command
stack ghci --with-ghc ghci "--docker-run-args=--interactive=true --tty=false" --no-build --no-load colada-parser
in a shell from inside the colada-parser
directory.
Ensure that the OverloadedStrings extension is enabled by e.g. entering :set -XOverloadedStrings
. Load all files in the project with :load Colada.Tests
. To test a parser myParser
, enter
test myParser "my input string here"
at the prompt.