I'd initially wanted to hide the code generation aspects of this
project and just commit the generated `.c` and `.h` files in the `src/`
directory. However, this is disingenuous and probably hinders other people
playing with the code, so I've made it official.
There was always a `generated/` directory created by `make` that hosts the
generated flex and bison output, so it was simple enough to generate
additional code into there instead of into `src/`, and remove those
generated files from git.
So what's the code generation for? It just removes the need to maintain
a ton of boilerplate code around structures used by the project. There
are a number of `.yaml` files in the `src/` directory which basically
declare C structs and their typedefs. At the time of writing they are:
- `anf.yaml`: A-Normal form structures input to the bytecode compiler, generated from the lambda structures.
- `ast.yaml`: The abstract syntax tree generated by the parser.
- `builtins.yaml`: Support for registering built-ins.
- `lambda.yaml`: Lambda calculus-like structures generated from the AST.
- `tc.yaml`: Type checking support for Algorithm W.
- `tpmc.yaml`: Term Pattern Matching Compiler support structures, part of lambda conversion.
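
To make "declare C structs and their typedefs" a little more concrete, here is a minimal Python sketch of how one declaration might be rendered as C. Everything in it is an assumption made for illustration: the dictionary layout, the `AstExample` struct and the `render_struct` helper are hypothetical, and a plain dict stands in for parsed YAML so the sketch has no dependencies. The real `.yaml` schema may look quite different.

```python
# Hypothetical declaration, standing in for one entry parsed from a .yaml file.
# The real schema used by the project may differ.
DECLARATION = {
    "name": "AstExample",
    "fields": {"symbol": "char *", "value": "int", "next": "struct AstExample *"},
}


def render_struct(decl):
    """Render a typedef'd C struct from a declaration dict (illustrative only)."""
    lines = [f"typedef struct {decl['name']} {{"]
    for field_name, c_type in decl["fields"].items():
        # Pointer types already end in '*', so no separating space is needed.
        sep = "" if c_type.endswith("*") else " "
        lines.append(f"    {c_type}{sep}{field_name};")
    lines.append(f"}} {decl['name']};")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_struct(DECLARATION))
```
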
For example `ast.yaml` contains the declarations for the
abstract syntax tree generated by the parser. A Python script
`makeAST.py` is given each of those yaml files and
generates the same set of `.c` and `.h` files for each. Continuing with
the `ast.yaml` example, from that file will be generated:
- `generated/ast.c`: a number of different functions for each structure:
  - `new<struct>()` functions that allocate memory and populate the allocated structs with argument values.
  - `copy<struct>()` functions that will make a deep copy of the struct.
  - `push<struct>()` functions that will push data onto any declared 1-dimensional arrays.
  - `mark<struct>()` functions that will recursively mark the structures as part of garbage collection.
  - a generic `mark` function that will switch on the type and call the correct `mark` function.
  - `free<struct>` functions that will release unused memory when requested by the garbage collection system.
  - a generic `free` function that dispatches to the correct `free<struct>` function.
  - a `typename` function that will return the name of a struct for debugging etc.
- `generated/ast_debug.c`: debugging utilities, namely:
  - `print<struct>()` functions that will recursively display a representation of the struct for debugging.
  - `eq<struct>()` functions that perform deep comparisons for testing and debugging.
- `generated/ast_debug.h`: header for `ast_debug.c`.
- `generated/ast.h`: header for `ast.c`, includes the structure declarations themselves.
- `generated/ast_objtypes.h`: macros collecting the enums and case statements that can then easily be incorporated into the memory management system.
- `docs/ast.md`: A mermaid graph of the structs and their relationships (WiP, occasionally useful).
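
The kind of boilerplate listed above can be sketched with a small Python generator. This is not the project's actual generator: the struct names, the `allocate` call in the emitted C, the `Header` type with its `type` field, the `markObj` name and the `OBJTYPE_` enum prefix are all assumptions made for illustration. The sketch emits a `new<struct>()` constructor for one struct and a generic mark function that switches on a type tag.

```python
# Hypothetical struct descriptions; the real generator reads these from .yaml files.
STRUCTS = {
    "AstPair": [("int", "left"), ("int", "right")],
    "AstName": [("char *", "text")],
}


def c_decl(c_type, name):
    """Format a C declaration, keeping '*' attached to the name."""
    return f"{c_type}{name}" if c_type.endswith("*") else f"{c_type} {name}"


def render_new(struct, fields):
    """Emit a new<struct>() constructor that allocates and populates the struct."""
    params = ", ".join(c_decl(t, n) for t, n in fields)
    body = [
        f"{struct} *new{struct}({params}) {{",
        # 'allocate' is an assumed allocator name, not the project's real API.
        f"    {struct} *x = allocate(sizeof({struct}));",
    ]
    body += [f"    x->{n} = {n};" for _, n in fields]
    body += ["    return x;", "}"]
    return "\n".join(body)


def render_mark_dispatch(structs):
    """Emit a generic mark function that switches on an assumed type tag."""
    out = ["void markObj(Header *h) {", "    switch (h->type) {"]
    for struct in structs:
        out.append(f"        case OBJTYPE_{struct.upper()}:")
        out.append(f"            mark{struct}(({struct} *)h);")
        out.append("            break;")
    out += ["    }", "}"]
    return "\n".join(out)


if __name__ == "__main__":
    print(render_new("AstPair", STRUCTS["AstPair"]))
    print(render_mark_dispatch(STRUCTS))
```

Because every struct is rendered by the same functions, adding a field to a declaration in a scheme like this changes the constructor signature and body in one step, with no hand edits to the emitted C.
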
This all means that it's relatively easy to make fairly sweeping changes to the various trees without all the tedious re-writing of the above.