Compiler restructuring #1
base: main
Commits on Jan 13, 2022
Commit 763de2e
Commit 42c237a
Commit 9c443c1
Commit 48124f4
Commit 9d8132d
I've reworked most of the compiler to make it more maintainable and less complex. The new compiler has a text-based IR, which allows for a modular architecture: users can plug in their own passes easily and choose the ordering of existing passes. There is currently no codegen for x86-64; this is just a foundation for the compiler going forward. The plan is to implement another IR in three-address-code form for backends to consume, which should ease register allocation and lowering to assembly.
- New modular architecture
- Simplified implementation
- Stack-based IR with textual format
Commit d1fcad1
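As a rough illustration of why pluggable, reorderable passes matter, here is a minimal sketch. It is not the project's code: the `IR` model (one string per instruction), the `Pass` alias, the instruction names and both example passes are assumptions made up for this example.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical model: the textual IR is a list of instruction strings.
using IR = std::vector<std::string>;

// A pass is any callable that maps one IR to another.
using Pass = std::function<IR(const IR&)>;

// Illustrative pass: drop "nop" instructions.
inline IR remove_nops(const IR& in) {
    IR out;
    for (const auto& ins : in)
        if (ins != "nop") out.push_back(ins);
    return out;
}

// Illustrative pass: cancel a "push <n>" that is immediately followed by "pop".
inline IR cancel_push_pop(const IR& in) {
    IR out;
    for (const auto& ins : in) {
        if (ins == "pop" && !out.empty() && out.back().rfind("push ", 0) == 0)
            out.pop_back();  // the pair has no net effect, so drop both
        else
            out.push_back(ins);
    }
    return out;
}

// Run the passes in the order the caller chose.
inline IR run_pipeline(IR ir, const std::vector<Pass>& passes) {
    for (const auto& pass : passes) {
        assert(pass && "every entry in the pipeline must be callable");
        ir = pass(ir);
    }
    return ir;
}
```

With the input `push 1`, `nop`, `pop`, `ret`, running `remove_nops` before `cancel_push_pop` leaves only `ret`, while the opposite order leaves `push 1`, `pop`, `ret`, because the `nop` hides the pair from the peephole rule. That difference is the reason a user may want control over pass ordering.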
Commits on Jan 16, 2022
Renamed core words to cp, mv & rm and added stdlib.klx.
Renamed the core words to avoid naming conflicts and to describe their intent better. Added a very basic stdlib file which gives access to some common arithmetic and stack manipulation words.
Updated the syntax highlighting file to match the renamed words and added a new highlighting file for the klaxon IR format (KIR).
Renamed the compile.sh script to klx. It now runs the m4 preprocessor on the source file before passing it to klaxon, which enables the use of include and macros.
Fixed an issue when parsing type annotations. Previously, annotations which had no out values would cause the compiler to report that it expected an identifier, even though returning no values is valid. Added a dedicated locale string for type annotation errors.
Removed an extraneous space printed by the KIR serialiser.
Commit d3ce1bc
Commits on Jan 17, 2022
Commit 8b231c9
Commit e1ee9f3
Print formatting previously did not handle escaped sequences of closing braces correctly. For example, `printlnfmt("{}}}", "foo");` _should_ have produced `foo}` but instead produced `foo`.
Commit 1c6f083
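The expected behaviour from the commit above can be reproduced with a small stand-alone formatter. This is only a sketch under assumptions: `format_braces` is a hypothetical helper, not the project's `printlnfmt`, and treating `{{` as an escaped opening brace is a guess by symmetry with `}}`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: expands "{}" with the next argument and treats
// "{{" and "}}" as escaped literal braces.
inline std::string format_braces(std::string_view fmt,
                                 const std::vector<std::string>& args) {
    std::string out;
    std::size_t next_arg = 0;
    for (std::size_t i = 0; i < fmt.size(); ++i) {
        const char c = fmt[i];
        const bool has_next = i + 1 < fmt.size();
        if (c == '{' && has_next && fmt[i + 1] == '}') {
            // Placeholder: substitute the next argument if one remains.
            if (next_arg < args.size()) out += args[next_arg++];
            ++i;  // skip the '}' that closes the placeholder
        } else if ((c == '{' || c == '}') && has_next && fmt[i + 1] == c) {
            out += c;  // escaped brace: emit one and skip its twin
            ++i;
        } else {
            out += c;  // ordinary character
        }
    }
    return out;
}
```

The placeholder check runs before the escape check, so in `{}}}` the first two characters are consumed as a placeholder and the remaining `}}` collapses to a single `}`.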
Commits on Jan 18, 2022
Commit 5f6fc82
Remove arg & out instructions and fixed loop code gen
Removed the arg and out instructions in favour of keeping only the cp, mv and rm instructions for the backend to work with. Fixed incorrect code generation for loops.
Commit 856464d
Add indirect branch elimination pass
Added a new optimisation pass to collapse indirect jumps, where a jump targets a block which then unconditionally jumps to another block. This is very useful for heavily nested if-else chains.
Constant folding now gives up when it reaches a function call. The problem with trying to fold beyond the bounds of a call is that the call itself may produce values that are only known at run time, in which case there is nothing for the constant folder to reduce at compile time.
Removed any notion of arg and out instructions.
Commit 0e3a088
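The indirect branch elimination idea from the commit above can be sketched as follows. The `Block` model here is hypothetical and much simpler than a real IR: it only tracks how many non-jump instructions a block holds and the target of a trailing unconditional jump, and it ignores conditional branches entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <unordered_map>

// Hypothetical block model: `body_size` counts non-jump instructions and
// `jump` is the target of a trailing unconditional jump, if there is one.
struct Block {
    std::size_t body_size = 0;
    std::optional<int> jump;
};

using Blocks = std::unordered_map<int, Block>;

// Follow chains of blocks that do nothing but jump, up to the first block
// that does real work. The hop limit guards against cycles of empty blocks.
inline int resolve_target(const Blocks& blocks, int target) {
    for (std::size_t hops = 0; hops < blocks.size(); ++hops) {
        const auto it = blocks.find(target);
        if (it == blocks.end()) break;
        const Block& b = it->second;
        if (b.body_size != 0 || !b.jump) break;  // not a trampoline: stop
        target = *b.jump;
    }
    return target;
}

// Rewrite every unconditional jump so that it skips trampoline blocks.
inline void eliminate_indirect_branches(Blocks& blocks) {
    const Blocks snapshot = blocks;  // resolve against the unmodified graph
    for (auto& entry : blocks) {
        Block& block = entry.second;
        if (block.jump) block.jump = resolve_target(snapshot, *block.jump);
    }
}
```

After this rewrite the trampoline blocks are no longer jumped to, so a later dead code pass could drop them. Nested if-else chains tend to produce exactly this shape, with each inner join block jumping straight to the next outer one.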
Commit c9acdb6
Commit 5d1540b
Commit 41980ee
Commit ee4ba78
Commit 878c39e
Commits on Jan 19, 2022
Commit d6aca50
Commits on Jan 22, 2022
Commit e088654
Commit eab1444
Commit 9f28215
Commits on Feb 1, 2022
Cleanup and simplification of lib.hpp & stack effect annotations in the IR
Unified Tokens, Ops and IR Tokens into a single enum class so we no longer need to do pesky mappings between them. We can just use the same enum value right through from the lexer to IR generation.
Merged the lexer implementations for the IR and source representations into the same class. A templated flag now picks the implementation we want, which removes a lot of code duplication.
Added some constructor overloads for Op so that we can construct instructions that need both a string view and an integer field.
Blocks, calls and definitions now have a stack effect annotation in the IR to simplify consumption by a backend. Added instruction_block and instruction_end functions to simplify annotating blocks with their stack effect during code generation.
Block numbers are now function-local and start from zero instead of being globally numbered as before.
Use more consistent naming for library functions and types. Renamed EOF to TERMINATOR to avoid conflicting with the standard macro of the same name.
Commit 5bf745b
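The combination of one shared enum and a templated flag described in the commit above might look roughly like this. Everything in the sketch is an assumption for illustration: the `Kind` enum, the `classify` function, and in particular the idea that `block` is a keyword only in the IR flavour.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical unified kind, shared by the source lexer, the IR lexer and
// IR generation so that no mapping between separate enums is needed.
enum class Kind { Identifier, Integer, Block, Terminator };

// `IsIR` selects the IR or the source flavour at compile time, so both
// lexers share one implementation and differ only in the guarded branch.
template <bool IsIR>
constexpr Kind classify(std::string_view word) {
    if (word.empty()) return Kind::Terminator;
    if constexpr (IsIR) {
        // Assumption for this sketch: only the IR format reserves "block".
        if (word == "block") return Kind::Block;
    }
    for (const char c : word)
        if (c < '0' || c > '9') return Kind::Identifier;
    return Kind::Integer;
}
```

Here `classify<true>("block")` yields `Kind::Block` while `classify<false>("block")` yields `Kind::Identifier`. Because the flag is a template parameter, the branch that does not apply is discarded at compile time rather than checked on every token.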
Commit 58d2857
Major restructuring of optimisation passes
Function inlining and dead code elimination were previously broken, and the code was awkward to work with because it had to preserve consistency within the same buffer. Three issues have been addressed in this commit:
1. Function inlining now renumbers blocks correctly
2. Dead code elimination now only retains functions with "main" as an ancestor
3. Iterators to the IR are now stable due to the use of a double buffer approach
Function inlining was broken previously because blocks were not renumbered after they were inlined. This meant that multiple calls to the same inlined function would result in duplicate blocks, which would break the control flow of the program. The new inliner also doesn't count block/def/end/ret instructions.
Dead code elimination was previously broken because it preserved functions which were called but not by a common ancestor ("main" in this case). All it would take to preserve a function was to call it _anywhere_ in the program, even if the parent function of that call was itself dead.
We now use a double-buffering approach to optimisation passes. The original IR is passed in and is supposed to remain unchanged, while the output IR is mutated and becomes the input buffer for the next pass. This gives us some rather nice properties, like stability of reference, which makes inlining in particular very easy.
Constant folding and indirect branch elimination have yet to be moved over to the new architecture, but this should be fairly easy.
Commit 280df65
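The dead code elimination rule from the commit above (keep a function only if it has "main" as an ancestor) amounts to a reachability walk over the call graph. The sketch below assumes a hypothetical `CallGraph` type that maps each function name to the names it calls; it does not reflect how the compiler actually stores this information.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical call graph: function name -> names of the functions it calls.
using CallGraph = std::unordered_map<std::string, std::vector<std::string>>;

// Collect every function reachable from `root` using an iterative
// depth-first walk. Anything not in the result is dead.
inline std::unordered_set<std::string> live_functions(
    const CallGraph& calls, const std::string& root = "main") {
    std::unordered_set<std::string> live;
    std::vector<std::string> worklist{root};
    while (!worklist.empty()) {
        std::string fn = std::move(worklist.back());
        worklist.pop_back();
        if (!live.insert(fn).second) continue;  // already visited
        const auto it = calls.find(fn);
        if (it == calls.end()) continue;        // no recorded callees
        for (const auto& callee : it->second) worklist.push_back(callee);
    }
    return live;
}
```

A function that is called only from another dead function never enters the worklist, so it is dropped along with its caller. The visited check also keeps recursive and mutually recursive functions from looping forever.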
Update control flow graph generator to use relative blocks
Updated the CFG generator to work with relative block numbers by concatenating the function name to the block ID. Also added weights to the nodes to try and make the generated graphs look a bit nicer.
Commit d233401
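Since block numbers restart at zero in every function, graph nodes need the function name to stay unique. A minimal sketch of that naming scheme follows. The `_` separator, the helper names and the use of Graphviz DOT as the output format are all assumptions; the commit does not say which format the CFG generator emits.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical naming scheme: "<function>_<block>". The separator is a guess.
inline std::string node_name(std::string_view function, int block) {
    std::string name(function);
    name += '_';
    name += std::to_string(block);
    return name;
}

// Emit one Graphviz DOT edge between two blocks of the same function.
inline std::string dot_edge(std::string_view function, int from, int to) {
    return "  \"" + node_name(function, from) + "\" -> \"" +
           node_name(function, to) + "\";\n";
}
```

With this scheme block 0 of `main` and block 0 of another function become distinct nodes, which is what lets the graph for a whole program be drawn without edges crossing between unrelated functions.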
Simplified lexing greatly and avoid hard-coding strings
The lexer now uses a simpler architecture which allows us to specify token strings in a single place and have them work everywhere. This is in contrast to the previous lexer, which required updating multiple unrelated pieces of the code in order to change tokens.
Switched from the shorthand names `cp`, `mv` and `rm` to `copy`, `move` and `remove`.
Fixed an issue where the IR parser would only accept non-keyword identifiers for user-defined functions. This meant that a function named "block", for example, would result in a parsing error.
"copy", "move" and "remove" are now considered proper keywords and as such cannot be used as identifiers.
Removed any hard-coding of strings in calls to error functions; the appropriate string representation of a token is now looked up instead.
Commit 48688f3
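One way to keep token strings in a single place, as the commit above describes, is a table that both the lexer and the error reporting consult. The sketch below is an assumption about the general shape only: `Token`, `TokenEntry`, `token_table`, `token_text` and `keyword` are hypothetical names, and the real token set is larger.

```cpp
#include <array>
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical subset of the token set.
enum class Token { Copy, Move, Remove, Block };

struct TokenEntry {
    Token token;
    std::string_view text;
};

// Single source of truth: changing a spelling here changes it everywhere.
inline constexpr std::array<TokenEntry, 4> token_table{{
    {Token::Copy, "copy"},
    {Token::Move, "move"},
    {Token::Remove, "remove"},
    {Token::Block, "block"},
}};

// Used by error messages: look the spelling up instead of hard-coding it.
constexpr std::string_view token_text(Token t) {
    for (const auto& e : token_table)
        if (e.token == t) return e.text;
    return "<unknown>";
}

// Used by the lexer: map a word to its keyword token, if it is one.
constexpr std::optional<Token> keyword(std::string_view word) {
    for (const auto& e : token_table)
        if (e.text == word) return e.token;
    return std::nullopt;
}
```

Renaming `cp` to `copy` under this design is a one-line change to the table. A linear scan is fine for a handful of keywords; a larger token set might justify a sorted table or a hash map instead.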
Commit 37a28dc
Commit 2684e78
Commits on Feb 2, 2022
Commit bd65f1a