Include as an MLang expression #703

Draft
wants to merge 1 commit into
base: develop


johnwikman
Contributor

This is the bit of code that was lifted out from PR #694 (MLang AST in MCore). The idea of having an include statement ties in with how include handling will be done in the bootstrapping stage.

Instead of include being a copy-paste of code (à la C), the included file would be parsed and symbolized separately, and its symbolized identifiers would be added to the scope where the include occurs. This would slightly change the include semantics, such that the following program with two includes, which is currently valid, would become invalid:

-- testA.mc
let strA = "I am string A..."
mexpr ()

-- testB.mc
let strB = concat strA " and I am string B"
mexpr ()

-- test.mc
include "testA.mc"
include "testB.mc"
mexpr
print strB; print "\n"

This currently works with the boot parser since it simply concatenates the includes, whereas in the bootstrapped parser the includes would be parsed independently, and testB.mc would fail with an error that strA is an unknown variable.
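The difference between the two include semantics can be sketched with a toy scope checker. This is a Python model, not Miking code: the `FILES` table, the `BUILTINS` set, and both checker functions are hypothetical stand-ins that only track which names each top-level let binds and uses.

```python
# Toy model (hypothetical): each file is a list of (bound_name, names_used).
FILES = {
    "testA.mc": [("strA", [])],
    "testB.mc": [("strB", ["concat", "strA"])],
}
BUILTINS = frozenset({"concat", "print"})


def first_unknown(uses, scope):
    """Return the first used name that is not in scope, or None."""
    return next((u for u in uses if u not in scope), None)


def check_concatenated(includes, body_uses):
    """Copy-paste style: all includes share one growing scope."""
    scope = set(BUILTINS)
    for path in includes:
        for name, uses in FILES[path]:
            bad = first_unknown(uses, scope)
            if bad is not None:
                return f"{path}: unknown variable {bad}"
            scope.add(name)
    bad = first_unknown(body_uses, scope)
    return "ok" if bad is None else f"unknown variable {bad}"


def check_separate(includes, body_uses):
    """Each file is checked in a fresh scope; only its results are exported."""
    scope = set(BUILTINS)
    for path in includes:
        local = set(BUILTINS)  # nothing from earlier includes is visible here
        for name, uses in FILES[path]:
            bad = first_unknown(uses, local)
            if bad is not None:
                return f"{path}: unknown variable {bad}"
            local.add(name)
        scope |= local  # exported names join the including scope
    bad = first_unknown(body_uses, scope)
    return "ok" if bad is None else f"unknown variable {bad}"


includes = ["testA.mc", "testB.mc"]
print(check_concatenated(includes, ["print", "strB"]))
print(check_separate(includes, ["print", "strB"]))
```

Under the shared scope, strA is already bound when testB.mc is reached; under the per-file scope it is not, which is the failure described above.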

The reason for having include as an expression is to ensure, in generated code, that unsymbolized identifiers refer to the intended functions/types/constructors in some library. E.g. if I want to access the result identifier from result.mc, I could simply do include "result.mc" at the start of the generated code and not have to worry about where my generated expression is being placed.

The current use case for this kind of feature is the code generated for the LR(k) parser (see lrGenerateParser in parser/lrk.mc), which contains unsymbolized references to ResultErr, ResultOk, int2string, mergeInfo, join, etc. all over the place. Nothing currently guarantees that these will be symbolized to the intended definitions; the generated expressions just assume that nothing else will bind to these identifiers.
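The capture problem can be illustrated with a small Python model of symbolization. Everything here is an assumption made for illustration: the `LIBRARY` table (including the file name `seq.mc` and which names each file exports), the string format of a "symbol", and the helper names.

```python
# Hypothetical library contents; not taken from the actual Miking stdlib.
LIBRARY = {
    "result.mc": ["ResultOk", "ResultErr"],
    "seq.mc": ["join"],
}


def symbolize(names, env):
    """Resolve each unsymbolized name against env (KeyError if unbound)."""
    return {n: env[n] for n in names}


def with_include(env, path):
    """Model an include expression: library names shadow the outer scope."""
    inner = dict(env)
    for name in LIBRARY[path]:
        inner[name] = f"{path}#{name}"
    return inner


# The user happens to define their own `join` where the code is placed.
site_env = {"join": "user.mc#join"}
generated = ["join"]

print(symbolize(generated, site_env))
print(symbolize(generated, with_include(site_env, "seq.mc")))
```

Without the include, the generated reference to join is captured by whatever is in scope at the placement site. With the include wrapped around the generated expression, it resolves to the library definition regardless of placement.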

@elegios
Contributor

elegios commented Mar 13, 2023

I think there are a couple of requirements and capabilities we should probably separate here:

  • Namespace-handling, what unsymbolized Names should resolve to "by default". This would probably be pretty similar to OCaml's local opens, given the design of use right now.
  • Ensuring that the definitions from another file are loaded/parsed and available.

Both parsed code and generated code (things like parser generators or the utest mechanism, as opposed to code generated in other languages for our backends, e.g., ocaml) need both of these, but it's not certain that they should work by the same mechanism.

For example, should we try to generate code that is already symbolized and/or typechecked? This is potentially error-prone if done manually, but on the other hand we often have a significant amount of information when we write the generating code, so it seems like a waste performance-wise to have to run those passes after we generate such code. We could also mitigate the risk by having optional passes that test correctness of already symbolized/typechecked code (that things are properly in scope and/or types agree).
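As a rough idea of what such an optional correctness pass could look like, here is a Python sketch over a made-up three-node AST. The node types, the use of integers as symbols, and the non-recursive let rule are all assumptions of the sketch, not the MExpr AST.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    sym: int  # already-symbolized reference


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


@dataclass(frozen=True)
class Let:
    sym: int
    body: object    # bound expression; sym is NOT in scope here
    inexpr: object  # continuation; sym IS in scope here


def unbound_symbols(expr, bound):
    """Return every symbol referenced outside the scope that binds it."""
    if isinstance(expr, Var):
        return [] if expr.sym in bound else [expr.sym]
    if isinstance(expr, App):
        return unbound_symbols(expr.fn, bound) + unbound_symbols(expr.arg, bound)
    if isinstance(expr, Let):
        return (unbound_symbols(expr.body, bound)
                + unbound_symbols(expr.inexpr, bound | {expr.sym}))
    raise TypeError(f"unexpected node: {expr!r}")


globals_in_scope = frozenset({0})
ok = Let(1, Var(0), App(Var(1), Var(0)))
bad = Let(1, Var(1), Var(2))
print(unbound_symbols(ok, globals_in_scope))
print(unbound_symbols(bad, globals_in_scope))
```

A pass like this only walks the tree once and does no name lookup, so it could be cheap enough to run in debug builds while being skipped otherwise.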

To make the implementation of such things easier we can probably use some things we'll have to make anyway as part of bootstrapping. I can imagine that we need to pass around some data structure that keeps track of the files parsed and their definitions. This could be used when we generate code to look up the symbols directly, i.e., generated code would need neither include nor use.
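One possible shape for such a data structure, sketched in Python: the `FileRegistry` class, its method names, and the representation of a symbol as a `(name, id)` pair are hypothetical, chosen only to show how a code generator could fetch an already-symbolized name instead of emitting a bare string.

```python
import itertools


class FileRegistry:
    """Tracks which files have been loaded and the symbols they define."""

    def __init__(self):
        self._fresh = itertools.count()  # source of unique symbol ids
        self._defs = {}                  # path -> {name: (name, id)}

    def register(self, path, names):
        """Record a file's definitions once; later calls reuse the entry."""
        if path not in self._defs:
            self._defs[path] = {n: (n, next(self._fresh)) for n in names}
        return self._defs[path]

    def lookup(self, path, name):
        """Return the unique symbol for `name` as defined in `path`."""
        return self._defs[path][name]


reg = FileRegistry()
reg.register("result.mc", ["ResultOk", "ResultErr"])
reg.register("user.mc", ["ResultOk"])  # an unrelated definition, same name

# A generator embeds the symbol itself, so no later resolution is needed.
lib_sym = reg.lookup("result.mc", "ResultOk")
user_sym = reg.lookup("user.mc", "ResultOk")
print(lib_sym, user_sym, lib_sym != user_sym)
```

Because the two ResultOk entries carry different ids, the generated code cannot be captured by the user's definition, whichever scope it ends up in.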


2 participants