Pre-RFC: Making macros and consteval dynamic, composable, and recursive #3785
Comments
Dynamic procedural macros
Because of how the macro system is currently set up, procedural macros are implemented as separate crates. This forces users into a static set of procedural macro dependencies and prevents procedural macros from being composed dynamically. Instead, users have to take some static input and use it to generate a crate that contains the dynamic output. Dynamic input is not currently possible. For example, the implementation of the […]
Of course, the token stream input provided to eval-macro must be passed in at compile time, and the macro implementation cannot evaluate the provided token stream directly during evaluation. Instead, it has to output a new token stream containing the resulting code to statically generate the output code. For example, it cannot run the statically generated code and then perform additional behavior with the output. It cannot perform any recursive behavior between consteval functions and macros.
Right now, all input to a macro function is implicitly quoted, so it cannot be evaluated in the macro implementation at all. Instead of getting the value of the argument the consteval function passes to the macro, you get the symbol that refers to the variable in the context of the macro invocation. You are not given the context of the macro invocation, and the macro implementation runs in its own scope. This means you cannot look up the value of a symbol passed to a macro in the scope of the macro invocation, even if it is a constant known at compile time. You can emit that symbol in the macro's output, but by the time that output gets evaluated, the macro invocation will have long ended.
Ideally, we could invoke a macro dynamically from a consteval function. Then, the macro could actually call those consteval functions, and use the output of those calls to generate another macro. This final macro is what users would call.
However, the first step of this process is not possible today, because a consteval function cannot call a macro that generates additional consteval functions.
Recursive Macros
This change would allow us to generate macros within macros! This could allow us to be much more dynamic with the way we generate and execute macros, because we could execute them in stages based on their dependencies. For example, macro A may invoke macro B and macro C directly (as consteval functions, not as macros), then use the results to generate macro D and expose it publicly. It could even be exposed with a dynamically generated name, referencing dynamically generated variables, which improves the hygiene of the procedural macro system.
This will require validating that circular dependencies do not exist between compiler stages. Therefore, there will need to be a topological ordering between stages that lets us track which dependencies belong to which stages. Any argument passed to the consteval function implementing the macro can be annotated with the compiler stage it belongs to. If a reference is created to a stage that has not executed before the current stage, we can show a compiler error.
This would also let users create their own declarative macro DSLs as a procedural macro. In fact, we likely wouldn't need a distinction between declarative and procedural macros anymore, and the declarative macro could exist in the standard library (in the prelude to preserve backwards compatibility). This would let users create a procedural macro that processes its own DSL input, then generates a new procedural macro with a dynamic name and implementation generated from the user input.
Compiler Integration
If we can invoke the entire macro/parse/compile/execute loop, we can actually invoke macro/consteval code within other parts of the compiler implementation.
This could let users dynamically extend the compiler, such as adding new constructs to the parser or providing a new compiler backend implementation in user space, without needing to recompile Rust itself. This could help improve integration with crates like Rust-GPU, especially when the user has their own specific rustc compiler version they need to use, which may not be supported by Rust-GPU's build infrastructure directly.
The Vision
What if we could live in a world where procedural macros composed just like any other consteval function? We could use this to simplify the macro implementation exposed to users, and even extend the kinds of procedural macros we could support. This is exactly what my proposal is aiming to accomplish. We finally have the technology to remove many of the limits on macros and consteval, especially in terms of composability. As a Rust user, the features I appreciate most are the features that compose well - and this is something that macros currently do not do well. Let's fix it!
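To make the earlier point about implicitly quoted macro input concrete, here is a small sketch in today's Rust. It uses a declarative macro rather than a procedural one so that it fits in a single file; the names `WIDTH` and `show_input` are made up for illustration. The macro is handed the tokens of its argument, not the value those tokens would evaluate to, even though the value is a compile-time constant.

```rust
// A compile-time constant visible at the call site.
const WIDTH: usize = 4;

// A declarative macro that reports exactly what it was handed.
// `stringify!` turns the received tokens into a string without evaluating them.
macro_rules! show_input {
    ($e:expr) => {
        stringify!($e)
    };
}

fn main() {
    // The macro receives the tokens `WIDTH + 1`, not the value 5.
    let seen: &str = show_input!(WIDTH + 1);
    println!("macro saw: {}", seen);

    // The value only exists once the expanded code is evaluated, which
    // happens after the macro invocation has already finished.
    let value = WIDTH + 1;
    println!("evaluated: {}", value);
}
```

A procedural macro is in the same position: it receives a `TokenStream` containing the identifier `WIDTH` and has no way to ask the compiler what that identifier resolves to.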
High Level Implementation
In my opinion, the place to start is the AST. The core of the algorithm divides the AST into different subtrees based on the compilation stage, before any macro processing occurs. The compilation stage would be stored on the item, such as a constant variable or consteval function. The compilation stages are executed in increasing order from zero. The runtime stage is not executed, and is instead compiled to HLIR to be included in the target binary.
To determine which stage a specific node in the AST belongs to, we need to find the maximum of the compilation stages of all of the items that any node in this subtree references, then add one. To find the top level subtrees to evaluate, collect all of the const items in the AST, such as consteval functions, macros, const expressions, static variables, etc. All the references in that construct that refer to non-local const items should be considered. For example, a consteval function that only refers to the result of a single const expression should use one plus the compilation stage of the variable that stores that result. This should be equivalent to the compilation stage of the const expression inside the consteval function. The consteval function itself may have a different runtime stage based on what is needed to expand all macros/call all relevant consteval functions. I use the terms "static" and "dynamic" very loosely here, but I will find a better way to express what I mean by that soon.
Non-const items always have a compilation stage of runtime. For a subtree which is fully self-contained, and does not reference any lower compilation stages, the compilation stage should be 0. For subtrees which reference values that are only available at runtime, the compilation stage should be […]
Any reference to an item with a higher compilation stage than the stage of the node is invalid.
This is to prevent circular dependencies; intuitively, the goal is to validate that two AST expressions do not refer to each other in a way which cannot be executed by the compiler. I don't think this is possible today anyway, due to existing validations, but it will matter for the kinds of use cases I hope to enable.
Once we have the AST split by compilation stages, we just need to iterate through each chunk in topological order, compiling and executing each chunk as we come to it. This ensures that a chunk in compilation stage 1 is executed after a chunk that it references in compilation stage 0.
I hope this was enough to give a high level overview of the algorithm. Most of the changes will be in the AST, so we may not need to touch HLIR for this at all. Once the AST is split into chunks, we should be able to follow the rest of the process very similarly to what happens today. This would make it easy to compile and execute MIR JIT style, instead of interpreting it, which is an internal-only change that might be an initial goal to prove out the compiler. I'm not sure if people are actually interested in improving the runtime performance of consteval code though.
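The stage assignment described above can be sketched outside the compiler as a small graph computation. This is only an illustration under stated assumptions: `Item`, `Mark`, `stage_of`, and `schedule` are hypothetical names, items are reduced to a name plus the const items they reference, and the runtime stage is left out entirely. A self-contained item gets stage 0, any other item gets one more than the highest stage it references, and a reference cycle is reported as an error.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an AST item: a name plus the const items it references.
struct Item {
    name: &'static str,
    refs: Vec<&'static str>,
}

/// Visit state used to detect reference cycles during the walk.
enum Mark {
    InProgress,
    Done(u32),
}

/// Stage rule: 0 for a self-contained item, otherwise one more than the
/// highest stage among the items it references. Errors on a cycle.
fn stage_of(
    name: &'static str,
    items: &HashMap<&'static str, &Item>,
    marks: &mut HashMap<&'static str, Mark>,
) -> Result<u32, String> {
    match marks.get(name) {
        Some(Mark::Done(stage)) => return Ok(*stage),
        Some(Mark::InProgress) => {
            return Err(format!("circular reference through `{}`", name))
        }
        None => {}
    }
    marks.insert(name, Mark::InProgress);
    let item = items
        .get(name)
        .ok_or_else(|| format!("unknown item `{}`", name))?;
    let mut stage: u32 = 0;
    for &dep in &item.refs {
        stage = stage.max(stage_of(dep, items, marks)? + 1);
    }
    marks.insert(name, Mark::Done(stage));
    Ok(stage)
}

/// Computes every item's stage and returns the items ordered by increasing
/// stage, which is the order the chunks would be compiled and executed in.
fn schedule(items: &[Item]) -> Result<Vec<(&'static str, u32)>, String> {
    let by_name: HashMap<&'static str, &Item> =
        items.iter().map(|item| (item.name, item)).collect();
    let mut marks = HashMap::new();
    let mut order = Vec::new();
    for item in items {
        order.push((item.name, stage_of(item.name, &by_name, &mut marks)?));
    }
    order.sort_by_key(|&(_, stage)| stage);
    Ok(order)
}

fn main() {
    // Made-up items: a macro generator that calls a helper that reads a table.
    let items = vec![
        Item { name: "GEN_MACRO", refs: vec!["helper"] },
        Item { name: "helper", refs: vec!["TABLE"] },
        Item { name: "TABLE", refs: vec![] },
    ];
    match schedule(&items) {
        Ok(order) => {
            for (name, stage) in order {
                println!("stage {}: {}", stage, name);
            }
        }
        Err(message) => eprintln!("error: {}", message),
    }
}
```

Because stages are derived from the references themselves, an item can never end up referencing a higher stage than its own; the only way the rule can fail is a cycle, which is what the `InProgress` mark catches.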
Rust has two important features for compile-time evaluation: macros and consteval functions. The way these two features interact is not bidirectionally composable, which leads to some limitations in the evaluation of macros and consteval functions.
Macro compilation happens in a single pass; that is, all macros in the source code are processed recursively until they are all expanded. Consteval functions run after macro expansion because consteval functions need to be converted to MIR in order to be interpreted. In other words, we currently don't macro expand one part of the AST separately from another part of the same AST. This means you can call macros from consteval functions, but not the other way around. Note that calling a consteval function from a macro is different from putting the consteval function into the macro output, because the output gets run at a different time. It's also different from calling consteval functions from a procedural macro as it exists today, because we would instead be calling the consteval function directly and dynamically at macro evaluation time, like a non-consteval function.
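The direction that already works can be shown in a few lines. This is a minimal sketch using `const fn`, which is what stable Rust offers for the consteval functions discussed here; the names `square`, `area`, and `AREA` are made up for illustration.

```rust
// A declarative macro; it is expanded before any constant evaluation happens.
macro_rules! square {
    ($x:expr) => {
        $x * $x
    };
}

// A `const fn` whose body contains a macro call. By the time the constant
// evaluator sees this function, the body has already become `side * side`.
const fn area(side: u32) -> u32 {
    square!(side)
}

// Evaluated at compile time, after macro expansion.
const AREA: u32 = area(4);

fn main() {
    println!("{}", AREA);
}
```

The reverse direction would need the body of `square!` to call `area` while the macro itself is being expanded, which the single expansion pass described above does not allow.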
I want to make macros and consteval functions more powerful by eliminating this restriction, so that consteval functions can be called directly from the implementation of a macro. This would look something like a simplified version of MacoCaml's composable/compilable macros. As we currently interpret all consteval code as MIR, this would require changes at the HLIR level and above. In the context of this issue, "compilable macros" means macros which are compilable to MIR. Logically, it shouldn't matter whether the MIR is interpreted, compiled, or some mix of the two.
This will require syntax changes at some point to take advantage of the feature, but for now, I wanted to implement something simple and internal to the compiler to test if the core idea works in the Rust compiler.
The kinds of features this would unlock could be very useful. Here are some ideas:
I have many ideas built on top of this feature, mainly targeted at generating types and functions in consteval functions and macros at compile time. I see this as the foundation to making significantly more powerful macros and consteval functions.
Please leave any early feedback you have about the idea. Thank you!