-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify closure conversion using ref fields #66680
Comments
I'd add that we need to consider support for EnC/Hot Reload if we are gonna change the closure allocation strategy. One of the most frequent rude edits that get reported during EnC/HR, based on both telemetry and direct customer feedback, is editing code that has lambdas. Specifically, capturing/uncapturing a local variable is a big blocker. I have been thinking about ways to de-optimize the code in DEBUG builds to allow more edits involving lambdas and lifted variables. If we change the strategy and optimize closures to stack-allocated structs we should also consider doing so only in RELEASE builds and deoptimize these to heap-allocated objects in DEBUG builds. The important difference is that the runtime supports adding a field to a class whereas it does not support that for structs. |
Only doing this in release makes a lot of sense. A simple solution would just be to pretend that all local functions are converted to delegates in debug mode. EnC is probably just as bad with the current local function lowering strategy. |
Yes, it is. Just pointing out we need to make it better, not worse :) |
I think this would make it better as the Debug and Release codegen would actually be closer in design, so the likelihood of bugs in one but not the other would be reduced. "Always use classes" is already the codegen for delegate conversion, so there's no more improvement to make in that area. |
How? Part of the requirements of (2) is that we be able to take a delegate to the members. Can't do that with I really think this issue could use code examples to demonstrate what you're trying to accomplish. |
Sorry, imprecise language. We would only use the structure of (2). We would swap between class and ref struct environment based on whether there are any delegate conversions, same as today. |
What is the benefit of this change? One of the negatives is that it will mkae stack frame sizes bigger. |
benefit:
I don't think it will make stack frames bigger. Here's an example of some input code: public void M() {
int i = 0;
{
int j = 0;
Local2();
void Local2()
{
i++;
j++;
}
}
Local1();
void Local1()
{
i++;
}
} And here's the generated code: public class C
{
[StructLayout(LayoutKind.Auto)]
[CompilerGenerated]
private struct <>c__DisplayClass0_0
{
public int i;
}
[StructLayout(LayoutKind.Auto)]
[CompilerGenerated]
private struct <>c__DisplayClass0_1
{
public int j;
}
public void M()
{
<>c__DisplayClass0_0 <>c__DisplayClass0_ = default(<>c__DisplayClass0_0);
<>c__DisplayClass0_.i = 0;
<>c__DisplayClass0_1 <>c__DisplayClass0_2 = default(<>c__DisplayClass0_1);
<>c__DisplayClass0_2.j = 0;
<M>g__Local2|0_1(ref <>c__DisplayClass0_, ref <>c__DisplayClass0_2);
<M>g__Local1|0_0(ref <>c__DisplayClass0_);
}
[CompilerGenerated]
private static void <M>g__Local2|0_1(ref <>c__DisplayClass0_0 P_0, ref <>c__DisplayClass0_1 P_1)
{
P_0.i++;
P_1.j++;
}
[CompilerGenerated]
private static void <M>g__Local1|0_0(ref <>c__DisplayClass0_0 P_0)
{
P_0.i++;
}
} As you can see, we have two struct environments, one for Now look at the generated code if we convert these local functions to lambdas: public class C
{
[CompilerGenerated]
private sealed class <>c__DisplayClass0_0
{
public int i;
internal void <M>b__0()
{
i++;
}
}
[CompilerGenerated]
private sealed class <>c__DisplayClass0_1
{
public int j;
public <>c__DisplayClass0_0 CS$<>8__locals1;
internal void <M>b__1()
{
int i = CS$<>8__locals1.i;
CS$<>8__locals1.i = i + 1;
j++;
}
}
public void M()
{
<>c__DisplayClass0_0 <>c__DisplayClass0_ = new <>c__DisplayClass0_0();
<>c__DisplayClass0_.i = 0;
<>c__DisplayClass0_1 <>c__DisplayClass0_2 = new <>c__DisplayClass0_1();
<>c__DisplayClass0_2.CS$<>8__locals1 = <>c__DisplayClass0_;
<>c__DisplayClass0_2.j = 0;
new Action(<>c__DisplayClass0_2.<M>b__1)();
new Action(<>c__DisplayClass0_.<M>b__0)();
}
} Completely different code. These two codegen passes are completely separate and have their own logic. If we could make the codegen for each of these cases more similar, the closure conversion code in the compiler could be a lot simpler. My proposal is that the code for the first example with local functions would look like this: public class C
{
[CompilerGenerated]
private struct <>c__DisplayClass0_0
{
public int i;
internal void <M>b__0()
{
i++;
}
}
[CompilerGenerated]
private ref struct <>c__DisplayClass0_1
{
public int j;
public ref <>c__DisplayClass0_0 CS$<>8__locals1;
internal void <M>b__1()
{
int i = CS$<>8__locals1.i;
CS$<>8__locals1.i = i + 1;
j++;
}
}
public void M()
{
<>c__DisplayClass0_0 <>c__DisplayClass0_ = new <>c__DisplayClass0_0();
<>c__DisplayClass0_.i = 0;
<>c__DisplayClass0_1 <>c__DisplayClass0_2 = new <>c__DisplayClass0_1();
<>c__DisplayClass0_2.CS$<>8__locals1 = ref <>c__DisplayClass0_;
<>c__DisplayClass0_2.j = 0;
<>c__DisplayClass0_2.<M>b__1();
<>c__DisplayClass0_.<M>b__0();
}
} Notice how the new code looks almost exactly like the lambda code. This would mean we could delete all the special code around ref argument passing that we had before. |
Okay so you only want the |
Right. |
Another issue I realized: this only works when we're targeting a .NET runtime 7.0 or greater as they're the only ones that support |
For background, closure conversion currently has at least three different emit strategies and locations:
Each of these formats has different code and logic around it, which introduces complexity and compilation time. It would be good if we could reduce complexity as much as possible without harming run time performance.
Ref fields provide an option to consolidate strategies (2) and (3) by making (2) more general.
I'll go through the emit strategies in detail and explain how things can be simplified. The most important consideration for lowering is whether or not the function is converted to a delegate. If the function is converted to a delegate, the closed-over variables may escape the original function, so they need to be placed in a heap-allocated environment. For functions not converted to delegates (local functions), the variables will not escape, so they can be allocated in structs which are declared in the parent function.
The emit strategies reflect the above constraints. We can roughly ignore (1) -- the shared closure environment is only for functions with no captured variables and is special because there are some performance improvements by having a single, fixed closure environment.
When the function is converted to a delegate, we need to use strategy (2). For generating the environment, we take all captured variables and move them a class, move the nested function definition into the class as a method, and rewrite all the captured variable references to field references. Then when the parent function first enters the scope of hoisted variables, we initialize the class and replace the previous variables with a reference to the closure environment. When multiple nested functions capture the same variable, we process from the top down, and subsequent nested functions will capture the closure environment reference instead of the original variables. This basically means that closures which capture variables from multiple environments have a linked list of environments from the most nested environment to the least nested.
When the function is never converted to a delegate we can use strategy (3). This strategy avoids classes, avoiding allocations and hopefully improving run time performance. Instead of lowering the nested function inside a closure environment, it's lowered directly into the parent type. Then, instead of a class environment for captured variables, a struct environment is used. Since the function isn't inside the environment, the environment is passed as a ref parameter to the function. If the function uses variables from multiple environments, those environments are all passed as ref parameters to the function.
With ref fields, I think we can get rid of (3) and only use (2). Without ref fields the problem is that nested functions which capture variables from multiple environments would not be able to store references to those environments inside another environment. Instead, they have to take ref parameters. With ref fields, all closure environments can be structs, all nested functions can be lowered into an environment, and references to other environments can be stored as additional ref fields inside the environment.
The text was updated successfully, but these errors were encountered: