Simplify closure conversion using ref fields #66680

agocke · 2023-02-02T19:31:46Z

For background, closure conversion currently has at least three different emit strategies and locations:

A single shared closure environment
A closure environment class
A function into the containing type, along with a separate struct environment.

Each of these formats has different code and logic around it, which introduces complexity and compilation time. It would be good if we could reduce complexity as much as possible without harming run time performance.

Ref fields provide an option to consolidate strategies (2) and (3) by making (2) more general.

I'll go through the emit strategies in detail and explain how things can be simplified. The most important consideration for lowering is whether or not the function is converted to a delegate. If the function is converted to a delegate, the closed-over variables may escape the original function, so they need to be placed in a heap-allocated environment. For functions not converted to delegates (local functions), the variables will not escape, so they can be allocated in structs which are declared in the parent function.

The emit strategies reflect the above constraints. We can roughly ignore (1) -- the shared closure environment is only for functions with no captured variables and is special because there are some performance improvements by having a single, fixed closure environment.

When the function is converted to a delegate, we need to use strategy (2). For generating the environment, we take all captured variables and move them a class, move the nested function definition into the class as a method, and rewrite all the captured variable references to field references. Then when the parent function first enters the scope of hoisted variables, we initialize the class and replace the previous variables with a reference to the closure environment. When multiple nested functions capture the same variable, we process from the top down, and subsequent nested functions will capture the closure environment reference instead of the original variables. This basically means that closures which capture variables from multiple environments have a linked list of environments from the most nested environment to the least nested.

When the function is never converted to a delegate we can use strategy (3). This strategy avoids classes, avoiding allocations and hopefully improving run time performance. Instead of lowering the nested function inside a closure environment, it's lowered directly into the parent type. Then, instead of a class environment for captured variables, a struct environment is used. Since the function isn't inside the environment, the environment is passed as a ref parameter to the function. If the function uses variables from multiple environments, those environments are all passed as ref parameters to the function.

With ref fields, I think we can get rid of (3) and only use (2). Without ref fields the problem is that nested functions which capture variables from multiple environments would not be able to store references to those environments inside another environment. Instead, they have to take ref parameters. With ref fields, all closure environments can be structs, all nested functions can be lowered into an environment, and references to other environments can be stored as additional ref fields inside the environment.

tmat · 2023-02-02T20:16:01Z

I'd add that we need to consider support for EnC/Hot Reload if we are gonna change the closure allocation strategy.

One of the most frequent rude edits that get reported during EnC/HR, based on both telemetry and direct customer feedback, is editing code that has lambdas. Specifically, capturing/uncapturing a local variable is a big blocker.

I have been thinking about ways to de-optimize the code in DEBUG builds to allow more edits involving lambdas and lifted variables. If we change the strategy and optimize closures to stack-allocated structs we should also consider doing so only in RELEASE builds and deoptimize these to heap-allocated objects in DEBUG builds. The important difference is that the runtime supports adding a field to a class whereas it does not support that for structs.

agocke · 2023-02-02T20:39:58Z

Only doing this in release makes a lot of sense. A simple solution would just be to pretend that all local functions are converted to delegates in debug mode.

EnC is probably just as bad with the current local function lowering strategy.

tmat · 2023-02-02T20:49:59Z

EnC is probably just as bad with the current local function lowering strategy.

Yes, it is. Just pointing out we need to make it better, not worse :)

agocke · 2023-02-02T21:06:54Z

I think this would make it better as the Debug and Release codegen would actually be closer in design, so the likelihood of bugs in one but not the other would be reduced. "Always use classes" is already the codegen for delegate conversion, so there's no more improvement to make in that area.

jaredpar · 2023-02-14T14:10:44Z

With ref fields, I think we can get rid of (3) and only use (2).

How? Part of the requirements of (2) is that we be able to take a delegate to the members. Can't do that with ref fields cause the containing type must be a ref struct

I really think this issue could use code examples to demonstrate what you're trying to accomplish.

agocke · 2023-02-14T15:59:45Z

Sorry, imprecise language. We would only use the structure of (2). We would swap between class and ref struct environment based on whether there are any delegate conversions, same as today.

jaredpar · 2023-02-14T16:49:07Z

What is the benefit of this change? One of the negatives is that it will mkae stack frame sizes bigger.

agocke · 2023-02-14T19:11:29Z

benefit:

Each of these formats has different code and logic around it, which introduces complexity and compilation time. It would be good if we could reduce complexity as much as possible without harming run time performance.

One of the negatives is that it will mkae stack frame sizes bigger.

I don't think it will make stack frames bigger. Here's an example of some input code:

    public void M() {
        int i = 0;
        {
            int j = 0;
            Local2();
            
            void Local2()
            {
                i++;
                j++;
            }
        }
        Local1();
                
        void Local1()
        {
            i++;
        }
    }

And here's the generated code:

public class C
{
    [StructLayout(LayoutKind.Auto)]
    [CompilerGenerated]
    private struct <>c__DisplayClass0_0
    {
        public int i;
    }


    [StructLayout(LayoutKind.Auto)]
    [CompilerGenerated]
    private struct <>c__DisplayClass0_1
    {
        public int j;
    }

    public void M()
    {
        <>c__DisplayClass0_0 <>c__DisplayClass0_ = default(<>c__DisplayClass0_0);
        <>c__DisplayClass0_.i = 0;
        <>c__DisplayClass0_1 <>c__DisplayClass0_2 = default(<>c__DisplayClass0_1);
        <>c__DisplayClass0_2.j = 0;
        <M>g__Local2|0_1(ref <>c__DisplayClass0_, ref <>c__DisplayClass0_2);
        <M>g__Local1|0_0(ref <>c__DisplayClass0_);
    }

    [CompilerGenerated]
    private static void <M>g__Local2|0_1(ref <>c__DisplayClass0_0 P_0, ref <>c__DisplayClass0_1 P_1)
    {
        P_0.i++;
        P_1.j++;
    }

    [CompilerGenerated]
    private static void <M>g__Local1|0_0(ref <>c__DisplayClass0_0 P_0)
    {
        P_0.i++;
    }
}

As you can see, we have two struct environments, one for i and one for j. Local1 only uses i, so it only gets one environment passed to it. Local2 uses both i and j, so it gets both environments.

Now look at the generated code if we convert these local functions to lambdas:

public class C
{
    [CompilerGenerated]
    private sealed class <>c__DisplayClass0_0
    {
        public int i;

        internal void <M>b__0()
        {
            i++;
        }
    }


    [CompilerGenerated]
    private sealed class <>c__DisplayClass0_1
    {
        public int j;

        public <>c__DisplayClass0_0 CS$<>8__locals1;

        internal void <M>b__1()
        {
            int i = CS$<>8__locals1.i;
            CS$<>8__locals1.i = i + 1;
            j++;
        }
    }

    public void M()
    {
        <>c__DisplayClass0_0 <>c__DisplayClass0_ = new <>c__DisplayClass0_0();
        <>c__DisplayClass0_.i = 0;
        <>c__DisplayClass0_1 <>c__DisplayClass0_2 = new <>c__DisplayClass0_1();
        <>c__DisplayClass0_2.CS$<>8__locals1 = <>c__DisplayClass0_;
        <>c__DisplayClass0_2.j = 0;
        new Action(<>c__DisplayClass0_2.<M>b__1)();
        new Action(<>c__DisplayClass0_.<M>b__0)();
    }
}

Completely different code. These two codegen passes are completely separate and have their own logic. If we could make the codegen for each of these cases more similar, the closure conversion code in the compiler could be a lot simpler. My proposal is that the code for the first example with local functions would look like this:

public class C
{
    [CompilerGenerated]
    private struct <>c__DisplayClass0_0
    {
        public int i;

        internal void <M>b__0()
        {
            i++;
        }
    }


    [CompilerGenerated]
    private ref struct <>c__DisplayClass0_1
    {
        public int j;

        public ref <>c__DisplayClass0_0 CS$<>8__locals1;

        internal void <M>b__1()
        {
            int i = CS$<>8__locals1.i;
            CS$<>8__locals1.i = i + 1;
            j++;
        }
    }

    public void M()
    {
        <>c__DisplayClass0_0 <>c__DisplayClass0_ = new <>c__DisplayClass0_0();
        <>c__DisplayClass0_.i = 0;
        <>c__DisplayClass0_1 <>c__DisplayClass0_2 = new <>c__DisplayClass0_1();
        <>c__DisplayClass0_2.CS$<>8__locals1 = ref <>c__DisplayClass0_;
        <>c__DisplayClass0_2.j = 0;
        <>c__DisplayClass0_2.<M>b__1();
        <>c__DisplayClass0_.<M>b__0();
    }
}

Notice how the new code looks almost exactly like the lambda code. This would mean we could delete all the special code around ref argument passing that we had before.

jaredpar · 2023-02-14T19:28:45Z

Okay so you only want the ref for the closures not the captured fields?

agocke · 2023-02-14T19:42:47Z

Right.

jaredpar · 2023-02-22T05:04:37Z

Another issue I realized: this only works when we're targeting a .NET runtime 7.0 or greater as they're the only ones that support ref fields. That will limit the impact of this a bit in the short term. Less in the long term.

dotnet-issue-labeler bot added Area-Compilers untriaged Issues and PRs which have not yet been triaged by a lead labels Feb 2, 2023

jcouv added the Code Gen Quality Room for improvement in the quality of the compiler's generated code label Feb 3, 2023

jcouv added this to the C# 12.0 milestone Feb 3, 2023

jaredpar removed the untriaged Issues and PRs which have not yet been triaged by a lead label Feb 22, 2023

jaredpar modified the milestones: C# 12.0, Backlog Feb 22, 2023

This was referenced Mar 15, 2023

EnC doesn't detect local function closure to lambda closure udpate #67323

Open

EnC - Need a way to determine if a class already has a lambda helper class #52759

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplify closure conversion using ref fields #66680

Simplify closure conversion using ref fields #66680

agocke commented Feb 2, 2023

tmat commented Feb 2, 2023

agocke commented Feb 2, 2023

tmat commented Feb 2, 2023

agocke commented Feb 2, 2023

jaredpar commented Feb 14, 2023 •

edited

Loading

agocke commented Feb 14, 2023

jaredpar commented Feb 14, 2023

agocke commented Feb 14, 2023 •

edited

Loading

jaredpar commented Feb 14, 2023

agocke commented Feb 14, 2023

jaredpar commented Feb 22, 2023

Simplify closure conversion using ref fields #66680

Simplify closure conversion using ref fields #66680

Comments

agocke commented Feb 2, 2023

tmat commented Feb 2, 2023

agocke commented Feb 2, 2023

tmat commented Feb 2, 2023

agocke commented Feb 2, 2023

jaredpar commented Feb 14, 2023 • edited Loading

agocke commented Feb 14, 2023

jaredpar commented Feb 14, 2023

agocke commented Feb 14, 2023 • edited Loading

jaredpar commented Feb 14, 2023

agocke commented Feb 14, 2023

jaredpar commented Feb 22, 2023

jaredpar commented Feb 14, 2023 •

edited

Loading

agocke commented Feb 14, 2023 •

edited

Loading