RFC: Add an alternative implementation of closures #31253

Keno · 2019-03-05T03:36:04Z

Overview

This PR is a sketch implementation of an alternative mechanism of
implementing closures. It is designed to be complementary to the
existing closure mechanism and makes some different trade offs.
The motivation for this mechanism comes primarily from closure-based
AD tools like Zygote, but I'm expecting it will find other use cases
as well. It's a little hard to name this, because it's just another
way to implement closures. I discussed this with Jeff and the options
that were considered were "Arrow Closures" or "Non-Nominal Closures",
but for now I'm just calling them "Yet Another Kind of Closure" (YAKC,
pronounced yak-c, rhymes with yahtzee). This PR is in very early
stages, so in order to explain what this is and what this does, I
may describe features that are not yet implemented. See the end of the
commit message to see the current status.

Motivation

Optimization across closure boundaries

Consider the following situation (type annotations are inference
results, not type asserts)

function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    ()->!isa(b, Float64) ? return a : return nothing
end

now, the traditional closure mechanism will lower this to:

struct ###{T, S}
   a::T
   b::S
end
(x::###{T,S})() where {T,S} = !isa(x.b, Float64) ? return x.a : return nothing
function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    new(a, b)
end

the problem with this is apparent: Even though after inference,
we know that a is unused in the closure (and thus would be
able to delete the expensive call were it not for the capture),
we may not delete it, simply because we need to satisfy the full
capture list of the closure. Ideally, we would like to have a mechanism
where the optimizer may modify the capture list of a closure in
response to information it discovers.
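
To make the capture problem concrete, here is a minimal sketch using today's closures. The name `make_closure` and the `rand(1000)` stand-in for the expensive call are illustrative assumptions, not part of this PR. It shows that both variables end up as fields of the closure object, even though calling the closure never reads `a`:

```julia
# Illustrative stand-in for foo(); all names here are hypothetical.
function make_closure()
    a = rand(1000)   # plays the role of expensive_but_effect_free()
    b = 1.0          # plays the role of something()::Float64
    return () -> !isa(b, Float64) ? a : nothing
end

f = make_closure()

# A regular closure is a struct whose fields are the captured variables.
captured = Set(fieldnames(typeof(f)))

# Calling the closure never touches `a`, yet `a` is kept alive by the capture.
result = f()
```

Because the capture list is part of the closure's nominal type, the optimizer cannot drop `a` here without changing the type that callers observe.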

Closures from Cassette transforms

Compiler passes like Zygote would like to generate new closures
from untyped IR (i.e. after the frontend runs) (and in the future
potentially typed IR also). We currently do not have a great mechanism
to support this (it is somewhat possible by constructing an object
that redoes the primal analysis, but it's awkward at the very least).
This provides a very straightforward implementation of this feature.

Mechanism

The primary concept introduced by this PR is the YAKC type, defined
as follows:

struct YAKC{A <: Tuple, R}
    env::Any
    ci::CodeInfo
end

function (y::YAKC{A, R})(args...) where {A,R}
    typeassert(args, A)
    ccall(:jl_invoke_yakc, Any, (Any, Any), y, args)::R
end

The dynamic semantics are that calling the yakc will run whatever code
is stored in the .ci object, using env as the self argument. This
is augmented by special support in inference and the optimizer to
co-optimize yakcs that appear in bodies of functions along with their
containing functions in order to enable things like cross-closure DCE.

Note that argument types and return types are explicitly specified,
rather than inferred. The reason for this is to prevent YAKCs from
participating in inference cycles and to allow return-type information
to be provided to inference without having to look at the contained
CodeInfo (because the contents of the CodeInfo are not interprocedurally
valid and may in general depend on what the optimizer was able to
figure out about the program). This is also done with a view to an
extension where the CodeInfo inside the yakc is not generated until
later in the optimization pipeline. It would be possible to optionally
allow return-type inference of the yakc based on the unoptimized
CodeInfo (if available), but that is not currently within the scope
of my planned work in this PR.
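
The dynamic semantics described above can be approximated in plain Julia. This is only a sketch under stated assumptions: `TypedThunk` is a hypothetical name, and an ordinary function taking `(env, args...)` stands in for the `CodeInfo`, since the real type relies on `jl_invoke_yakc` and compiler support that this sketch does not have.

```julia
# Hypothetical analogue of YAKC{A, R}; not the type introduced by this PR.
struct TypedThunk{A <: Tuple, R}
    env::Any
    code::Any   # stand-in for CodeInfo: any callable taking (env, args...)
end

function (t::TypedThunk{A, R})(args...) where {A, R}
    typeassert(args, A)                  # argument types are declared, not inferred
    return t.code(t.env, args...)::R     # so is the return type
end

add_env = TypedThunk{Tuple{Int}, Int}((1, 2), (env, x) -> env[1] + env[2] + x)
total = add_env(3)

# Arguments that do not match the declared tuple type are rejected up front.
rejected = try
    add_env("three")
    false
catch err
    err isa TypeError
end
```

Since `A` and `R` live in the type parameters, a caller can reason about the call's signature without ever looking at `code`, which is the property the paragraph above relies on.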

Status

The PR has bare-bones support for yakcs, including some initial
support for inlining and inference, though that support is known to
be incorrect and incomplete. There are also currently no nice
front-end forms. For explanatory and testing purposes, I would like to
provide a macro of the form:

function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    @yakc ()->!isa(b, Float64) ? return a : return nothing
end

but that is not implemented yet. At the moment, the codeinfo needs
to be spliced in manually (here we're abusing code_lowered to construct
us a CodeInfo of the appropriate form), e.g.:

julia> bar() = 1
bar (generic function with 1 method)

julia> ci = @code_lowered bar();

julia> @eval function foo1()
          f = $(Expr(:new, :(Core.YAKC{Tuple{}, Int64}), nothing, ci))
       end
foo1 (generic function with 1 method)

julia> @eval function foo2()
          f = $(Expr(:new, :(Core.YAKC{Tuple{}, Int64}), nothing, ci))
          f()
       end
foo2 (generic function with 1 method)

julia> foo1()
Core.YAKC{Tuple{},Int64}(nothing, CodeInfo(
1 ─     return 1
))

julia> foo1()()
1

julia> @code_typed foo2()
CodeInfo(
1 ─     return 1
) => Int64

julia> struct Test
           a::Int
           b::Int
       end

julia> (a::Test)() = getfield(a, 1) + getfield(a, 2)

julia> ci2 = @code_lowered Test(1, 2)();

julia> @eval function foo2()
          f = $(Expr(:new, :(Core.YAKC{Tuple{}, Int64}), (1, 2), ci2))
          f()
       end
foo2 (generic function with 1 method)

julia> @code_typed foo2()
CodeInfo(
1 ─     return 3
) => Int64

TODO:

- [ ] Show that this actually helps the Zygote use case I care about
- [ ] Frontend support
- [ ] Better optimizations (detection of need for recursive inlining)
- [ ] Codegen for yakcs (right now they're interpreted)


Keno · 2019-03-06T21:35:22Z

I've tried this out with Zygote and there are a lot of things I like about it. The biggest problem I see at this point is that (as I found out) in its current form it doesn't work too well to solve the nested AD problem (for nested AD, you'd generally want to co-optimize, and in this proposal there isn't a great place to stash that CodeInfo). I think it is quite feasible to modify this to split out an object that encapsulates the set of functions that need to be co-optimized (jl_yakc_herd_t?), which would then be the natural place to cache the transformed versions (e.g. for the second-order derivative, you'd co-optimize all the various CodeInfos up to second order). For our own reference, attaching the state of the whiteboard at the end of the discussion about this:

I'll need to let this stew for a little while longer.

Keno · 2020-09-14T02:00:48Z

> in its current form doesn't work too well to solve the nested AD problem

The nested AD problem is resolved by switching the representation.

I'm planning to revive this branch, so if there are comments, let me know.

Keno · 2020-09-17T21:42:10Z

Capturing some discussion with @vtjnash and @JeffBezanson:

The question we were discussing is what to do with inferred return types for YAKC.
Right now, the way I have it set up is that YAKC has type parameters YAKC{A,R}
for the YAKC's argument and return types and the optimizer is allowed to adjust the
R parameter based on inference results. Inference dependent program behavior
(like return_type) is a general cause for concern, so we discussed various options.
In the end we arrived at the following:

Rather than punning on Expr(:new), there will be a dedicated yakc constructor yakc(env, lb, ub, ci) where the implementation (interpreter, optimizer, etc.) is allowed to choose the return type between lb and ub according to whatever rules it likes.
YAKC will have a custom lattice element that has full provenance information and can use the CodeInfo to do inference at the proper call site. It will not use the type bounds for this.
During the same inference session, we may not lift the inference result into the type domain. This is because the optimizer is what chooses the ultimate representation and optimization is logically (and depending on when the choice happens, actually), sequenced after inference, so inference may not try to use such information.
It is nevertheless ok to specialize on the type parameter afterwards if the yakc escapes from its optimization context (but inference may no longer look at the CodeInfo, since it lost provenance information at that point).
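
The lb/ub rule can be stated as a one-line check. This is a sketch only: `valid_rettype` is a hypothetical helper, not something the PR defines, and it says nothing about which admissible `R` an implementation actually picks.

```julia
# Hypothetical helper: is R an admissible return-type parameter for bounds lb and ub?
valid_rettype(R, lb, ub) = lb <: R && R <: ub

unconstrained = valid_rettype(Int, Union{}, Any)     # widest bounds admit any R
pinned        = valid_rettype(Int, Int, Int)         # lb === ub fixes R exactly
rejected      = valid_rettype(Float64, Int, Int)     # outside the bounds
narrowed      = valid_rettype(Int, Union{}, Integer) # a narrower R within ub
```

Program behavior may not depend on which admissible `R` was chosen, which is why setting `lb === ub` is the only way to guarantee a particular parameter.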

Keno · 2020-09-24T19:36:27Z

One remaining question is what to call this thing. Some options:

delayed closure
opaque closure
thunk
optic closure
dynamic closure
arrow closure
nnclosure
Just keep yakc

oxinabox · 2020-09-24T21:31:41Z

opaque closure

I think this one.

I guess users would create these via @opaque ()->x+y, or similar?

A two-birds-with-one-stone option might be to also make them disallow rebinding variables, so as to make boxing impossible.
Then it could take FastClosures.jl's @closure macro, since it would probably take over all uses of FastClosures.jl.

StefanKarpinski · 2020-09-29T18:31:10Z

Calling it a closure when it captures variables by value would be rather perverse: the defining characteristic of a closure is that it closes over bindings. It should be called @nonclosure in that case.

Keno · 2020-09-29T18:56:07Z

Capture vs non-capture is a frontend decision. I was planning to use the same lowering, meaning that bindings are captured by references and Boxes are introduced as they are now. The difference is that the optimizer would be allowed to optimize it back to a value capture if it can prove the dynamic semantics are unnecessary (since the opaqueness of the environment guarantees that nobody else is allowed to look at the box).
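
For reference, the existing lowering behavior being discussed can be observed with regular closures. This is a small sketch with hypothetical function names (`counter`, `constant`); it inspects closure field types, which is exactly the kind of inspection an opaque closure would not permit.

```julia
# A captured variable that is reassigned is stored in a Core.Box (capture by reference).
function counter()
    n = 0
    inc = () -> (n += 1)
    inc(); inc()
    return n, inc
end

# A captured variable that is never reassigned is stored directly as a typed field.
constant(x) = () -> x

n, inc = counter()
boxed_field = fieldtype(typeof(inc), :n)
value_field = fieldtype(typeof(constant(1)), :x)
```

With a regular closure the `Box` is visible through the closure's fields, so the optimizer has to keep it. The opaque environment is what would let the optimizer turn the first case back into the second when it can prove nothing else observes the binding.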

Keno · 2020-09-30T04:34:23Z

Alright, please take a look at test/yakc.jl and see if everyone likes the semantics. Frontend support is still missing, but @JeffBezanson said he'd do that for me, and since I don't need it for my use cases, I'll leave that out of this PR for the moment.

Obviously the implementation is primitive right now: it lacks some of the corner cases and most optimizations. However, it does pass the test/yakc.jl test, which I think covers (basic versions of) all the major features I want it to have, so I'm reasonably confident that the design is sound. I'd like to get this on the path to being merged, and then iterate on it on master as I try to use it for my actual application.

So do we like opaque closure then? If so, I'll do that rename tomorrow.

JeffreySarnoff · 2020-09-30T05:53:14Z

Naming this YAKC would help the reader avoid misunderstanding by clarifying their unknowing. Another approach to helping the reader would name this opaque closure type OpaqueClosure.

StefanKarpinski · 2020-09-30T12:57:08Z

Opaque closure seems good to me.

tpapp · 2020-09-30T13:14:04Z

DarkClosure would be shorter, and kind of cool.

StefanKarpinski · 2020-09-30T14:48:53Z

Or delayed closure; less cool but more accurate, imo.

JeffreySarnoff · 2020-09-30T15:37:07Z

delayed closure is better, good suggestion
for the cool kids: now delay is performant

oxinabox · 2020-09-30T15:47:13Z

Delayed Closure feels like it is exposing an implementation detail.

The opaqueness is the user facing difference.
Isn't it?

JeffreySarnoff · 2020-09-30T18:55:59Z

everyone is correct -- what to do, what to do :)

JeffBezanson · 2020-09-30T18:59:34Z

I like OpaqueClosure as well; the type hides what is called (and is intended to hide the captured variables to the extent possible), which is a key difference with our current closures.

JeffreySarnoff · 2020-09-30T19:03:21Z

great!

Keno · 2020-09-30T20:13:21Z

Alright, I feel like "opaque closures" have it then.

JeffBezanson · 2020-09-30T21:48:26Z

base/compiler/abstractinterpretation.jl

@@ -773,6 +773,24 @@ function argtype_tail(argtypes::Vector{Any}, i::Int)
    return argtypes[i:n]
 end

+function _yakc_tfunc(@nospecialize(arg), @nospecialize(lb), @nospecialize(ub),


Move to tfuncs.jl?

JeffBezanson · 2020-09-30T21:50:59Z

base/compiler/optimize.jl

+        # Give calls to YAKCs zero cost. The plan is for these to be a single
+        # indirect call so have very little cost. On the other hand, there
+        # is enormous benefit to inlining these into a function where we can
+        # see the definition of the YAKC. Perhaps this should even be negative


Do you want to encourage inlining a yakc into its caller, or the function containing the call into its caller? (perhaps both?)

Both, e.g. consider:

call(f::YAKC) = f()
bar() = call(_yakc(...))

We would like to elide the allocation of the yakc and just inline whatever code the yakc would have had right into the caller, but in order to be able to do that, we need to see both the call and the definition of the yakc, so a high inline penalty for calling a yakc would be counterproductive.

JeffBezanson · 2020-09-30T21:54:43Z

base/compiler/ssair/inlining.jl

+        elseif isa(calltype.ci, Method)
+            m = calltype.ci
+            stmt.args[5] = m
+            stmt.args[3] = stmt.args[4] = widenconst(tmerge(lb, m.unspecialized.cache.rettype))


Is m.unspecialized guaranteed to be defined here? (Maybe I just need to keep reading)

JeffBezanson · 2020-09-30T21:56:52Z

base/compiler/ssair/legacy.jl

@@ -25,6 +26,8 @@ function inflate_ir(ci::CodeInfo, sptypes::Vector{Any}, argtypes::Vector{Any})
        elseif isa(stmt, Expr) && stmt.head === :enter
            stmt.args[1] = block_for_inst(cfg, stmt.args[1])
            code[i] = stmt
+        elseif isa(stmt, Expr) && stmt.head == :new
+            code[i] = stmt


rebase artifact

JeffBezanson · 2020-10-01T00:58:04Z

src/datatype.c

@@ -930,8 +930,21 @@ static void init_struct_tail(jl_datatype_t *type, jl_value_t *jv, size_t na)
 JL_DLLEXPORT jl_value_t *jl_new_structv(jl_datatype_t *type, jl_value_t **args, uint32_t na)
 {
    jl_ptls_t ptls = jl_get_ptls_states();
-    if (!jl_is_datatype(type) || type->layout == NULL)
+    if (!jl_is_datatype(type) || type->layout == NULL) {
+        // As a special case we're allowed to have unionalls over yakc,


Can we deal with this elsewhere? It would be better if new only has to worry about allocating the type passed to it, instead of also determining what type to allocate.

Yes, sorry. This is outdated and can be removed.


Keno · 2020-10-02T01:28:54Z

I've done the rename. I'll push it up to a new PR, so that we can have a clean place to do any review and get it merged, while keeping this history here.

@opaque

This is the end result of the design process in #31253.

# Overview

This PR implements a new kind of closure, called an `opaque closure`. It is designed to be complementary to the existing closure mechanism and makes some different trade-offs. The motivation for this mechanism comes primarily from closure-based AD tools, but I'm expecting it will find other use cases as well. From the end-user perspective, opaque closures basically behave like regular closures, except that they are introduced by adding the `@opaque` macro (not part of this PR, but will be added after) in front of the existing closure. In particular, all scoping, capture, etc. rules are identical. For such user-written closures, the primary difference is in the performance characteristics. In particular:

1) Passing an opaque closure to a higher-order function will specialize on the argument and return types of the closure, but not on the closure identity. (This also means that the opaque closure will not be eligible for inlining into the higher-order function, unless the inliner can see both the definition and the call site.)
2) The optimizer is allowed to modify the capture environment of the opaque closure (e.g. dropping unused captures, or reducing `Box`ed values back to value captures).

The `opaque` part of the naming comes from the notion that semantically, nothing is supposed to inspect either the code or the capture environment of the opaque closure, since the optimizer is allowed to choose any value for these that preserves the behavior of calling the opaque closure itself.

# Motivation

## Optimization across closure boundaries

Consider the following situation (type annotations are inference results, not type asserts):

```
function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    ()->isa(b, Float64) ? return nothing : return a
end
```

Now, the traditional closure mechanism will lower this to:

```
struct ###{T, S}
    a::T
    b::S
end
(x::###{T,S})() where {T,S} = isa(x.b, Float64) ? return nothing : return x.a
function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    new(a, b)
end
```

The problem with this is apparent: even though (after inference) we know that `a` is unused in the closure (and thus would be able to delete the expensive call were it not for the capture), we may not delete it, simply because we need to satisfy the full capture list of the closure. Ideally, we would like to have a mechanism where the optimizer may modify the capture list of a closure in response to information it discovers.

## Closures from Cassette transforms

Compiler passes like Zygote would like to generate new closures from untyped IR (i.e. after the frontend runs) (and in the future potentially typed IR also). We currently do not have a great mechanism to support this. This provides a very straightforward implementation of this feature, as opaque closures may be inserted at any point during the compilation process (unlike types, which may only be inserted by the frontend).

# Mechanism

The primary concept introduced by this PR is the `OpaqueClosure{A<:Tuple, R}` type, constructed by the new `Core._opaque_closure` builtin, with the following signature:

```
    _opaque_closure(argt::Type{<:Tuple}, lb::Type, ub::Type, source::CodeInfo, captures...)

Create a new OpaqueClosure taking arguments specified by the types `argt`. When called,
this opaque closure will execute the source specified in `source`. The `lb` and `ub`
arguments constrain the return type of the opaque closure. In particular, any return
value of type `Core.OpaqueClosure{argt, R} where lb<:R<:ub` is semantically valid. If
the optimizer runs, it may replace `R` by the narrowest possible type inference
was able to determine. To guarantee a particular value of `R`, set lb===ub.
```

Captures are available to the CodeInfo as `getfield` from Slot 1 (referenced by position).

# Examples

I think the easiest way to understand opaque closures is to look through a few examples. These make use of the `@opaque` macro, which isn't implemented yet but makes understanding the semantics easier. Some of these examples, in currently available syntax, can be seen in test/opaque_closure.jl.

```
oc_trivial() = @opaque ()::Any->1
@show oc_trivial() # ()::Any->◌
@show oc_trivial()() # 1

oc_inf() = @opaque ()->1 # Int return type is inferred
@show oc_inf() # ()::Int->◌
@show oc_inf()() # 1

function local_call(b::Int)
    f = @opaque (a::Int)->a + b
    f(2)
end

oc_capture_opt(A) = @opaque (B::typeof(A))->ndims(A)*B
@show oc_capture_opt([1; 2]) # (::Vector{Int},)::Vector{Int}->◌
@show sizeof(oc_capture_opt([1; 2]).env) # 0
@show oc_capture_opt([1 2])([3 4]) # [6 8]
```
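
The description says the `@opaque` macro is not part of this PR. As a hedged illustration, the sketch below assumes a later Julia release (1.7 or newer, as far as I know) where the macro is available as `Base.Experimental.@opaque`. The helpers `make_adder` and `apply` are hypothetical names, and the surface syntax may differ from what the description sketches.

```julia
# Assumes a Julia version that ships Base.Experimental.@opaque (not provided by this PR).
using Base.Experimental: @opaque

# Hypothetical helper: build an opaque closure that captures `b`.
make_adder(b::Int) = @opaque (a::Int) -> a + b

# A higher-order function can dispatch on the argument types alone;
# the return-type parameter R is left free here.
apply(f::Core.OpaqueClosure{Tuple{Int}}, x::Int) = f(x)

oc = make_adder(10)
direct = oc(1)
via_apply = apply(oc, 5)
```

Note that `apply` is specialized on the `OpaqueClosure` type parameters rather than on a per-closure type, which matches point 1) of the overview: two different opaque closures with the same signature share one specialization.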

@opaque

This is the end result of the design process in #31253. # Overview This PR implements a new kind of closure, called an `opaque closure`. It is designed to be complimenatry to the existing closure mechanism and makes some different trade offs. The motivation for this mechanism comes primarily from closure-based AD tools, but I'm expecting it will find other use cases as well. From the end user perspective, opaque closures basically behave like regular closures, expect that they are introduced by adding the `@opaque` macro (not part of this PR, but will be added after). In front of the existing closure. In particular, all scoping, capture, etc. rules are identical. For such user written closures, the primary difference is in the performance characteristics. In particular: 1) Passing an opaque closure to a high order function will specialize on the argument and return types of the closure, but not on the closure identity. (This also means that the opaque closure will not be eligible for inlining into the higher order function, unless the inliner can see both the definition and the call site). 2) The optimizer is allowed to modify the capture environment of the opaque closure (e.g. dropping unused captures, or reducing `Box`ed values back to value captures). The `opaque` part of the naming comes from the notion that semantically, nothing is supposed to inspect either the code or the capture environment of the opaque closure, since the optimizer is allowed to choose any value for these that preserves the behavior of calling the opaque closure itself. # Motivation ## Optimization across closure boundaries Consider the following situation (type annotations are inference results, not type asserts) ``` function foo() a = expensive_but_effect_free()::Any b = something()::Float64 ()->isa(b, Float64) ? return nothing : return a end ``` now, the traditional closure mechanism will lower this to: ``` struct ###{T, S} a::T b::S end (x::###{T,S}) = isa(x.b, Float64) ? 
return nothing : return x.a function foo() a = expensive_but_effect_free()::Any b = something()::Float64 new(a, b) end ``` the problem with this is apparent: Even though (after inference), we know that `a` is unused in the closure (and thus would be able to delete the expensive call were it not for the capture), we may not delete it, simply because we need to satisfy the full capture list of the closure. Ideally, we would like to have a mechanism where the optimizer may modify the capture list of a closure in response to information it discovers. ## Closures from Casette transforms Compiler passes like Zygote would like to generate new closures from untyped IR (i.e. after the frontend runs) (and in the future potentially typed IR also). We currently do not have a great mechanism to support this. This provides a very straightforward implementation of this feature, as opaque closures may be inserted at any point during the compilation process (unlike types, which may only be inserted by the frontend). # Mechanism The primary concept introduced by this PR is the `OpaqueClosure{A<:Tuple, R}` type, constructed, by the new `Core._opaque_closure` builtin, with the following signature: ``` _opaque_closure(argt::Type{<:Tuple}, lb::Type, ub::Type, source::CodeInfo, captures...) Create a new OpaqueClosure taking arguments specified by the types `argt`. When called, this opaque closure will execute the source specified in `source`. The `lb` and `ub` arguments constrain the return type of the opaque closure. In particular, any return value of type `Core.OpaqueClosure{argt, R} where lb<:R<:ub` is semantically valid. If the optimizer runs, it may replace `R` by the narrowest possible type inference was able to determine. To guarantee a particular value of `R`, set lb===ub. ``` Captures are available to the CodeInfo as `getfield` from Slot 1 (referenced by position). # Examples I think the easiest way to understand opaque closures is look through a few examples. 
These make use of the `@opaque` macro which isn't implemented yet, but makes understanding the semantics easier. Some of these examples, in currently available syntax can be seen in test/opaque_closure.jl ``` oc_trivial() = @opaque ()::Any->1 @show oc_trivial() # ()::Any->◌ @show oc_trivial()() # 1 oc_inf() = @opaque ()->1 # Int return type is inferred @show oc_inf() # ()::Int->◌ @show oc_inf()() # 1 function local_call(b::Int) f = @opaque (a::Int)->a + b f(2) end oc_capture_opt(A) = @opaque (B::typeof(A))->ndims(A)*B @show oc_capture_opt([1; 2]) # (::Vector{Int},)::Vector{Int}->◌ @show sizeof(oc_capture_opt([1; 2]).env) # 0 @show oc_capture_opt([1 2])([3 4]) # [6 8] ```

@opaque

This is the end result of the design process in #31253.

# Overview

This PR implements a new kind of closure, called an `opaque closure`. It is designed to be complementary to the existing closure mechanism and makes some different trade-offs. The motivation for this mechanism comes primarily from closure-based AD tools, but I'm expecting it will find other use cases as well.

From the end user perspective, opaque closures basically behave like regular closures, except that they are introduced by adding the `@opaque` macro (not part of this PR, but will be added after) in front of the existing closure. In particular, all scoping, capture, etc. rules are identical. For such user-written closures, the primary difference is in the performance characteristics. In particular:

1) Passing an opaque closure to a higher-order function will specialize on the argument and return types of the closure, but not on the closure identity. (This also means that the opaque closure will not be eligible for inlining into the higher-order function, unless the inliner can see both the definition and the call site.)
2) The optimizer is allowed to modify the capture environment of the opaque closure (e.g. dropping unused captures, or reducing `Box`ed values back to value captures).

The `opaque` part of the naming comes from the notion that semantically, nothing is supposed to inspect either the code or the capture environment of the opaque closure, since the optimizer is allowed to choose any value for these that preserves the behavior of calling the opaque closure itself.

# Motivation

## Optimization across closure boundaries

Consider the following situation (type annotations are inference results, not type asserts):

```
function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    ()->isa(b, Float64) ? return nothing : return a
end
```

Now, the traditional closure mechanism will lower this to:

```
struct ###{T, S}
    a::T
    b::S
end
(x::###{T,S})() = isa(x.b, Float64) ? return nothing : return x.a
function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    new(a, b)
end
```

The problem with this is apparent: even though, after inference, we know that `a` is unused in the closure (and thus would be able to delete the expensive call were it not for the capture), we may not delete it, simply because we need to satisfy the full capture list of the closure. Ideally, we would like to have a mechanism where the optimizer may modify the capture list of a closure in response to information it discovers.

## Closures from Cassette transforms

Compiler passes like Zygote would like to generate new closures from untyped IR (i.e. after the frontend runs), and in the future potentially from typed IR as well. We currently do not have a great mechanism to support this. This PR provides a very straightforward implementation of this feature, as opaque closures may be inserted at any point during the compilation process (unlike types, which may only be inserted by the frontend).

# Mechanism

The primary concept introduced by this PR is the `OpaqueClosure{A<:Tuple, R}` type, constructed by the new `Core._opaque_closure` builtin, with the following signature:

```
    _opaque_closure(argt::Type{<:Tuple}, lb::Type, ub::Type, source::CodeInfo, captures...)

Create a new OpaqueClosure taking arguments specified by the types `argt`. When
called, this opaque closure will execute the source specified in `source`.
The `lb` and `ub` arguments constrain the return type of the opaque closure.
In particular, any return value of type `Core.OpaqueClosure{argt, R} where lb<:R<:ub`
is semantically valid. If the optimizer runs, it may replace `R` by the narrowest
possible type inference was able to determine. To guarantee a particular value of `R`,
set lb===ub.
```

Captures are available to the CodeInfo as `getfield` from Slot 1 (referenced by position).

# Examples

I think the easiest way to understand opaque closures is to look through a few examples. These make use of the `@opaque` macro, which isn't implemented yet but makes understanding the semantics easier. Some of these examples, in currently available syntax, can be seen in test/opaque_closure.jl.

```
oc_trivial() = @opaque ()::Any->1
@show oc_trivial() # ()::Any->◌
@show oc_trivial()() # 1

oc_inf() = @opaque ()->1 # Int return type is inferred
@show oc_inf() # ()::Int->◌
@show oc_inf()() # 1

function local_call(b::Int)
    f = @opaque (a::Int)->a + b
    f(2)
end

oc_capture_opt(A) = @opaque (B::typeof(A))->ndims(A)*B
@show oc_capture_opt([1; 2]) # (::Vector{Int},)::Vector{Int}->◌
@show sizeof(oc_capture_opt([1; 2]).env) # 0
@show oc_capture_opt([1 2])([3 4]) # [6 8]
```
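To make the "Optimization across closure boundaries" point concrete, here is a small runnable sketch in ordinary Julia that uses only regular closures and a hand-written callable struct. The names `LoweredClosure`, `something_float`, `foo_lowered` and `foo_closure`, as well as the bodies of the two helper functions, are assumptions made up for illustration; they are not part of this PR. The struct only approximates what closure lowering generates.

```julia
# Hypothetical stand-ins for the functions named in the description.
expensive_but_effect_free() = sum(sqrt, 1:1_000)   # result is never actually used
something_float() = 1.5

# Hand-written approximation of a lowered closure:
# one field per captured variable, plus a call method.
struct LoweredClosure{T, S}
    a::T
    b::S
end
(x::LoweredClosure)() = isa(x.b, Float64) ? nothing : x.a

function foo_lowered()
    a = expensive_but_effect_free()
    b = something_float()
    LoweredClosure(a, b)   # `a` has to be computed just to fill the field
end

# The same thing written as a regular closure, for comparison.
function foo_closure()
    a = expensive_but_effect_free()
    b = something_float()
    () -> isa(b, Float64) ? nothing : a
end

c = foo_lowered()
f = foo_closure()

println(c())                      # the `a` branch is never taken
println(fieldnames(typeof(f)))    # the regular closure still carries both captures
```

In both forms the capture list is fixed when the closure is created, so the value of `a` has to exist even though calling the closure can never return it. The proposal here is that an opaque closure lets the optimizer drop such a capture once inference has shown it to be dead.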


Keno requested review from vtjnash, JeffBezanson and jrevels March 5, 2019 03:36

ararslan reviewed Mar 5, 2019

base/compiler/inferencestate.jl (outdated, resolved)

Keno mentioned this pull request Mar 5, 2019

Improve _apply(apply_type, ::Tuple, ::SimpleVector) #29922

Merged

Keno mentioned this pull request Mar 28, 2019

Compiled frames: a sketch JuliaDebug/JuliaInterpreter.jl#204

Open

Keno force-pushed the kf/yakc branch from 6d4db5f to 9951d33 Compare March 28, 2019 23:29

MikeInnes mentioned this pull request Sep 16, 2019

Lambdas FluxML/IRTools.jl#27

Merged

c42f mentioned this pull request May 15, 2020

performance of captured variables in closures #15276

Open

Keno force-pushed the kf/yakc branch from 9951d33 to bcfcbf8 Compare September 14, 2020 20:35

Keno force-pushed the kf/yakc branch from bcfcbf8 to 176534b Compare September 21, 2020 01:17

Keno changed the title ~~Very WIP: RFC: Add an alternative implementation of closures~~ WIP: RFC: Add an alternative implementation of closures Sep 21, 2020

Keno force-pushed the kf/yakc branch 3 times, most recently from 97da538 to 3083fcb Compare September 22, 2020 00:20

Keno force-pushed the kf/yakc branch from 3083fcb to 9e75435 Compare September 30, 2020 04:28

Keno changed the title ~~WIP: RFC: Add an alternative implementation of closures~~ RFC: Add an alternative implementation of closures Sep 30, 2020

JeffBezanson reviewed Sep 30, 2020

JeffBezanson reviewed Oct 1, 2020

Keno force-pushed the kf/yakc branch from 9e75435 to 350e09a Compare October 1, 2020 05:36

Keno closed this Oct 2, 2020

Keno mentioned this pull request Oct 2, 2020

Implement opaque closures #37849

Closed

DilumAluthge deleted the kf/yakc branch March 25, 2021 21:57


Keno commented Mar 5, 2019 (edited)

Keno commented Mar 6, 2019

Keno commented Sep 14, 2020

Keno commented Sep 17, 2020

Keno commented Sep 24, 2020 (edited)

oxinabox commented Sep 24, 2020

StefanKarpinski commented Sep 29, 2020

Keno commented Sep 29, 2020

Keno commented Sep 30, 2020

JeffreySarnoff commented Sep 30, 2020

StefanKarpinski commented Sep 30, 2020

tpapp commented Sep 30, 2020

StefanKarpinski commented Sep 30, 2020

JeffreySarnoff commented Sep 30, 2020

oxinabox commented Sep 30, 2020 (edited)

JeffreySarnoff commented Sep 30, 2020

JeffBezanson commented Sep 30, 2020 (edited)

JeffreySarnoff commented Sep 30, 2020

Keno commented Sep 30, 2020

JeffBezanson Sep 30, 2020

JeffBezanson Sep 30, 2020

Keno Sep 30, 2020

JeffBezanson Sep 30, 2020

JeffBezanson Sep 30, 2020

Keno Sep 30, 2020

JeffBezanson Oct 1, 2020

Keno Oct 1, 2020

Keno commented Oct 2, 2020
