RFC: Add an alternative implementation of closures #31253
I've tried this out with Zygote and there are a lot of things I like about it. The biggest problem I see at this point is that (as I found out) in its current form it doesn't work too well to solve the nested AD problem (for nested AD, you'd generally want to co-optimize, and in this proposal there isn't a great place to stash that CodeInfo). I think it is quite feasible to modify this to split out an object that encapsulates the set of functions that need to be co-optimized ( I'll need to let this stew for a little while longer.
The nested AD problem is resolved by switching the representation. I'm planning to revive this branch, so if there are comments, let me know.
Capturing some discussion with @vtjnash and @JeffBezanson: The question we were discussing is what to do with inferred return types for YAKC.
97da538 to 3083fcb
One remaining question is what to call this thing. Some options:
I think this one. I guess users would create these via A two birds with one stone option might be to make them also disallow rebinding variables so as to make boxing impossible.
Calling it a closure when it captures variables by value would be rather perverse: the defining characteristic of a closure is that it closes over bindings. It should be called @nonclosure in that case.
Capture vs non-capture is a frontend decision. I was planning to use the same lowering, meaning that bindings are captured by references and
Alright, please take a look at test/yakc.jl and see if everyone likes the semantics. Frontend support is still missing, but @JeffBezanson said he'd do that for me, and since I don't need it for my use cases, I'll leave that out of this PR for the moment. Obviously the implementation is primitive right now, lacks some of the corner cases, and most optimizations. However, it does pass the So do we like
Naming this YAKC would help the reader avoid misunderstanding by clarifying their unknowing. Another approach to helping the reader would name this opaque closure type OpaqueClosure.
Opaque closure seems good to me.
DarkClosure would be shorter, and kind of cool.
Or delayed closure; less cool but more accurate, imo.
delayed closure is better, good suggestion
Delayed Closure feels like it is exposing an implementation detail. The opaqueness is the user facing difference.
everyone is correct -- what to do, what to do :)
I like OpaqueClosure as well; the type hides what is called (and is intended to hide the captured variables to the extent possible), which is a key difference with our current closures.
great!
Alright, I feel like "opaque closures" have it then.
@@ -773,6 +773,24 @@ function argtype_tail(argtypes::Vector{Any}, i::Int)
    return argtypes[i:n]
end

function _yakc_tfunc(@nospecialize(arg), @nospecialize(lb), @nospecialize(ub),
Move to tfuncs.jl?
# Give calls to YAKCs zero cost. The plan is for these to be a single
# indirect call so have very little cost. On the other hand, there
# is enormous benefit to inlining these into a function where we can
# see the definition of the YAKC. Perhaps this should even be negative
Do you want to encourage inlining a yakc into its caller, or the function containing the call into its caller? (perhaps both?)
Both, e.g. consider:
    call(f::YAKC) = f()
    bar() = call(_yakc(...))
We would like to elide the allocation of the yakc and just inline whatever code the yakc would have had right into the caller, but in order to be able to do that, we need to see both the call and the definition of the yakc, so a high inline penalty for calling a yakc would be counterproductive.
elseif isa(calltype.ci, Method)
    m = calltype.ci
    stmt.args[5] = m
    stmt.args[3] = stmt.args[4] = widenconst(tmerge(lb, m.unspecialized.cache.rettype))
Is `m.unspecialized` guaranteed to be defined here? (Maybe I just need to keep reading)
@@ -25,6 +26,8 @@ function inflate_ir(ci::CodeInfo, sptypes::Vector{Any}, argtypes::Vector{Any})
        elseif isa(stmt, Expr) && stmt.head === :enter
            stmt.args[1] = block_for_inst(cfg, stmt.args[1])
            code[i] = stmt
        elseif isa(stmt, Expr) && stmt.head == :new
            code[i] = stmt
?
rebase artifact
src/datatype.c
Outdated
@@ -930,8 +930,21 @@ static void init_struct_tail(jl_datatype_t *type, jl_value_t *jv, size_t na)
JL_DLLEXPORT jl_value_t *jl_new_structv(jl_datatype_t *type, jl_value_t **args, uint32_t na)
{
    jl_ptls_t ptls = jl_get_ptls_states();
    if (!jl_is_datatype(type) || type->layout == NULL)
    if (!jl_is_datatype(type) || type->layout == NULL) {
        // As a special case we're allowed to have unionalls over yakc,
Can we deal with this elsewhere? It would be better if `new` only has to worry about allocating the type passed to it, instead of also determining what type to allocate.
Yes, sorry. This is outdated and can be removed.
# Overview

This PR is a sketch implementation of an alternative mechanism for implementing closures. It is designed to be complementary to the existing closure mechanism and makes some different trade-offs. The motivation for this mechanism comes primarily from closure-based AD tools like Zygote, but I'm expecting it will find other use cases as well. It's a little hard to name this, because it's just another way to implement closures. I discussed this with Jeff, and the options that were considered were "Arrow Closures" or "Non-Nominal Closures", but for now I'm just calling them "Yet Another Kind of Closure" (YAKC, pronounced yak-c, rhymes with yahtzee).

This PR is in very early stages, so in order to explain what this is and what this does, I may describe features that are not yet implemented. See the end of the commit message for the current status.

# Motivation

## Optimization across closure boundaries

Consider the following situation (type annotations are inference results, not type asserts):

```
function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    ()->isa(b, Float64) ? return nothing : return a
end
```

Now, the traditional closure mechanism will lower this to:

```
struct ###{T, S}
    a::T
    b::S
end
(x::###{T,S})() = isa(x.b, Float64) ? return nothing : return x.a
function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    new(a, b)
end
```

The problem with this is apparent: even though, after inference, we know that `a` is unused in the closure (and thus would be able to delete the expensive call were it not for the capture), we may not delete it, simply because we need to satisfy the full capture list of the closure. Ideally, we would like to have a mechanism where the optimizer may modify the capture list of a closure in response to information it discovers.

## Closures from Cassette transforms

Compiler passes like Zygote would like to generate new closures from untyped IR (i.e. after the frontend runs), and in the future potentially from typed IR as well. We currently do not have a great mechanism to support this (it is somewhat possible by constructing an object that redoes the primal analysis, but it's awkward at the very least). This PR provides a very straightforward implementation of this feature.

# Mechanism

The primary concept introduced by this PR is the `YAKC` type, defined as follows:

```
struct YAKC{A <: Tuple, R}
    env::Any
    ci::CodeInfo
end

function (y::YAKC{A, R})(args...) where {A,R}
    typeassert(args, A)
    ccall(:jl_invoke_yakc, Any, (Any, Any), y, args)::R
end
```

The dynamic semantics are that calling the yakc will run whatever code is stored in the `.ci` object, using `env` as the self argument. This is augmented by special support in inference and the optimizer to co-optimize yakcs that appear in bodies of functions along with their containing functions, in order to enable things like cross-closure DCE.

Note that argument types and return types are explicitly specified, rather than inferred. The reason for this is to prevent YAKCs from participating in inference cycles and to allow return-type information to be provided to inference without having to look at the contained CodeInfo (because the contents of the CodeInfo are not interprocedurally valid and may in general depend on what the optimizer was able to figure out about the program). This is also done with a view to an extension where the CodeInfo inside the yakc is not generated until later in the optimization pipeline. It would be possible to optionally allow return-type inference of the yakc based on the unoptimized CodeInfo (if available), but that is not currently within the scope of my planned work in this PR.

# Status

The PR has bare-bones support for yakcs, including some initial support for inlining and inference, though that support is known to be incorrect and incomplete. There are also currently no nice front-end forms. For explanatory and testing purposes, I would like to provide a macro of the form:

```
function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    @yakc ()->isa(b, Float64) ? return nothing : return a
end
```

but that is not implemented yet. At the moment, the CodeInfo needs to be spliced in manually (here we're abusing `code_lowered` to construct a CodeInfo of the appropriate form for us), e.g.:

```
julia> bar() = 1
bar (generic function with 1 method)

julia> ci = @code_lowered bar();

julia> @eval function foo1()
           f = $(Expr(:new, :(Core.YAKC{Tuple{}, Int64}), nothing, ci))
       end
foo1 (generic function with 1 method)

julia> @eval function foo2()
           f = $(Expr(:new, :(Core.YAKC{Tuple{}, Int64}), nothing, ci))
           f()
       end
foo2 (generic function with 1 method)

julia> foo1()
Core.YAKC{Tuple{},Int64}(nothing, CodeInfo(
1 ─     return 1
))

julia> foo1()()
1

julia> @code_typed foo2()
CodeInfo(
1 ─     return 1
) => Int64

julia> struct Test
           a::Int
           b::Int
       end

julia> (a::Test)() = getfield(a, 1) + getfield(a, 2)

julia> ci2 = @code_lowered Test(1, 2)();

julia> @eval function foo2()
           f = $(Expr(:new, :(Core.YAKC{Tuple{}, Int64}), (1, 2), ci2))
           f()
       end
foo2 (generic function with 1 method)

julia> @code_typed foo2()
CodeInfo(
1 ─     return 3
) => Int64
```

TODO:
- [ ] Show that this actually helps the Zygote use case I care about
- [ ] Frontend support
- [ ] Better optimizations (detection of need for recursive inlining)
- [ ] Codegen for yakcs (right now they're interpreted)
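To make the capture-list problem from the Motivation section concrete, here is a minimal sketch using an ordinary Julia closure. The names `expensive_stub` and `foo_regular` are hypothetical stand-ins for the PR's `expensive_but_effect_free` and `foo`; nothing here uses the machinery proposed in this PR. It only shows that the regular closure type keeps a field for `a`, because `a` appears syntactically in the body even though the branch returning it is never taken.

```julia
# Hypothetical stand-in for `expensive_but_effect_free()` from the PR text.
expensive_stub() = collect(1:1_000)

function foo_regular()
    a = expensive_stub()
    b = 1.0
    # `a` is referenced in the body, so lowering captures it as a field,
    # even though `b` is always a Float64 and `a` is never returned.
    () -> isa(b, Float64) ? nothing : a
end

c = foo_regular()

# The generated closure type carries one field per captured variable.
println(fieldnames(typeof(c)))

# The "expensive" value is still alive inside the closure object.
println(length(getfield(c, :a)))

# Calling the closure never touches `a`.
println(c())
```

Because the capture list is fixed when the closure type is created during lowering, the optimizer cannot drop `a` afterwards, which is the limitation the proposal is meant to lift.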
I've done the rename. I'll push it up to a new PR, so that we can have a clean place to do any review and get it merged, while keeping this history here.
This is the end result of the design process in #31253.

# Overview

This PR implements a new kind of closure, called an `opaque closure`. It is designed to be complementary to the existing closure mechanism and makes some different trade-offs. The motivation for this mechanism comes primarily from closure-based AD tools, but I'm expecting it will find other use cases as well. From the end-user perspective, opaque closures basically behave like regular closures, except that they are introduced by adding the `@opaque` macro (not part of this PR, but will be added after) in front of the existing closure. In particular, all scoping, capture, etc. rules are identical. For such user-written closures, the primary difference is in the performance characteristics. In particular:

1) Passing an opaque closure to a higher-order function will specialize on the argument and return types of the closure, but not on the closure identity. (This also means that the opaque closure will not be eligible for inlining into the higher-order function, unless the inliner can see both the definition and the call site.)
2) The optimizer is allowed to modify the capture environment of the opaque closure (e.g. dropping unused captures, or reducing `Box`ed values back to value captures).

The `opaque` part of the naming comes from the notion that semantically, nothing is supposed to inspect either the code or the capture environment of the opaque closure, since the optimizer is allowed to choose any value for these that preserves the behavior of calling the opaque closure itself.

# Motivation

## Optimization across closure boundaries

Consider the following situation (type annotations are inference results, not type asserts):

```
function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    ()->isa(b, Float64) ? return nothing : return a
end
```

Now, the traditional closure mechanism will lower this to:

```
struct ###{T, S}
    a::T
    b::S
end
(x::###{T,S})() = isa(x.b, Float64) ? return nothing : return x.a
function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    new(a, b)
end
```

The problem with this is apparent: even though, after inference, we know that `a` is unused in the closure (and thus would be able to delete the expensive call were it not for the capture), we may not delete it, simply because we need to satisfy the full capture list of the closure. Ideally, we would like to have a mechanism where the optimizer may modify the capture list of a closure in response to information it discovers.

## Closures from Cassette transforms

Compiler passes like Zygote would like to generate new closures from untyped IR (i.e. after the frontend runs), and in the future potentially from typed IR as well. We currently do not have a great mechanism to support this. This PR provides a very straightforward implementation of this feature, as opaque closures may be inserted at any point during the compilation process (unlike types, which may only be inserted by the frontend).

# Mechanism

The primary concept introduced by this PR is the `OpaqueClosure{A<:Tuple, R}` type, constructed by the new `Core._opaque_closure` builtin, which has the following signature:

```
    _opaque_closure(argt::Type{<:Tuple}, lb::Type, ub::Type, source::CodeInfo, captures...)

Create a new OpaqueClosure taking arguments specified by the types `argt`. When called,
this opaque closure will execute the source specified in `source`. The `lb` and `ub`
arguments constrain the return type of the opaque closure. In particular, any return
value of type `Core.OpaqueClosure{argt, R} where lb<:R<:ub` is semantically valid. If
the optimizer runs, it may replace `R` by the narrowest possible type inference was able
to determine. To guarantee a particular value of `R`, set lb===ub.
```

Captures are available to the CodeInfo as `getfield` from Slot 1 (referenced by position).

# Examples

I think the easiest way to understand opaque closures is to look through a few examples. These make use of the `@opaque` macro, which isn't implemented yet but makes understanding the semantics easier. Some of these examples, in currently available syntax, can be seen in test/opaque_closure.jl.

```
oc_trivial() = @opaque ()::Any->1
@show oc_trivial() # ()::Any->◌
@show oc_trivial()() # 1

oc_inf() = @opaque ()->1 # Int return type is inferred
@show oc_inf() # ()::Int->◌
@show oc_inf()() # 1

function local_call(b::Int)
    f = @opaque (a::Int)->a + b
    f(2)
end

oc_capture_opt(A) = @opaque (B::typeof(A))->ndims(A)*B
@show oc_capture_opt([1; 2]) # (::Vector{Int},)::Vector{Int}->◌
@show sizeof(oc_capture_opt([1; 2]).env) # 0
@show oc_capture_opt([1 2])([3 4]) # [6 8]
```
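For readers who want to try the semantics described above, the following is a small sketch under one assumption: a Julia release that ships the macro as `Base.Experimental.@opaque` (1.7 or later, as far as I know), which is later than the state of this PR. The helper name `make_adder` is hypothetical. The sketch contrasts the first performance point in the Overview (specialization on signature rather than closure identity) with ordinary closures, each of which gets its own type.

```julia
# Assumes a Julia version where the macro lives in Base.Experimental;
# it is not provided by this PR itself.
using Base.Experimental: @opaque

# Hypothetical helper mirroring `local_call` from the examples above:
# the opaque closure captures `b` and takes one Int argument.
make_adder(b::Int) = @opaque (a::Int) -> a + b

oc = make_adder(1)

# The result is an instance of the single OpaqueClosure type family.
println(oc isa Core.OpaqueClosure)

# Calling it behaves like calling the equivalent regular closure.
println(oc(2))

# Regular closures, by contrast, each have a distinct generated type,
# so a higher-order function specializes on closure identity.
f1 = x -> x + 1
f2 = x -> x * 2
println(typeof(f1) === typeof(f2))
```

The exact type parameters of `oc` depend on what inference determines for the return type, so the sketch only checks membership in `Core.OpaqueClosure` rather than a specific parameterization.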
This is the end result of the design process in #31253. # Overview This PR implements a new kind of closure, called an `opaque closure`. It is designed to be complimenatry to the existing closure mechanism and makes some different trade offs. The motivation for this mechanism comes primarily from closure-based AD tools, but I'm expecting it will find other use cases as well. From the end user perspective, opaque closures basically behave like regular closures, expect that they are introduced by adding the `@opaque` macro (not part of this PR, but will be added after). In front of the existing closure. In particular, all scoping, capture, etc. rules are identical. For such user written closures, the primary difference is in the performance characteristics. In particular: 1) Passing an opaque closure to a high order function will specialize on the argument and return types of the closure, but not on the closure identity. (This also means that the opaque closure will not be eligible for inlining into the higher order function, unless the inliner can see both the definition and the call site). 2) The optimizer is allowed to modify the capture environment of the opaque closure (e.g. dropping unused captures, or reducing `Box`ed values back to value captures). The `opaque` part of the naming comes from the notion that semantically, nothing is supposed to inspect either the code or the capture environment of the opaque closure, since the optimizer is allowed to choose any value for these that preserves the behavior of calling the opaque closure itself. # Motivation ## Optimization across closure boundaries Consider the following situation (type annotations are inference results, not type asserts) ``` function foo() a = expensive_but_effect_free()::Any b = something()::Float64 ()->isa(b, Float64) ? return nothing : return a end ``` now, the traditional closure mechanism will lower this to: ``` struct ###{T, S} a::T b::S end (x::###{T,S}) = isa(x.b, Float64) ? 
return nothing : return x.a function foo() a = expensive_but_effect_free()::Any b = something()::Float64 new(a, b) end ``` the problem with this is apparent: Even though (after inference), we know that `a` is unused in the closure (and thus would be able to delete the expensive call were it not for the capture), we may not delete it, simply because we need to satisfy the full capture list of the closure. Ideally, we would like to have a mechanism where the optimizer may modify the capture list of a closure in response to information it discovers. ## Closures from Casette transforms Compiler passes like Zygote would like to generate new closures from untyped IR (i.e. after the frontend runs) (and in the future potentially typed IR also). We currently do not have a great mechanism to support this. This provides a very straightforward implementation of this feature, as opaque closures may be inserted at any point during the compilation process (unlike types, which may only be inserted by the frontend). # Mechanism The primary concept introduced by this PR is the `OpaqueClosure{A<:Tuple, R}` type, constructed, by the new `Core._opaque_closure` builtin, with the following signature: ``` _opaque_closure(argt::Type{<:Tuple}, lb::Type, ub::Type, source::CodeInfo, captures...) Create a new OpaqueClosure taking arguments specified by the types `argt`. When called, this opaque closure will execute the source specified in `source`. The `lb` and `ub` arguments constrain the return type of the opaque closure. In particular, any return value of type `Core.OpaqueClosure{argt, R} where lb<:R<:ub` is semantically valid. If the optimizer runs, it may replace `R` by the narrowest possible type inference was able to determine. To guarantee a particular value of `R`, set lb===ub. ``` Captures are available to the CodeInfo as `getfield` from Slot 1 (referenced by position). # Examples I think the easiest way to understand opaque closures is look through a few examples. 
These make use of the `@opaque` macro which isn't implemented yet, but makes understanding the semantics easier. Some of these examples, in currently available syntax can be seen in test/opaque_closure.jl ``` oc_trivial() = @opaque ()::Any->1 @show oc_trivial() # ()::Any->◌ @show oc_trivial()() # 1 oc_inf() = @opaque ()->1 # Int return type is inferred @show oc_inf() # ()::Int->◌ @show oc_inf()() # 1 function local_call(b::Int) f = @opaque (a::Int)->a + b f(2) end oc_capture_opt(A) = @opaque (B::typeof(A))->ndims(A)*B @show oc_capture_opt([1; 2]) # (::Vector{Int},)::Vector{Int}->◌ @show sizeof(oc_capture_opt([1; 2]).env) # 0 @show oc_capture_opt([1 2])([3 4]) # [6 8] ```
This is the end result of the design process in #31253. # Overview This PR implements a new kind of closure, called an `opaque closure`. It is designed to be complimenatry to the existing closure mechanism and makes some different trade offs. The motivation for this mechanism comes primarily from closure-based AD tools, but I'm expecting it will find other use cases as well. From the end user perspective, opaque closures basically behave like regular closures, expect that they are introduced by adding the `@opaque` macro (not part of this PR, but will be added after). In front of the existing closure. In particular, all scoping, capture, etc. rules are identical. For such user written closures, the primary difference is in the performance characteristics. In particular: 1) Passing an opaque closure to a high order function will specialize on the argument and return types of the closure, but not on the closure identity. (This also means that the opaque closure will not be eligible for inlining into the higher order function, unless the inliner can see both the definition and the call site). 2) The optimizer is allowed to modify the capture environment of the opaque closure (e.g. dropping unused captures, or reducing `Box`ed values back to value captures). The `opaque` part of the naming comes from the notion that semantically, nothing is supposed to inspect either the code or the capture environment of the opaque closure, since the optimizer is allowed to choose any value for these that preserves the behavior of calling the opaque closure itself. # Motivation ## Optimization across closure boundaries Consider the following situation (type annotations are inference results, not type asserts) ``` function foo() a = expensive_but_effect_free()::Any b = something()::Float64 ()->isa(b, Float64) ? return nothing : return a end ``` now, the traditional closure mechanism will lower this to: ``` struct ###{T, S} a::T b::S end (x::###{T,S}) = isa(x.b, Float64) ? 
return nothing : return x.a function foo() a = expensive_but_effect_free()::Any b = something()::Float64 new(a, b) end ``` the problem with this is apparent: Even though (after inference), we know that `a` is unused in the closure (and thus would be able to delete the expensive call were it not for the capture), we may not delete it, simply because we need to satisfy the full capture list of the closure. Ideally, we would like to have a mechanism where the optimizer may modify the capture list of a closure in response to information it discovers. ## Closures from Casette transforms Compiler passes like Zygote would like to generate new closures from untyped IR (i.e. after the frontend runs) (and in the future potentially typed IR also). We currently do not have a great mechanism to support this. This provides a very straightforward implementation of this feature, as opaque closures may be inserted at any point during the compilation process (unlike types, which may only be inserted by the frontend). # Mechanism The primary concept introduced by this PR is the `OpaqueClosure{A<:Tuple, R}` type, constructed, by the new `Core._opaque_closure` builtin, with the following signature: ``` _opaque_closure(argt::Type{<:Tuple}, lb::Type, ub::Type, source::CodeInfo, captures...) Create a new OpaqueClosure taking arguments specified by the types `argt`. When called, this opaque closure will execute the source specified in `source`. The `lb` and `ub` arguments constrain the return type of the opaque closure. In particular, any return value of type `Core.OpaqueClosure{argt, R} where lb<:R<:ub` is semantically valid. If the optimizer runs, it may replace `R` by the narrowest possible type inference was able to determine. To guarantee a particular value of `R`, set lb===ub. ``` Captures are available to the CodeInfo as `getfield` from Slot 1 (referenced by position). # Examples I think the easiest way to understand opaque closures is look through a few examples. 
These make use of the `@opaque` macro which isn't implemented yet, but makes understanding the semantics easier. Some of these examples, in currently available syntax can be seen in test/opaque_closure.jl ``` oc_trivial() = @opaque ()::Any->1 @show oc_trivial() # ()::Any->◌ @show oc_trivial()() # 1 oc_inf() = @opaque ()->1 # Int return type is inferred @show oc_inf() # ()::Int->◌ @show oc_inf()() # 1 function local_call(b::Int) f = @opaque (a::Int)->a + b f(2) end oc_capture_opt(A) = @opaque (B::typeof(A))->ndims(A)*B @show oc_capture_opt([1; 2]) # (::Vector{Int},)::Vector{Int}->◌ @show sizeof(oc_capture_opt([1; 2]).env) # 0 @show oc_capture_opt([1 2])([3 4]) # [6 8] ```
This is the end result of the design process in #31253. # Overview This PR implements a new kind of closure, called an `opaque closure`. It is designed to be complimenatry to the existing closure mechanism and makes some different trade offs. The motivation for this mechanism comes primarily from closure-based AD tools, but I'm expecting it will find other use cases as well. From the end user perspective, opaque closures basically behave like regular closures, expect that they are introduced by adding the `@opaque` macro (not part of this PR, but will be added after). In front of the existing closure. In particular, all scoping, capture, etc. rules are identical. For such user written closures, the primary difference is in the performance characteristics. In particular: 1) Passing an opaque closure to a high order function will specialize on the argument and return types of the closure, but not on the closure identity. (This also means that the opaque closure will not be eligible for inlining into the higher order function, unless the inliner can see both the definition and the call site). 2) The optimizer is allowed to modify the capture environment of the opaque closure (e.g. dropping unused captures, or reducing `Box`ed values back to value captures). The `opaque` part of the naming comes from the notion that semantically, nothing is supposed to inspect either the code or the capture environment of the opaque closure, since the optimizer is allowed to choose any value for these that preserves the behavior of calling the opaque closure itself. # Motivation ## Optimization across closure boundaries Consider the following situation (type annotations are inference results, not type asserts) ``` function foo() a = expensive_but_effect_free()::Any b = something()::Float64 ()->isa(b, Float64) ? return nothing : return a end ``` now, the traditional closure mechanism will lower this to: ``` struct ###{T, S} a::T b::S end (x::###{T,S}) = isa(x.b, Float64) ? 
return nothing : return x.a function foo() a = expensive_but_effect_free()::Any b = something()::Float64 new(a, b) end ``` the problem with this is apparent: Even though (after inference), we know that `a` is unused in the closure (and thus would be able to delete the expensive call were it not for the capture), we may not delete it, simply because we need to satisfy the full capture list of the closure. Ideally, we would like to have a mechanism where the optimizer may modify the capture list of a closure in response to information it discovers. ## Closures from Casette transforms Compiler passes like Zygote would like to generate new closures from untyped IR (i.e. after the frontend runs) (and in the future potentially typed IR also). We currently do not have a great mechanism to support this. This provides a very straightforward implementation of this feature, as opaque closures may be inserted at any point during the compilation process (unlike types, which may only be inserted by the frontend). # Mechanism The primary concept introduced by this PR is the `OpaqueClosure{A<:Tuple, R}` type, constructed, by the new `Core._opaque_closure` builtin, with the following signature: ``` _opaque_closure(argt::Type{<:Tuple}, lb::Type, ub::Type, source::CodeInfo, captures...) Create a new OpaqueClosure taking arguments specified by the types `argt`. When called, this opaque closure will execute the source specified in `source`. The `lb` and `ub` arguments constrain the return type of the opaque closure. In particular, any return value of type `Core.OpaqueClosure{argt, R} where lb<:R<:ub` is semantically valid. If the optimizer runs, it may replace `R` by the narrowest possible type inference was able to determine. To guarantee a particular value of `R`, set lb===ub. ``` Captures are available to the CodeInfo as `getfield` from Slot 1 (referenced by position). # Examples I think the easiest way to understand opaque closures is look through a few examples. 
These make use of the `@opaque` macro which isn't implemented yet, but makes understanding the semantics easier. Some of these examples, in currently available syntax can be seen in test/opaque_closure.jl ``` oc_trivial() = @opaque ()::Any->1 @show oc_trivial() # ()::Any->◌ @show oc_trivial()() # 1 oc_inf() = @opaque ()->1 # Int return type is inferred @show oc_inf() # ()::Int->◌ @show oc_inf()() # 1 function local_call(b::Int) f = @opaque (a::Int)->a + b f(2) end oc_capture_opt(A) = @opaque (B::typeof(A))->ndims(A)*B @show oc_capture_opt([1; 2]) # (::Vector{Int},)::Vector{Int}->◌ @show sizeof(oc_capture_opt([1; 2]).env) # 0 @show oc_capture_opt([1 2])([3 4]) # [6 8] ```
This is the end result of the design process in #31253.

# Overview

This PR implements a new kind of closure, called an `opaque closure`. It is designed to be complementary to the existing closure mechanism and makes some different trade-offs. The motivation for this mechanism comes primarily from closure-based AD tools, but I'm expecting it will find other use cases as well.

From the end user perspective, opaque closures basically behave like regular closures, except that they are introduced by adding the `@opaque` macro (not part of this PR, but will be added after) in front of the existing closure. In particular, all scoping, capture, etc. rules are identical. For such user-written closures, the primary difference is in the performance characteristics. In particular:

1) Passing an opaque closure to a higher-order function will specialize on the argument and return types of the closure, but not on the closure identity. (This also means that the opaque closure will not be eligible for inlining into the higher-order function, unless the inliner can see both the definition and the call site).
2) The optimizer is allowed to modify the capture environment of the opaque closure (e.g. dropping unused captures, or reducing `Box`ed values back to value captures).

The `opaque` part of the naming comes from the notion that semantically, nothing is supposed to inspect either the code or the capture environment of the opaque closure, since the optimizer is allowed to choose any value for these that preserves the behavior of calling the opaque closure itself.

# Motivation

## Optimization across closure boundaries

Consider the following situation (type annotations are inference results, not type asserts)

```
function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    ()->isa(b, Float64) ? return nothing : return a
end
```

now, the traditional closure mechanism will lower this to:

```
struct ###{T, S}
    a::T
    b::S
end
(x::###{T,S})() = isa(x.b, Float64) ? return nothing : return x.a
function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    new(a, b)
end
```

the problem with this is apparent: Even though (after inference) we know that `a` is unused in the closure (and thus would be able to delete the expensive call were it not for the capture), we may not delete it, simply because we need to satisfy the full capture list of the closure. Ideally, we would like to have a mechanism where the optimizer may modify the capture list of a closure in response to information it discovers.

## Closures from Cassette transforms

Compiler passes like Zygote would like to generate new closures from untyped IR (i.e. after the frontend runs) (and in the future potentially typed IR also). We currently do not have a great mechanism to support this. This provides a very straightforward implementation of this feature, as opaque closures may be inserted at any point during the compilation process (unlike types, which may only be inserted by the frontend).

# Mechanism

The primary concept introduced by this PR is the `OpaqueClosure{A<:Tuple, R}` type, constructed by the new `Core._opaque_closure` builtin, with the following signature:

```
_opaque_closure(argt::Type{<:Tuple}, lb::Type, ub::Type, source::CodeInfo, captures...)

Create a new OpaqueClosure taking arguments specified by the types `argt`. When called, this opaque closure will execute the source specified in `source`. The `lb` and `ub` arguments constrain the return type of the opaque closure. In particular, any return value of type `Core.OpaqueClosure{argt, R} where lb<:R<:ub` is semantically valid. If the optimizer runs, it may replace `R` by the narrowest possible type inference was able to determine. To guarantee a particular value of `R`, set lb===ub.
```

Captures are available to the CodeInfo as `getfield` from Slot 1 (referenced by position).

# Examples

I think the easiest way to understand opaque closures is to look through a few examples. These make use of the `@opaque` macro which isn't implemented yet, but makes understanding the semantics easier. Some of these examples, in currently available syntax, can be seen in test/opaque_closure.jl

```
oc_trivial() = @opaque ()::Any->1
@show oc_trivial() # ()::Any->◌
@show oc_trivial()() # 1

oc_inf() = @opaque ()->1 # Int return type is inferred
@show oc_inf() # ()::Int->◌
@show oc_inf()() # 1

function local_call(b::Int)
    f = @opaque (a::Int)->a + b
    f(2)
end

oc_capture_opt(A) = @opaque (B::typeof(A))->ndims(A)*B
@show oc_capture_opt([1; 2]) # (::Vector{Int},)::Vector{Int}->◌
@show sizeof(oc_capture_opt([1; 2]).env) # 0
@show oc_capture_opt([1 2])([3 4]) # [6 8]
```
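For readers who want to try the calling semantics directly: to my knowledge, later Julia releases (1.7 and up) ship the macro as `Base.Experimental.@opaque`. The following is a minimal sketch under that assumption. `make_adder` is a name made up for illustration, and the inferred return type and printed form may differ between versions, so neither is asserted here.

```julia
# Assumption: Julia 1.7+, where the macro lives in Base.Experimental.
using Base.Experimental: @opaque

# `b` is captured from the enclosing function; `a` is the declared argument.
make_adder(b::Int) = @opaque (a::Int) -> a + b

add3 = make_adder(3)
result = add3(2)

# The argument tuple type is part of the closure's type. The return type
# parameter is left unconstrained because it depends on what inference did.
is_opaque = add3 isa Core.OpaqueClosure{Tuple{Int}}
```

Note that the type carries only the argument and return types, not the identity of the closure, which matches point 1) of the overview.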
Overview
This PR is a sketch implementation of an alternative mechanism of
implementing closures. It is designed to be complementary to the
existing closure mechanism and makes some different trade offs.
The motivation for this mechanism comes primarily from closure-based
AD tools like Zygote, but I'm expecting it will find other use cases
as well. It's a little hard to name this, because it's just another
way to implement closures. I discussed this with Jeff and options
that were considered were "Arrow Closures" or "Non-Nominal Closures",
but for now I'm just calling them "Yet Another Kind of Closure" (YAKC,
pronounced yak-c, rhymes with yahtzee). This PR is in very early
stages, so in order to explain what this is and what this does, I
may describe features that are not yet implemented. See the end of the
commit message to see the current status.
Motivation
Optimization across closure boundaries
Consider the following situation (type annotations are inference
results, not type asserts)
now, the traditional closure mechanism will lower this to:
the problem with this is apparent: Even though (after inference) we
know that `a` is unused in the closure (and thus would be able to
delete the expensive call were it not for the capture),
we may not delete it, simply because we need to satisfy the full
capture list of the closure. Ideally, we would like to have a mechanism
where the optimizer may modify the capture list of a closure in
response to information it discovers.
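As a runnable sketch of the situation described above: the two helper functions are trivial stand-ins (assumptions, not real workloads), `something_float` is used instead of `something` to avoid clashing with `Base.something`, and `FooClosure` is a hand-written approximation of the struct that lowering generates, not the compiler's actual output.

```julia
# Stand-ins (assumptions) for the functions named in the example.
expensive_but_effect_free() = sum(abs2, 1:1_000)
something_float() = 1.0

# The closure as a user would write it.
function foo()
    a = expensive_but_effect_free()
    b = something_float()
    return () -> isa(b, Float64) ? nothing : a
end

# Roughly what lowering produces: one field per captured variable,
# plus a call method that reads the captures from the struct.
struct FooClosure{T, S}
    a::T
    b::S
end
(x::FooClosure)() = isa(x.b, Float64) ? nothing : x.a

function foo_lowered()
    a = expensive_but_effect_free()  # must run: `a` is a field of the result
    b = something_float()
    return FooClosure(a, b)
end

c = foo()
captured = fieldnames(typeof(c))  # the real closure also stores both captures
```

Even though calling the closure never reads `a` when `b` is a `Float64`, the value still has to be computed in order to construct the struct. That is the capture-list constraint the text would like the optimizer to be able to relax.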
Closures from Cassette transforms
Compiler passes like Zygote would like to generate new closures
from untyped IR (i.e. after the frontend runs) (and in the future
potentially typed IR also). We currently do not have a great mechanism
to support this (it is somewhat possible by constructing an object
that redoes the primal analysis, but it's awkward at the very least).
This provides a very straightforward implementation of this feature.
Mechanism
The primary concept introduced by this PR is the `YAKC` type, defined
as follows:
The dynamic semantics are that calling the yakc will run whatever code
is stored in the `.ci` object, using `env` as the self argument. This
is augmented by special support in inference and the optimizer to
co-optimize yakcs that appear in bodies of functions along with their
containing functions in order to enable things like cross-closure DCE.
Note that argument types and return types are explicitly specified,
rather than inferred. The reason for this is to prevent YAKCs from
participating in inference cycles and to allow return-type information
to be provided to inference without having to look at the contained
CodeInfo (because the contents of the CodeInfo are not interprocedurally
valid and may in general depend on what the optimizer was able to
figure out about the program). This is also done with a view to an
extension where the CodeInfo inside the yakc is not generated until
later in the optimization pipeline. It would be possible to optionally
allow return-type inference of the yakc based on the unoptimized
CodeInfo (if available), but that is not currently within the scope
of my planned work in this PR.
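The dynamic semantics and the role of the explicit type parameters can be modelled in plain Julia. This is a toy sketch only, not the PR's definition: `ToyYAKC`, its `code` field, and `declared_return_type` are hypothetical names, and an ordinary function stands in for the CodeInfo that the real type stores in `.ci`.

```julia
# Toy model: A is the declared argument tuple type, R the declared return type.
struct ToyYAKC{A<:Tuple, R}
    env::Any   # the capture environment
    code::Any  # stand-in for the CodeInfo held in `.ci`
end

function (y::ToyYAKC{A, R})(args...) where {A, R}
    # Only the declared argument types are accepted.
    args isa A || throw(MethodError(y, args))
    # `env` plays the role of the self argument; the result is checked
    # against the declared return type.
    return y.code(y.env, args...)::R
end

# The return type can be read off the type alone, without looking at `code`.
declared_return_type(::ToyYAKC{A, R}) where {A, R} = R

add_b = ToyYAKC{Tuple{Int}, Int}((b = 3,), (env, a) -> a + env.b)
value = add_b(2)
rt = declared_return_type(add_b)
```

Because `R` is part of the type, a caller can know the return type without analysing the body, which is the property the paragraph above relies on to keep yakcs out of inference cycles.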
Status
The PR has the bare bones support for yakcs, including some initial
support for inlining and inference, though that support is known to
be incorrect and incomplete. There are also currently no nice
front-end forms. For explanatory and testing purposes, I would like to
provide a macro of the form:
but that is not implemented yet. At the moment, the codeinfo needs
to be spliced in manually (here we're abusing `code_lowered` to
construct us a CodeInfo of the appropriate form), e.g.:
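Only the `code_lowered` step can be sketched independently of this branch. With `add_one` as a made-up example function, this is how a lowered `CodeInfo` for a given signature can be obtained in stock Julia. How that object is then handed to a yakc is specific to the PR and is not shown.

```julia
# `add_one` is a hypothetical example function.
add_one(x::Int) = x + 1

# code_lowered returns one CodeInfo per matching method; take the first.
ci = first(code_lowered(add_one, Tuple{Int}))

println(typeof(ci))       # the container for lowered, untyped IR
println(length(ci.code))  # number of lowered statements
```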
TODO: