RFC: Add an alternative implementation of closures #31253

Closed

@Keno Keno commented Mar 5, 2019

Overview

This PR is a sketch implementation of an alternative mechanism of
implementing closures. It is designed to be complementary to the
existing closure mechanism and makes some different trade-offs.
The motivation for this mechanism comes primarily from closure-based
AD tools like Zygote, but I'm expecting it will find other use cases
as well. It's a little hard to name this, because it's just another
way to implement closures. I discussed this with Jeff and options
that were considered were "Arrow Closures" or "Non-Nominal Closures",
but for now I'm just calling them "Yet Another Kind of Closure" (YAKC,
pronounced yak-c, rhymes with yahtzee). This PR is in very early
stages, so in order to explain what this is and what this does, I
may describe features that are not yet implemented. See the end of the
commit message to see the current status.

Motivation

Optimization across closure boundaries

Consider the following situation (type annotations are inference
results, not type asserts):

function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    ()->!isa(b, Float64) ? return a : return nothing
end

Now, the traditional closure mechanism will lower this to:

struct ###{T, S}
   a::T
   b::S
end
(x::###{T,S})() where {T,S} = !isa(x.b, Float64) ? return x.a : return nothing
function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    new(a, b)
end

The problem with this is apparent: even though, after inference,
we know that a is unused in the closure (and thus would be
able to delete the expensive call were it not for the capture),
we may not delete it, simply because we need to satisfy the full
capture list of the closure. Ideally, we would like to have a mechanism
where the optimizer may modify the capture list of a closure in
response to information it discovers.
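
The capture behavior described above can be observed with today's ordinary closures. The following is a minimal sketch, assuming hypothetical stand-ins `expensive_but_effect_free` and `something_float` (neither is from this PR; the second is named to avoid clashing with `Base.something`). It only shows that the generated closure type keeps a field for `a`, even though the call never reads it.

```julia
# Hypothetical stand-ins for the functions named in the example above.
expensive_but_effect_free() = sum(sqrt, 1:1000)
something_float() = 1.0

function foo()
    a = expensive_but_effect_free()
    b = something_float()
    # `a` is referenced syntactically, so lowering captures it.
    return () -> !isa(b, Float64) ? a : nothing
end

f = foo()

# An ordinary closure is an instance of a generated struct
# with one field per captured variable.
captures = fieldnames(typeof(f))
println(captures)
println(f())
```

Because `a` is a field of the closure object, the call that computes it has to stay, which is the limitation this section is describing.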

Closures from Cassette transforms

Compiler passes like Zygote would like to generate new closures
from untyped IR (i.e. after the frontend runs) (and in the future
potentially typed IR also). We currently do not have a great mechanism
to support this (it is somewhat possible by constructing an object
that redoes the primal analysis, but it's awkward at the very least).
This provides a very straightforward implementation of this feature.

Mechanism

The primary concept introduced by this PR is the YAKC type, defined
as follows:

struct YAKC{A <: Tuple, R}
    env::Any
    ci::CodeInfo
end

function (y::YAKC{A, R})(args...) where {A,R}
    typeassert(args, A)
    ccall(:jl_invoke_yakc, Any, (Any, Any), y, args)::R
end

The dynamic semantics are that calling the yakc will run whatever code
is stored in the .ci object, using env as the self argument. This
is augmented by special support in inference and the optimizer to
co-optimize yakcs that appear in bodies of functions along with their
containing functions in order to enable things like cross-closure DCE.

Note that argument types and return types are explicitly specified,
rather than inferred. The reason for this is to prevent YAKCs from
participating in inference cycles and to allow return-type information
to be provided to inference without having to look at the contained
CodeInfo (because the contents of the CodeInfo are not interprocedurally
valid and may in general depend on what the optimizer was able to
figure out about the program). This is also done with a view to an
extension where the CodeInfo inside the yakc is not generated until
later in the optimization pipeline. It would be possible to optionally
allow return-type inference of the yakc based on the unoptimized
CodeInfo (if available), but that is not currently within the scope
of my planned work in this PR.
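
To make the calling convention concrete, here is a plain-Julia emulation. It is only a sketch under stated assumptions: `ToyYAKC` is a hypothetical type, and an ordinary function called as `code(env, args...)` stands in for the stored `CodeInfo`, so none of the inference or optimizer support described above is modelled.

```julia
# Hypothetical emulation of the dynamic semantics only.
struct ToyYAKC{A <: Tuple, R}
    env::Any
    code::Function   # stand-in for the CodeInfo; receives env as its first argument
end

function (y::ToyYAKC{A, R})(args...) where {A, R}
    typeassert(args, A)            # argument types are declared, not inferred
    y.code(y.env, args...)::R      # the return type is declared as well
end

# A closure over the environment (10,) that takes one Int and returns an Int.
add_env = ToyYAKC{Tuple{Int}, Int}((10,), (env, x) -> env[1] + x)

println(add_env(5))
```

Calling `add_env(5.0)` throws a `TypeError` from the `typeassert`, since `(5.0,)` is not a `Tuple{Int}`.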

Status

The PR has bare-bones support for yakcs, including some initial
support for inlining and inference, though that support is known to
be incorrect and incomplete. There are also currently no nice
front-end forms. For explanatory and testing purposes, I would like to
provide a macro of the form:

function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    @yakc ()->!isa(b, Float64) ? return a : return nothing
end

but that is not implemented yet. At the moment, the CodeInfo needs
to be spliced in manually (here we're abusing code_lowered to construct
us a CodeInfo of the appropriate form), e.g.:

julia> bar() = 1
bar (generic function with 1 method)

julia> ci = @code_lowered bar();

julia> @eval function foo1()
          f = $(Expr(:new, :(Core.YAKC{Tuple{}, Int64}), nothing, ci))
       end
foo1 (generic function with 1 method)

julia> @eval function foo2()
          f = $(Expr(:new, :(Core.YAKC{Tuple{}, Int64}), nothing, ci))
          f()
       end
foo2 (generic function with 1 method)

julia> foo1()
Core.YAKC{Tuple{},Int64}(nothing, CodeInfo(
1 ─     return 1
))

julia> foo1()()
1

julia> @code_typed foo2()
CodeInfo(
1 ─     return 1
) => Int64

julia> struct Test
           a::Int
           b::Int
       end

julia> (a::Test)() = getfield(a, 1) + getfield(a, 2)

julia> ci2 = @code_lowered Test(1, 2)();

julia> @eval function foo2()
          f = $(Expr(:new, :(Core.YAKC{Tuple{}, Int64}), (1, 2), ci2))
          f()
       end
foo2 (generic function with 1 method)

julia> @code_typed foo2()
CodeInfo(
1 ─     return 3
) => Int64

TODO:

  • Show that this actually helps the Zygote use case I care about
  • Frontend support
  • Better optimizations (detection of need for recursive inlining)
  • Codegen for yakcs (right now they're interpreted)

@Keno Keno requested review from vtjnash, JeffBezanson and jrevels March 5, 2019 03:36
@Keno
Member Author

Keno commented Mar 6, 2019

I've tried this out with Zygote and there are a lot of things I like about it. The biggest problem I see at this point is that (as I found out) in its current form it doesn't work too well to solve the nested AD problem (for nested AD, you'd generally want to co-optimize, and in this proposal there isn't a great place to stash that CodeInfo). I think it is quite feasible to modify this to split out an object that encapsulates the set of functions that need to be co-optimized (jl_yakc_herd_t?), which would then be the natural place to cache the transformed versions (and e.g. for the second order derivative, you'd co-optimize all the various CodeInfos up to second order). For our own reference, I'm attaching the state of the whiteboard at the end of the discussion about this:

[Image: whiteboard photo, img_20190306_152711]

I'll need to let this stew for a little while longer.

@Keno
Member Author

Keno commented Sep 14, 2020

> in its current form doesn't work too well to solve the nested AD problem

The nested AD problem is resolved by switching the representation.

I'm planning to revive this branch, so if there are comments, let me know.

@Keno
Member Author

Keno commented Sep 17, 2020

Capturing some discussion with @vtjnash and @JeffBezanson:

The question we were discussing is what to do with inferred return types for YAKC.
Right now, the way I have it set up is that YAKC has type parameters YAKC{A,R}
for the YAKC's argument and return types and the optimizer is allowed to adjust the
R parameter based on inference results. Inference dependent program behavior
(like return_type) is a general cause for concern, so we discussed various options.
In the end we arrived at the following:

  1. Rather than punning on Expr(:new), there will be a dedicated yakc constructor yakc(env, lb, ub, ci) where the implementation (interpreter, optimizer, etc.) is allowed to choose the return type between lb and ub according to whatever rules it likes.
  2. YAKC will have a custom lattice element that has full provenance information and can use the CodeInfo to do inference at the proper call site. It will not use the type bounds for this.
  3. During the same inference session, we may not lift the inference result into the type domain. This is because the optimizer is what chooses the ultimate representation and optimization is logically (and depending on when the choice happens, actually), sequenced after inference, so inference may not try to use such information.
  4. It is nevertheless ok to specialize on the type parameter afterwards if the yakc escapes from its optimization context (but inference may no longer look at the CodeInfo, since it lost provenance information at that point).

@Keno Keno changed the title Very WIP: RFC: Add an alternative implementation of closures WIP: RFC: Add an alternative implementation of closures Sep 21, 2020
@Keno Keno force-pushed the kf/yakc branch 3 times, most recently from 97da538 to 3083fcb Compare September 22, 2020 00:20
@Keno
Member Author

Keno commented Sep 24, 2020

One remaining question is what to call this thing. Some options:

  • delayed closure
  • opaque closure
  • thunk
  • optic closure
  • dynamic closure
  • arrow closure
  • nnclosure
  • Just keep yakc

@oxinabox
Contributor

opaque closure

I think this one.

I guess users would create these via @opaque ()->x+y, or similar?


A two-birds-with-one-stone option might be to also make them disallow rebinding variables, so as to make boxing impossible.
Then this could take FastClosures.jl's @closure macro, since it would probably take over all uses of FastClosures.jl.

@StefanKarpinski
Member

Calling it a closure when it captures variables by value would be rather perverse: the defining characteristic of a closure is that it closes over bindings. It should be called @nonclosure in that case.

@Keno
Member Author

Keno commented Sep 29, 2020

Capture vs non-capture is a frontend decision. I was planning to use the same lowering, meaning that bindings are captured by references and Boxes are introduced as they are now. The difference is that the optimizer would be allowed to optimize it back to a value capture if it can prove the dynamic semantics are unnecessary (since the opaqueness of the environment guarantees that nobody else is allowed to look at the box).
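
For reference, the current lowering mentioned here can be inspected on ordinary closures. The sketch below uses hypothetical helper names (`make_counter`, `make_adder`). It shows that a captured variable that is rebound is stored in a `Core.Box`, while one that is never rebound is stored by value. It does not demonstrate the proposed optimization itself.

```julia
function make_counter()
    n = 0
    inc = () -> (n += 1)   # `n` is rebound inside the closure, so it gets boxed
    return inc
end

make_adder(k) = x -> x + k  # `k` is never rebound, so it is captured by value

inc = make_counter()
add2 = make_adder(2)

println(fieldtype(typeof(inc), :n))    # Core.Box
println(fieldtype(typeof(add2), :k))   # Int64 on 64-bit platforms
```

With an opaque environment, the optimizer would be free to turn the first form into the second whenever it can prove that no one else observes the box.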

@Keno Keno changed the title WIP: RFC: Add an alternative implementation of closures RFC: Add an alternative implementation of closures Sep 30, 2020
@Keno
Member Author

Keno commented Sep 30, 2020

Alright, please take a look at test/yakc.jl and see if everyone likes the semantics. Frontend support is still missing, but @JeffBezanson said he'd do that for me, and since I don't need it for my use cases, I'll leave it out of this PR for the moment.

Obviously the implementation is primitive right now, lacks some of the corner cases, and most optimizations. However, it does pass the test/yakc.jl test, which I think covers (basic versions of) all the major different features I want it to have, so I'm reasonably confident that this is reasonable, so I'd like to get this on the path to get merged, and then iterate on it on master, as I try to use it for my actual application.

So do we like opaque closure then? If so, I'll do that rename tomorrow.

@JeffreySarnoff
Contributor

Naming this YAKC would help the reader avoid misunderstanding by clarifying their unknowing. Another approach to helping the reader would name this opaque closure type OpaqueClosure.

@StefanKarpinski
Member

Opaque closure seems good to me.

@tpapp
Contributor

tpapp commented Sep 30, 2020

DarkClosure would be shorter, and kind of cool.

@StefanKarpinski
Member

Or delayed closure; less cool but more accurate, imo.

@JeffreySarnoff
Contributor

delayed closure is better, good suggestion
for the cool kids: now delay is performant

@oxinabox
Contributor

oxinabox commented Sep 30, 2020

Delayed Closure feels like it is exposing an implementation detail.

The opaqueness is the user facing difference.
Isn't it?

@JeffreySarnoff
Contributor

everyone is correct -- what to do, what to do :)

@JeffBezanson
Member

JeffBezanson commented Sep 30, 2020

I like OpaqueClosure as well; the type hides what is called (and is intended to hide the captured variables to the extent possible), which is a key difference with our current closures.

@JeffreySarnoff
Contributor

great!

@Keno
Member Author

Keno commented Sep 30, 2020

Alright, I feel like "opaque closures" have it then.

@@ -773,6 +773,24 @@ function argtype_tail(argtypes::Vector{Any}, i::Int)
return argtypes[i:n]
end

function _yakc_tfunc(@nospecialize(arg), @nospecialize(lb), @nospecialize(ub),
Member

Move to tfuncs.jl?

# Give calls to YAKCs zero cost. The plan is for these to be a single
# indirect call so have very little cost. On the other hand, there
# is enormous benefit to inlining these into a function where we can
# see the definition of the YAKC. Perhaps this should even be negative
Member

Do you want to encourage inlining a yakc into its caller, or the function containing the call into its caller? (perhaps both?)

Member Author

Both, e.g. consider:

call(f::YAKC) = f()
bar() = call(_yakc(...))

We would like to elide the allocation of the yakc and just inline whatever code the yakc would have had right into the caller, but in order to be able to do that, we need to see both the call and the definition of the yakc, so a high inline penalty for calling a yakc would be counterproductive.

elseif isa(calltype.ci, Method)
m = calltype.ci
stmt.args[5] = m
stmt.args[3] = stmt.args[4] = widenconst(tmerge(lb, m.unspecialized.cache.rettype))
Member

Is m.unspecialized guaranteed to be defined here? (Maybe I just need to keep reading)

@@ -25,6 +26,8 @@ function inflate_ir(ci::CodeInfo, sptypes::Vector{Any}, argtypes::Vector{Any})
elseif isa(stmt, Expr) && stmt.head === :enter
stmt.args[1] = block_for_inst(cfg, stmt.args[1])
code[i] = stmt
elseif isa(stmt, Expr) && stmt.head == :new
code[i] = stmt
Member

?

Member Author

rebase artifact

src/datatype.c Outdated
@@ -930,8 +930,21 @@ static void init_struct_tail(jl_datatype_t *type, jl_value_t *jv, size_t na)
JL_DLLEXPORT jl_value_t *jl_new_structv(jl_datatype_t *type, jl_value_t **args, uint32_t na)
{
jl_ptls_t ptls = jl_get_ptls_states();
if (!jl_is_datatype(type) || type->layout == NULL)
if (!jl_is_datatype(type) || type->layout == NULL) {
// As a special case we're allowed to have unionalls over yakc,
Member

Can we deal with this elsewhere? It would be better if new only has to worry about allocating the type passed to it, instead of also determining what type to allocate.

Member Author

Yes, sorry. This is outdated and can be removed.

@Keno
Member Author

Keno commented Oct 2, 2020

I've done the rename. I'll push it up to a new PR, so that we can have a clean place to do any review and get it merged, while keeping this history here.

@Keno Keno closed this Oct 2, 2020
Keno added a commit that referenced this pull request Oct 2, 2020
This is the end result of the design process in #31253.

 # Overview

This PR implements a new kind of closure, called an `opaque closure`.
It is designed to be complementary to the existing closure mechanism
and makes some different trade-offs. The motivation for this mechanism
comes primarily from closure-based AD tools, but I'm expecting it will
find other use cases as well. From the end user perspective, opaque
closures basically behave like regular closures, except that they
are introduced by adding the `@opaque` macro (not part of this PR,
but will be added after) in front of the existing closure. In
particular, all scoping, capture, etc. rules are identical. For
such user written closures, the primary difference is in the
performance characteristics. In particular:

1) Passing an opaque closure to a higher-order function will specialize
   on the argument and return types of the closure, but not on the
   closure identity. (This also means that the opaque closure will not
   be eligible for inlining into the higher order function, unless the
   inliner can see both the definition and the call site).
2) The optimizer is allowed to modify the capture environment of the
   opaque closure (e.g. dropping unused captures, or reducing `Box`ed
   values back to value captures).
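
For contrast with point 1, the sketch below shows the behavior of regular closures today: every closure has its own type, so a higher-order function (here a hypothetical `apply`) is specialized separately for each closure it is called with. Per the description above, two opaque closures with the same argument and return types would instead share one `OpaqueClosure{A, R}` type. That part is not exercised here.

```julia
apply(f, x) = f(x)

inc1 = x -> x + 1
inc2 = x -> x + 2

# Each regular closure is an instance of its own generated type,
# so `apply` gets a separate specialization for inc1 and for inc2.
println(typeof(inc1) === typeof(inc2))
println(apply(inc1, 1), " ", apply(inc2, 1))
```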

The `opaque` part of the naming comes from the notion that semantically,
nothing is supposed to inspect either the code or the capture environment
of the opaque closure, since the optimizer is allowed to choose any value
for these that preserves the behavior of calling the opaque closure itself.

 # Motivation
 ## Optimization across closure boundaries

Consider the following situation (type annotations are inference
results, not type asserts):
```
function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    ()->isa(b, Float64) ? return nothing : return a
end
```

Now, the traditional closure mechanism will lower this to:

```
struct ###{T, S}
   a::T
   b::S
end
(x::###{T,S})() where {T,S} = isa(x.b, Float64) ? return nothing : return x.a
function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    new(a, b)
end
```

The problem with this is apparent: even though (after inference)
we know that `a` is unused in the closure (and thus would be
able to delete the expensive call were it not for the capture),
we may not delete it, simply because we need to satisfy the full
capture list of the closure. Ideally, we would like to have a mechanism
where the optimizer may modify the capture list of a closure in
response to information it discovers.

 ## Closures from Cassette transforms

Compiler passes like Zygote would like to generate new closures
from untyped IR (i.e. after the frontend runs) (and in the future
potentially typed IR also). We currently do not have a great mechanism
to support this. This provides a very straightforward implementation
of this feature, as opaque closures may be inserted at any point during
the compilation process (unlike types, which may only be inserted
by the frontend).

 # Mechanism

The primary concept introduced by this PR is the `OpaqueClosure{A<:Tuple, R}`
type, constructed by the new `Core._opaque_closure` builtin, with
the following signature:

```
    _opaque_closure(argt::Type{<:Tuple}, lb::Type, ub::Type, source::CodeInfo, captures...)

Create a new OpaqueClosure taking arguments specified by the types `argt`. When called,
this opaque closure will execute the source specified in `source`. The `lb` and `ub`
arguments constrain the return type of the opaque closure. In particular, any return
value of type `Core.OpaqueClosure{argt, R} where lb<:R<:ub` is semantically valid. If
the optimizer runs, it may replace `R` by the narrowest possible type inference
was able to determine. To guarantee a particular value of `R`, set lb===ub.
```

Captures are available to the CodeInfo as `getfield` from Slot 1
(referenced by position).
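
The `lb`/`ub` rule can be read as a simple clamping of the inferred return type. The helper below, `choose_rettype`, is hypothetical and not part of this PR. It is one possible reading of "any `R` with `lb<:R<:ub` is semantically valid", and the fallback to `ub` when the inferred type lies outside the bounds is an assumption made for this sketch.

```julia
# Hypothetical helper: pick the `R` parameter of OpaqueClosure{argt, R}.
function choose_rettype(lb::Type, ub::Type, inferred::Type)
    lb <: ub || throw(ArgumentError("need lb <: ub"))
    # Use the inferred type only if it lies within the permitted bounds;
    # otherwise fall back to the widest permitted choice (assumption).
    return (lb <: inferred && inferred <: ub) ? inferred : ub
end

println(choose_rettype(Union{}, Any, Int))  # the optimizer may narrow R to Int
println(choose_rettype(Int, Int, Int))      # lb === ub pins R to Int
println(choose_rettype(Number, Any, Int))   # Int is below lb, so this sketch keeps Any
```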

 # Examples

I think the easiest way to understand opaque closures is to look through
a few examples. These make use of the `@opaque` macro, which isn't
implemented yet, but makes understanding the semantics easier.
Some of these examples, in currently available syntax, can be seen
in test/opaque_closure.jl.

```
oc_trivial() = @opaque ()::Any->1
@show oc_trivial() # ()::Any->◌
@show oc_trivial()() # 1

oc_inf() = @opaque ()->1
 # Int return type is inferred
@show oc_inf() # ()::Int->◌
@show oc_inf()() # 1

function local_call(b::Int)
    f = @opaque (a::Int)->a + b
    f(2)
end

oc_capture_opt(A) = @opaque (B::typeof(A))->ndims(A)*B
@show oc_capture_opt([1; 2]) # (::Vector{Int},)::Vector{Int}->◌
@show sizeof(oc_capture_opt([1; 2]).env) # 0
@show oc_capture_opt([1 2])([3 4]) # [6 8]
```
@Keno Keno mentioned this pull request Oct 2, 2020
Keno added a commit that referenced this pull request Oct 13, 2020
JeffBezanson pushed a commit that referenced this pull request Oct 14, 2020
This is the end result of the design process in #31253.

 # Overview

This PR implements a new kind of closure, called an `opaque closure`.
It is designed to be complementary to the existing closure mechanism
and makes some different trade-offs. The motivation for this mechanism
comes primarily from closure-based AD tools, but I'm expecting it will
find other use cases as well. From the end user perspective, opaque
closures basically behave like regular closures, except that they
are introduced by adding the `@opaque` macro (not part of this PR,
but will be added after) in front of the existing closure. In
particular, all scoping, capture, etc. rules are identical. For
such user-written closures, the primary difference is in the
performance characteristics. In particular:

1) Passing an opaque closure to a higher-order function will specialize
   on the argument and return types of the closure, but not on the
   closure identity. (This also means that the opaque closure will not
   be eligible for inlining into the higher order function, unless the
   inliner can see both the definition and the call site).
2) The optimizer is allowed to modify the capture environment of the
   opaque closure (e.g. dropping unused captures, or reducing `Box`ed
   values back to value captures).
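
For contrast with point 1, here is a minimal sketch of how regular closures behave today. It uses only plain Julia; `apply_twice` and the two closure names are made up for illustration. Each closure expression gets its own type, so a higher-order function is specialized once per closure rather than once per signature.

```julia
# Two syntactically identical closures still get two distinct types.
add_one_a = x -> x + 1
add_one_b = x -> x + 1

# Hypothetical higher-order function, used only for this illustration.
apply_twice(f, x) = f(f(x))

# The types differ, so `apply_twice` is specialized separately for each closure.
distinct_types = typeof(add_one_a) != typeof(add_one_b)

result_a = apply_twice(add_one_a, 1)
result_b = apply_twice(add_one_b, 1)
```

With opaque closures, the intent described above is that both calls would share one specialization keyed on the argument and return types.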

The `opaque` part of the naming comes from the notion that semantically,
nothing is supposed to inspect either the code or the capture environment
of the opaque closure, since the optimizer is allowed to choose any value
for these that preserves the behavior of calling the opaque closure itself.

 # Motivation
 ## Optimization across closure boundaries

Consider the following situation (type annotations are inference
results, not type asserts)
```
function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    ()->isa(b, Float64) ? return nothing : return a
end
```

Now, the traditional closure mechanism will lower this to:

```
struct ###{T, S}
   a::T
   b::S
end
(x::###{T,S}) = isa(x.b, Float64) ? return nothing : return x.a
function foo()
    a = expensive_but_effect_free()::Any
    b = something()::Float64
    new(a, b)
end
```

The problem with this is apparent: Even though (after inference),
we know that `a` is unused in the closure (and thus would be
able to delete the expensive call were it not for the capture),
we may not delete it, simply because we need to satisfy the full
capture list of the closure. Ideally, we would like to have a mechanism
where the optimizer may modify the capture list of a closure in
response to information it discovers.
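
The retained capture can be observed directly with a regular closure. The following is a small sketch in plain Julia, with `rand(3)` and `1.0` standing in for `expensive_but_effect_free()` and `something()` (both stand-ins are assumptions made for the example):

```julia
function make_closure()
    a = rand(3)      # stand-in for `expensive_but_effect_free()`
    b = 1.0          # stand-in for `something()::Float64`
    return () -> isa(b, Float64) ? nothing : a
end

c = make_closure()

# Both variables are stored as fields of the closure object, even though the
# branch that returns `a` can never be taken when `b` is a Float64.
captured = fieldnames(typeof(c))
```

Because `a` is a field of the closure type, the call that produced it has to stay, which is exactly the restriction described above.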

 ## Closures from Cassette transforms

Compiler passes like Zygote would like to generate new closures
from untyped IR (i.e. after the frontend runs), and in the future
potentially from typed IR as well. We currently do not have a great
mechanism to support this. Opaque closures provide a very straightforward
implementation of this feature, as they may be inserted at any point during
the compilation process (unlike types, which may only be inserted
by the frontend).

 # Mechanism

The primary concept introduced by this PR is the `OpaqueClosure{A<:Tuple, R}`
type, constructed by the new `Core._opaque_closure` builtin, with
the following signature:

```
    _opaque_closure(argt::Type{<:Tuple}, lb::Type, ub::Type, source::CodeInfo, captures...)

Create a new OpaqueClosure taking arguments specified by the types `argt`. When called,
this opaque closure will execute the source specified in `source`. The `lb` and `ub`
arguments constrain the return type of the opaque closure. In particular, any return
value of type `Core.OpaqueClosure{argt, R} where lb<:R<:ub` is semantically valid. If
the optimizer runs, it may replace `R` by the narrowest possible type inference
was able to determine. To guarantee a particular value of `R`, set lb===ub.
```

Captures are available to the CodeInfo as `getfield` from Slot 1
(referenced by position).
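
As a rough illustration of the `lb`/`ub` bounds in the signature above, the condition `lb<:R<:ub` can be written as an ordinary subtype check. The helper `admissible_rt` is hypothetical and not part of this PR; it only spells out which values of `R` the bounds would allow.

```julia
# Hypothetical helper (not part of the PR): is `R` an admissible return-type
# parameter for the given bounds?
admissible_rt(R::Type, lb::Type, ub::Type) = lb <: R <: ub

loose  = admissible_rt(Int, Union{}, Any)   # widest bounds: any R is allowed
pinned = admissible_rt(Int, Int, Int)       # lb === ub pins R to exactly Int
wrong  = admissible_rt(Float64, Int, Int)   # Float64 lies outside the pinned bounds
```

Passing `Union{}` and `Any` leaves the optimizer free to pick whatever `R` inference finds, while equal bounds remove that freedom.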

 # Examples

I think the easiest way to understand opaque closures is to look through
a few examples. These make use of the `@opaque` macro, which isn't
implemented yet, but makes understanding the semantics easier.
Some of these examples, in currently available syntax, can be seen
in test/opaque_closure.jl.

```
oc_trivial() = @opaque ()::Any->1
@show oc_trivial() # ()::Any->◌
@show oc_trivial()() # 1

oc_inf() = @opaque ()->1
 # Int return type is inferred
@show oc_inf() # ()::Int->◌
@show oc_inf()() # 1

function local_call(b::Int)
    f = @opaque (a::Int)->a + b
    f(2)
end

oc_capture_opt(A) = @opaque (B::typeof(A))->ndims(A)*B
@show oc_capture_opt([1; 2]) # (::Vector{Int},)::Vector{Int}->◌
@show sizeof(oc_capture_opt([1; 2]).env) # 0
@show oc_capture_opt([1 2])([3 4]) # [6 8]
```
Keno added a commit that referenced this pull request Dec 17, 2020
This is the end result of the design process in #31253.

Keno added a commit that referenced this pull request Dec 25, 2020
This is the end result of the design process in #31253.

@DilumAluthge DilumAluthge deleted the kf/yakc branch March 25, 2021 21:57