wip: compiler-plugin: prototype builtin Cassette.jl-like overdubbing mechanism #41632

Draft
wants to merge 1 commit into
base: master

aviatesk
Member

@aviatesk aviatesk commented Jul 18, 2021

This is the initial prototyping of the compiler-plugin.

The proposed compiler-plugin infrastructure will consist of 3 main parts:

  1. Cassette.jl-like code overdubbing mechanism
  2. code transformation hooks
  3. AbstractInterpreter delegation hook

A compiler plugin will customize compiler behavior via the mechanisms
provided by parts 2 and 3, i.e.:

  • code transformation hooks allow a plugin to transform code at various
    stages of the compilation pipeline, from pre-inference to post-optimization
  • the AbstractInterpreter delegation hook will enable further specialized
    compilation based on a customized abstract interpretation

This PR doesn't implement part 3 at all yet, because part 1 by itself
turned out to be really tricky and challenging, so I'd like to get
feedback at this point.

The whole purpose of part 1 is to separate a plugin's code execution
context from that of native execution and from other plugins' contexts.
A plugin should be able to transform sources of arbitrary methods and
execute arbitrary Julia code under its own semantics, but its codegen
should not influence the code cache of those original methods
(I will refer to each of them as an "overdubbed method"). To enable the
separation, code generated by a plugin will be associated with a dedicated
method (I will refer to this method as the "shadow method") so that its
cache is kept separate from that of the overdubbed method.

The main difference between Cassette and this PR is that Cassette uses a
@generated function while this PR doesn't. In other words, this PR tries
to implement Cassette without the main issue of @generated, limited
inference: a @generated function can only be inferred when the call
signature is fully concrete, which means Cassette.jl's code transformation
can't be applied well statically in real-world programs, which can
contain various forms of type instability.
Instead of using @generated, this PR implements the overdubbing mechanism
directly within inference, so that overdubbing is applied once inference
enters a plugin's context, regardless of how successful inference is.
Essentially, we make InferenceState manage plugin contexts, and then the
overdubbing mechanism is decoupled from the dispatch system that
@generated relies on.
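
To illustrate the limitation described above, here is a minimal sketch in plain Julia. `describe` and `unstable` are hypothetical names made up for this example and are not part of Cassette.jl or this PR. The generator only sees argument types, so at a type-unstable call site the compiler has nothing concrete to hand to it during inference.

```julia
# Toy @generated function: the generator body runs at compile time and is
# given only the argument *types*, not the values.
@generated function describe(x)
    # Here `x` is bound to the type of the argument.
    msg = string("specialized for ", x)
    return :($msg)
end

describe(1.0)   # "specialized for Float64"
describe("a")   # "specialized for String"

# A type-unstable call site: the element type is `Any`, so all the compiler
# knows while inferring the call below is `x::Any`. The calls still work at
# run time through dynamic dispatch on the concrete types.
unstable(xs::Vector{Any}) = [describe(x) for x in xs]

unstable(Any[1.0, "a"])
```

The calls in `unstable` still produce the right results at run time, but only after dynamic dispatch has found a concrete signature for each element.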

Here is how parts 1 and 2 work under the current implementation:

  1. (part 1) when the first argument is a subtype of AbstractCompilerPlugin,
    InferenceState will compose a new plugin context and subsequent
    inference will be done in that context
  2. (part 1) when in a plugin context, the compiler performs a Cassette-like
    code transformation before inference so that the retrieved source
    of the original method fits the signature of the shadow method
    and all the calls within the source are overdubbed with the
    current plugin context
  3. (part 1) when a call is overdubbed, inference will first find
    matching methods for the original signature, and then it also forms
    the equivalent method matches for the shadow method
  4. (part 1) when inferring an overdubbed call edge, inference will
    retrieve the sources of the overdubbed matching methods, but the
    generated code will be associated with the shadow matching methods
  5. (part 2) when in a plugin context, we apply the hooks provided by each
    plugin at each "interesting" point of the compilation pipeline
This PR comes with significant changes and is still very much WIP.
I really need feedback on this approach since I'm still not sure what
the best way to implement the overdubbing mechanism is. Any suggestions
for solving the remaining issues or for cleaner implementations would be
very welcome.

Here are the main remaining TODOs, but there are a bunch of other TODOs
that I left as comments...

  • overdubbing support
    • source retrieval from an isva method
    • overdubbing on Core._apply_iterate
    • overdubbing on opaque closures
    • make the plugin entry accept arguments (rather than just accepting a nullary lambda)
  • support backedges
    • add backedge from overdubbed method instance to shadow method instance;
      I tried to add the support, but couldn't make it work
    • add backedge from code transformers to shadow method instance:
      I simply didn't add this support since overdubbing is the primary
      issue at this moment
  • support non-singleton plugin context

function _is_shadow(def::Method, @nospecialize(spec_types))
    def === SHADOW_METHOD || return false
    ft = unwrap_unionall(spec_types).parameters[1]
    return is_plugin_ft(ft)
end
Member

This seems an odd place to split the method, as attempting dispatch (aka subtyping) here to determine whether to use the plugin would be unsound; it must use type-intersection (aka methods()).
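
The distinction between subtyping and type-intersection can be seen with Julia's built-in `<:` and `typeintersect`. This is only a sketch of the general point: `ToyPlugin` and `MyPlugin` are hypothetical types made up for illustration and are not part of this PR.

```julia
abstract type ToyPlugin end        # hypothetical plugin supertype
struct MyPlugin <: ToyPlugin end

# A callee whose type is only known as `Any` during inference:
ft = Any

ft <: ToyPlugin                    # false: a subtype test says "not a plugin"
typeintersect(ft, ToyPlugin)       # ToyPlugin: it may still be one at run time

# A callee type that can never be a plugin has an empty intersection.
typeintersect(typeof(sin), ToyPlugin) === Union{}   # true
```

A check based on `<:` alone would therefore miss the first case, where the callee might turn out to be a plugin once its concrete type is known.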

Member Author

The actual "method matching" is done within find_matching_methods, which, when in a plugin context, first finds matching methods for the original type signature and then forcibly forms the corresponding method matches against the shadow method, under the assumption that the shadow method signature (::PluginContext)(args...) can always match arbitrary signatures.

This is_shadow is only a query to check whether an already introduced MethodInstance or MethodMatch comes from the shadow method and was formed in a plugin context. The only purpose of is_plugin_ft is to exclude the case where Union{}(args...) matches the shadow method when not in a plugin context.
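
The `Union{}` case mentioned above comes from the fact that the bottom type is a subtype of every type. Here is a small sketch of that fact and of the kind of guard it calls for. `ToyPlugin`, `MyPlugin` and `is_toy_plugin_ft` are hypothetical names for illustration, and the actual `is_plugin_ft` in this PR may be written differently.

```julia
abstract type ToyPlugin end        # hypothetical plugin supertype
struct MyPlugin <: ToyPlugin end

# `Union{}` is the bottom type, so it is a subtype of every type,
# including any plugin supertype.
Union{} <: ToyPlugin               # true

# Hypothetical guard: a plain subtype test plus an explicit bottom check.
function is_toy_plugin_ft(@nospecialize(ft))
    ft isa Type || return false
    ft === Union{} && return false
    return ft <: ToyPlugin
end

is_toy_plugin_ft(MyPlugin)         # true
is_toy_plugin_ft(Union{})          # false
is_toy_plugin_ft(typeof(sin))      # false
```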

@tkf tkf requested review from vtjnash and removed request for vtjnash July 23, 2021 20:51
@aviatesk aviatesk force-pushed the avi/plugin8 branch 7 times, most recently from ec62832 to fa45a6d Compare August 23, 2021 17:50
…nism

This is an initial prototyping of [the compiler-plugin](https://hackmd.io/bVhb97Q4QTWeBQw8Rq4IFw?view).

The proposed compiler-plugin infrastructure will consist of 3 main parts:
1. [Cassette.jl](https://github.com/JuliaLabs/Cassette.jl)-like code overdubbing mechanism
2. code transformation hooks
3. `AbstractInterpreter` delegation hook

A compiler plugin will customize compiler behavior via the mechanisms
provided by parts 2 and 3, i.e.:
- code transformation hooks allow a plugin to transform code at many
  stages of the compilation pipeline, from pre-inference to post-optimization
- the `AbstractInterpreter` delegation hook will enable further customization,
  e.g. tweaking inference heuristics, changing optimization passes, etc.

This PR doesn't implement part 3 at all yet, because part 1 by itself
turned out to be really tricky and challenging, so I'd like to get
feedback at this point.

The whole purpose of part 1 is to separate a plugin's code execution
context from that of native execution and from other plugins' contexts.
A plugin should be able to transform sources of arbitrary methods and
execute arbitrary Julia code under its own semantics, but its codegen
should not influence the code cache of those original methods
(I will refer to each of them as an "overdubbed method"). To enable the
separation, code generated by a plugin will be associated with a dedicated
method (I will refer to this method as the "shadow method") so that its
cache is kept separate from that of the overdubbed method.

The main difference between Cassette and this PR is that Cassette uses a
`@generated` function while this PR doesn't. In other words, this PR tries
to implement Cassette without the main issue of `@generated`, _**limited
inference**_: a `@generated` function can only be inferred when the call
signature is fully concrete, which means Cassette.jl's code transformation
can't be applied well statically in real-world programs, which can
contain various forms of type instability.
Instead of using `@generated`, this PR implements the overdubbing mechanism
directly within inference, so that overdubbing is applied once inference
enters a plugin's context, regardless of how successful inference is.
Essentially, we make `InferenceState` manage plugin contexts, and then the
overdubbing mechanism is decoupled from the dispatch system that
`@generated` relies on.

Here is how parts 1 and 2 work under the current implementation:
1. (part 1) when the first argument is a subtype of `AbstractCompilerPlugin`,
   `InferenceState` will _compose_ a new plugin context and subsequent
   inference will be done in that context
2. (part 1) when in a plugin context, the compiler performs a Cassette-like
   code transformation _before inference_ so that the retrieved source
   of the original method fits the signature of the shadow method
   and all the calls within the source are overdubbed with the
   current plugin context
3. (part 1) when a call is overdubbed, inference will first find
   matching methods for the original signature, and then it also forms
   the equivalent method matches for the shadow method
4. (part 1) when inferring an overdubbed call edge, inference will
   retrieve the sources of the overdubbed matching methods, but the
   generated code will be associated with the shadow matching methods
5. (part 2) when in a plugin context, we apply the hooks provided by each
   plugin at each "interesting" point of the compilation pipeline

This PR comes with significant changes and is still very much WIP.
I really need feedback on this approach since I'm still not sure what
the best way to implement the overdubbing mechanism is. Any suggestions
for solving the remaining issues or for cleaner implementations would be
very welcome.

Here are the main remaining TODOs, but there are a bunch of other TODOs
that I left as comments ...
- [ ] overdubbing support
  * [ ] source retrieval from an `isva` method
  * [ ] overdubbing on `Core._apply_iterate`
  * [ ] overdubbing on opaque closures
- [ ] support backedges
  * [ ] add backedge from overdubbed method instance to shadow method instance;
        I tried to add the support, but couldn't make it work
  * [ ] add backedge from code transformers to shadow method instance:
        I simply didn't add this support since overdubbing is the primary
        issue at this moment
- [ ] support non-singleton plugin context