WIP: linear IR #24027

Closed
wants to merge 5 commits into from

JeffBezanson
Member

@JeffBezanson JeffBezanson commented Oct 6, 2017

In short, after this a call expression can only appear as the right-hand side of an assignment to an SSAValue (or in statement position, but that might change too). Here's what I've done so far:

  • Turn on very-linear-mode (the easy part!)
  • In codevalidation.jl, implement much stricter new rules for where expressions can appear.
  • Make sure julia-syntax.scm follows those rules, including not allowing slot = Expr(:call, ...).
  • Remove the loathsome typ field from Expr 🎉 , instead using the type of a call's SSAValue.
  • Update optimization passes to preserve the linear structure.
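The rule stated at the top of this description (a call may only be the right-hand side of an assignment to an SSAValue, or appear in statement position) can be sketched as a small check. This is only an illustration: `is_linear_stmt` and its helpers are hypothetical names, the actual rules in codevalidation.jl may differ, and expression heads other than `:call` are simply not checked here.

```julia
using Core: SSAValue, SlotNumber

is_call(x) = x isa Expr && x.head === :call

# In linear form, the callee and arguments of a call are never nested expressions.
args_are_atoms(ex::Expr) = !any(a -> a isa Expr, ex.args)

# Hypothetical checker for a single statement (assumption: only calls are validated).
function is_linear_stmt(stmt)
    if is_call(stmt)
        # A call in statement position is (for now) allowed.
        return args_are_atoms(stmt)
    elseif stmt isa Expr && stmt.head === :(=)
        lhs, rhs = stmt.args
        if is_call(rhs)
            # Only `SSAValue = call` is accepted; `slot = call` is rejected.
            return lhs isa SSAValue && args_are_atoms(rhs)
        end
    end
    return true  # other forms are not checked in this sketch
end

ok     = Expr(:(=), SSAValue(10), Expr(:call, GlobalRef(Base, :slt_int), 0, SlotNumber(2)))
bad    = Expr(:(=), SlotNumber(2), Expr(:call, :f, :x))
nested = Expr(:(=), SSAValue(1), Expr(:call, :f, Expr(:call, :g, :x)))

is_linear_stmt(ok)      # accepted
is_linear_stmt(bad)     # rejected: slot = call
is_linear_stmt(nested)  # rejected: nested call argument
```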

Still to do:

  • cglobal is the only thing that causes validation failures, due to needing to see its argument symbol/tuple. Need to decide how to handle this. We can either make it a special form, or have it look inside a constant jl_cgval_t argument.
  • Make sure all optimizations are still working.
  • Update code_warntype.

The new IR is of course larger (sysimg +20%), but undeniably beautiful. Inlining is already much simpler and other passes will benefit in the future as well. I think we'll be able to make up the difference with new optimizations and clever encoding. For example, we could avoid inserting source location push/pop when inlining trivial functions. Here's a typical excerpt:

        # meta: location strings/string.jl endof 202
        # meta: location strings/string.jl sizeof 62
        SSAValue(7) = Core.sizeof
        SSAValue(8) = (SSAValue(7))(s)
        # meta: pop location
        i = SSAValue(8)
        #= line 203 =#
        8: 
        # meta: location operators.jl > 249
        # meta: location int.jl < 39
        SSAValue(10) = (Base.slt_int)(0, i)
        # meta: pop location
        # meta: pop location

We could also add a special encoding for assignment to an SSAValue.

@JeffBezanson added the compiler:codegen (Generation of LLVM IR and native code) and compiler:inference (Type inference) labels Oct 6, 2017
@yuyichao
Contributor

yuyichao commented Oct 6, 2017

What's the new way of getting the rhs type of

slot = call

?

@JeffBezanson
Member Author

slot = call is not allowed.

@yuyichao
Contributor

yuyichao commented Oct 6, 2017

Ah, I missed that. This level of SSAValue usage is going to hurt #23240 really badly... A lot of the optimizations there require looking at the expression that is assigned to the slot, and looking through multiple assignment is really hard with the current AST format.

From my experiment at #23240 as well as some additional thought recently, I feel like our final goal should be using a purely SSA-based IR similar to what LLVM uses. As incremental steps, it seems that the form in #23240 (though preferably in frontend...) is relatively easy to analyse while not having to put everything in SSA. Getting any further with SSA values without removing slots altogether seems to make optimization much harder, so I would prefer to do this at the same time as introducing a phi node. It will still require looking through phi nodes, but they carry information about control flow with input that is much easier to analyse, which is not the case for slots...

@yuyichao
Contributor

yuyichao commented Oct 6, 2017

Also note that phi nodes can easily be lowered back to slots without losing any information, so that can be done after optimization and other parts of the system that are not ready for it don't have to deal with it yet. My next target after #23240 was going to be a rewrite of it in a similar fashion, but doing a transformation to BBs with phi nodes in order to explore control flow information.
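As a rough illustration of lowering phi nodes back to slots: each phi gets a fresh slot, every predecessor block writes its incoming value to that slot just before its terminator, and the phi itself becomes a plain read of the slot. Everything here is an assumption made for the sketch: `Phi`, `Block`, `lower_phis!` and the `phi_slot_N` names are hypothetical, and slots are represented as plain Symbols rather than Julia's actual slot types.

```julia
using Core: SSAValue

# Hypothetical stand-ins for a basic-block IR; not Julia's actual data structures.
struct Phi
    dest::Int                # id of the SSAValue defined by the phi
    incoming::Dict{Int,Any}  # predecessor block index => incoming value
end

struct Block
    phis::Vector{Phi}
    body::Vector{Any}  # ordinary statements
    term::Any          # branch/return, kept separate so copies land before it
end

function lower_phis!(blocks::Vector{Block})
    nslots = 0
    for blk in blocks
        reads = Any[]
        for phi in blk.phis
            nslots += 1
            slot = Symbol("phi_slot_", nslots)  # one fresh slot per phi
            for (pred, val) in phi.incoming
                # Write the incoming value at the end of the predecessor's body.
                push!(blocks[pred].body, Expr(:(=), slot, val))
            end
            # The phi becomes a read of the slot at the top of this block.
            push!(reads, Expr(:(=), SSAValue(phi.dest), slot))
        end
        prepend!(blk.body, reads)
        empty!(blk.phis)
    end
    return blocks
end

# A diamond: blocks 2 and 3 both flow into block 4, which merges their values.
blocks = [
    Block(Phi[], Any[], Expr(:gotoifnot, :c, 3)),
    Block(Phi[], Any[Expr(:(=), SSAValue(1), Expr(:call, :f))], Expr(:goto, 4)),
    Block(Phi[], Any[Expr(:(=), SSAValue(2), Expr(:call, :g))], nothing),
    Block([Phi(3, Dict{Int,Any}(2 => SSAValue(1), 3 => SSAValue(2)))], Any[],
          Expr(:return, SSAValue(3))),
]
lower_phis!(blocks)
```

Because the copies read SSA values rather than other slots, and each phi has its own slot, this naive placement does not need the parallel-copy handling that register-based phi elimination usually requires.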

@JeffBezanson
Member Author

looking through multiple assignment is really hard with the current AST format

Why? You can look up the definition of an SSAValue in an array.
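A minimal sketch of that lookup, assuming a flat list of statements and 0-based SSAValue ids as in the dumps in this thread; `ssa_defs` and `definition` are hypothetical helper names:

```julia
using Core: SSAValue

# Build a table mapping each SSAValue id to the right-hand side that defines it.
# Assumption: ids are 0-based, hence the `+ 1` when indexing.
function ssa_defs(stmts, nssa)
    defs = Any[nothing for _ in 1:nssa]
    for stmt in stmts
        if stmt isa Expr && stmt.head === :(=) && stmt.args[1] isa SSAValue
            defs[stmt.args[1].id + 1] = stmt.args[2]
        end
    end
    return defs
end

definition(defs, v::SSAValue) = defs[v.id + 1]

stmts = Any[
    Expr(:(=), SSAValue(0), GlobalRef(Core, :sizeof)),
    Expr(:(=), SSAValue(1), Expr(:call, SSAValue(0), :s)),
]
defs = ssa_defs(stmts, 2)
definition(defs, SSAValue(1))  # the call expression assigned to SSAValue(1)
```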

the form in #23240 (though preferably in frontend...) is relatively easy to analyse while not having to put everything in SSA

What rules would you like?

If possible, it would be nice just to change the official IR format to that needed by #23240. I strongly suspect this PR can be made to implement that.

@yuyichao
Contributor

yuyichao commented Oct 7, 2017

Why? You can look up the definition of an SSAValue in an array.

The direct assignment gives two important pieces of information.

  1. There's a single use of the rhs
  2. There's nothing in between the assignment and the evaluation of the rhs.
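For comparison, here is roughly what recovering those two facts looks like once the call has been split out into an SSAValue. This is a sketch under assumptions: 0-based SSAValue ids, a flat statement list, and hypothetical helper names (`use_counts`, `is_direct_slot_assign`); it is not the logic used in #23240.

```julia
using Core: SSAValue, SlotNumber

# Count references to each SSAValue, skipping the LHS of assignments,
# since that is a definition rather than a use.
function count_uses!(counts, x)
    if x isa SSAValue
        counts[x.id + 1] += 1
    elseif x isa Expr
        start = x.head === :(=) ? 2 : 1
        for i in start:length(x.args)
            count_uses!(counts, x.args[i])
        end
    end
    return counts
end

function use_counts(stmts, nssa)
    counts = zeros(Int, nssa)
    foreach(s -> count_uses!(counts, s), stmts)
    return counts
end

# True when statement `i` is `SSAValue = rhs`, statement `i + 1` is
# `slot = that SSAValue` (nothing in between), and the value has no other use.
function is_direct_slot_assign(stmts, counts, i)
    i < length(stmts) || return false
    def, use = stmts[i], stmts[i + 1]
    (def isa Expr && def.head === :(=) && def.args[1] isa SSAValue) || return false
    (use isa Expr && use.head === :(=) && use.args[1] isa SlotNumber) || return false
    v = def.args[1]
    return use.args[2] === v && counts[v.id + 1] == 1
end

single = Any[
    Expr(:(=), SSAValue(0), Expr(:call, :f, :x)),
    Expr(:(=), SlotNumber(2), SSAValue(0)),
]
shared = Any[
    single...,
    Expr(:(=), SSAValue(1), Expr(:call, :g, SSAValue(0))),
]

is_direct_slot_assign(single, use_counts(single, 1), 1)  # single use, adjacent
is_direct_slot_assign(shared, use_counts(shared, 2), 1)  # SSAValue(0) is used twice
```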

Of course these are all computable when it's put in an SSAValue first, but it adds a lot of checks. It interferes with some logic that's very specific to the solution used in #23240.

  1. Currently the invalid uses are kept in the list and only cleared out when I'm optimizing for that value, so that I don't need to constantly scan through the use/def list.

    This is why I want to avoid looking through more than just a single value every time.

  2. The current code keeps track of which values need to be rescanned.

    Looking at multiple values also makes this harder.

I'd like to get rid of both of these pieces of logic in the next version, and I think using a linked list like LLVM should be able to handle 1 easily. Having 1 removed and being able to look at multiple values at the same time should also make 2 easier (so that I can scan deeper and see which value is affected). If the scan of use/def becomes much simpler than what I have right now, the whole rescan table might not even be needed anymore.

What rules would you like?

Allow the rhs of slot assignment to be any expression. (So hold off the second commit until a better optimization pass is ready.) On a related note, I feel like a good representation for optimization/type inference would just have the ssavalue, the type and the rhs be stored together, since the optimization frequently needs to go from one to another (ssa->type, ssa->rhs, rhs(instruction)->ssa). One can argue whether the type goes with the ssavalue or the rhs at that point, but it does seem like a representation that's a superset of what Expr has, so I don't think we have to get rid of the typ field in Expr that urgently.

@JeffBezanson
Member Author

Allow the rhs of slot assignment to be any expression.

Ok, I think I can allow that. I believe that currently, an assignment LHS is never a TypedSlot, so we could use one of those to store the type of the RHS.

have the ssavalue, the type and the rhs be stored together

That would be fine with me --- the current representation of

Expr
  head: Symbol =
  args: Array{Any}((2,))
    1: SSAValue
      id: Int64 0
    2: Expr
      head: Symbol call
      args: Array{Any}((2,))
        1: Symbol f
        2: Symbol x

is pretty inefficient anyway. It could be something like

SSADef
  id: Int64 0
  typ: Any
  rhs: Expr
    head: Symbol call
    args: Array{Any}((2,))
      1: Symbol f
      2: Symbol x
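Written out as a type, the second dump above might look like the following. `SSADef` is hypothetical (nothing by this name exists in Base); the field names follow the dump, and the converting constructor is an addition for illustration only.

```julia
# Hypothetical flat record combining the SSA id, its type, and its definition.
struct SSADef
    id::Int
    typ::Any
    rhs::Any
end

# Convert the nested `SSAValue = rhs` assignment form into the flat record.
# The `typ` argument defaults to `Any`, i.e. "not inferred yet" in this sketch.
function SSADef(assign::Expr, typ=Any)
    assign.head === :(=) || throw(ArgumentError("expected an assignment"))
    lhs, rhs = assign.args
    lhs isa Core.SSAValue || throw(ArgumentError("LHS must be an SSAValue"))
    return SSADef(lhs.id, typ, rhs)
end

d = SSADef(Expr(:(=), Core.SSAValue(0), Expr(:call, :f, :x)), Int)
dump(d)
```

Keeping `id`, `typ` and `rhs` in one record makes the ssa->type and ssa->rhs lookups a single field access once the records are stored in an array indexed by id.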

@yuyichao
Contributor

yuyichao commented Oct 7, 2017

It could be something like

Yes, exactly.

an assignment LHS is never a TypedSlot, so we could use one of those to store the type of the RHS.

Yeah. That works too.

@JeffBezanson
Member Author

Closing temporarily. I'll put up a more incremental change first, but please keep this branch.

@JeffBezanson JeffBezanson mentioned this pull request Oct 13, 2017
@JeffBezanson JeffBezanson deleted the jb/linear-ir branch June 13, 2018 18:03