Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pretty printing of expression applications with infix operators #1259

Merged
merged 14 commits into from
Aug 11, 2017

Conversation

IwanKaramazow
Copy link
Contributor

@IwanKaramazow IwanKaramazow commented May 12, 2017

Improve printing of chained |> (#1066, #1045)

Result

let x = foo |> z;

let x =
  foo
  |> f
  |> g;

let x =
  foo
  |> somelongfunctionname "foo"
  |> anotherlongfunctionname "bar" 1
  |> somelongfunction
  |> bazasdasdad;

let code =
  JSCodegen.Code.(
    create
    |> lines
         Requires.(
           create
           |> import_type local::"Set" source::"Set"
           |> import_type local::"Map" source::"Map"
           |> import_type local::"Immutable" source::"immutable"
           |> require local::"invariant" source::"invariant"
           |> require local::"Image" source::"Image.react"
           |> side_effect source::"monkey_patches"
           |> render_lines
         )
    |> new_line
    |> new_line
    |> new_line
    |> new_line
    |> render
  );

let code = JSCodegen.Code.(create |> render); 

Overview

/* (1) */
let x = foo |> f; 

/* (2) */
let x = 
  foo
  |> f
  |> g;

We have two cases here: the first shouldn't break, the second should always break.
Note that |> results in some sort of recursive structure 'going deeper' on the 'left side' of the printed result.
The printing is some sort of left fold.

let x =  foo |> f |> g;

/* is going to be parsed as: */
((foo |> f) |> g)

This is important because at a certain moment in the unparsing recursion,
we're going to see foo |> f. This is exactly the same as the first case!
If we just recursively print everything where the last one is treated as the first case,
we would get:

let x =
  foo |> f
  |> g;

How do we make a difference between the two cases?

(* list containing all infix operators with the 'left-fold' printing behaviour *)
let leftRecInfixOperators = ["|>"]

While unparsing an expression, check if a printedIdent is a member of the above list.
If so, we check if the left child of the application arguments contains another
application with the same printedIdent (isLeftInfixChain)

Is (foo |> f) another Pexp_apply with a |> in ((foo |> f) |> g) ?
Yes, it is.

If so mark the unparsing with the parameter ~infixChain.
~infixChain indicates we're chaining an application with an identifier ("|>" here).
Based on the value of ~infixChain, we know we always have to break.
The innermost (foo |> f) of ((foo |> f) |> g), will break and result in the following:

let x =
  foo
  |> f
  |> g;

/* we won't get 
let x =
  foo |> f
  |> g;
*/

Future work

This PR works as advertised for |>, but it would be nice if we could standardize
printing for all infix operators. >>= & co has the same problem, but prints as a right fold instead
of a left fold. (>>= has some too-much-parens problems too).
Can we drop the leftRecInfixOperators in the future? Is there a way to detect those?
Note that we can't just take all infix operators with left associativity here, because of e.g. "1 + 2 + 3"
Printing all left associative infix application as |> would result in:

let x =
  1
  + 2
  + 3

With |>, it always breaks as soon as there's more than one |> chained.
In the case of +, we want break IfNeed style.

Requires.(
create
|> import_type
local::"Set" source::"Set"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do these labeled arguments break onto a new line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of the printing width of the tests.
See Example section in my PR description for the default printing width.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see - thank you for the explanation. The discrepancy between the PR description and the expected output is what threw me.

@jordwalke
Copy link
Member

/* is going to be printed as: */

I think you mean "parsed" as?

@jordwalke
Copy link
Member

What did you mean by this?

"Note that we can't just take all infix operators with left associativity here, because of e.g. "1 + 2 + 3"

By the way, this is a very good explanation of the problem/challenges.

method ensureExpression expr ~reducesOnToken =
match self#unparseExprRecurse expr with
method ensureExpression ?(infixChain=false) ~reducesOnToken expr =
match self#unparseExprRecurse ~infixChain expr with
| SpecificInfixPrecedence ({reducePrecedence; shiftPrecedence}, leftRecurse) ->
Copy link
Member

@jordwalke jordwalke May 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would likely be easier to implement this change, as well as all the other customizations you mentioned, if the result of unparsing returned an intermediate representation of the concrete AST. That is, it would return a tree where all the precedence has already been resolved. Imagine if you had something like this:

InfixApplicationConcrete (
  "|>",
  InfixApplicationConcrete ( "|>", SimpleConcrete itm),
  SimpleConcrete itm
)

Where printing is trivial, and all parenthesis/precedence has already been handled.
Then you can easily do things like

computeConsecutiveLeftInfixChains (concreteAst)

which would return a convenient list for you to format in a specific way.

This division separates all the technical details of precedence and correctness from the opinionated styling of formatting.

Maybe the current FunctionApplication/Simple/SpecificInfixPrecedence could become that Concrete AST I mentioned? It's pretty close to it.

Thoughts?

I think this is a separation we want to move towards anyways - and we could use this diff as a step in that direction.

Copy link
Contributor Author

@IwanKaramazow IwanKaramazow May 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very good idea. An intermediate representation could solve all the nuances of printing different infix applications. ~infixChain felt very shaky from the start, I just couldn't think of something better at the moment 😄
I'll go ahead and implement it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! I think we could/should start extending that pattern through the rest of the printer. Imagine having a completely separate stage to manage all the stylistic/opinionated sections - separate from a stage that determines correct grouping according to precedence. It will be much easier for people to contribute fixes to stylistic formatting.

Copy link
Member

@jordwalke jordwalke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment

@IwanKaramazow
Copy link
Contributor Author

"Note that we can't just take all infix operators with left associativity here, because of e.g. "1 + 2 + 3"

I thought of printing all left associative infix application as |>, but clearly this won't with + for example.

let x =
  1
  + 2
  + 3

With |>, it always breaks as soon as there's more than one |> chained.
In the case of +, we want break IfNeed style.

@jordwalke
Copy link
Member

For typical use cases of |> would it be bad to format on an IfNeed basis? I genuinely don't know.

I don't mind + wrapping like this:

let x =
  1
  + 2
  + 3

as long as it's on an IfNeed basis.

Also, I believe that >>= is actually left associative right? If so, then it is one case where we don't want the infix to dock to the left, despite being left associative.

@IwanKaramazow
Copy link
Contributor Author

Should I change it to an IfNeed basis?

let uri = req |> Request.uri |> Uri.to_string;
let meth = req |> Request.meth |> Code.string_of_method;
let headers = req |> Request.headers |> Header.to_string;

In the above it absolutely makes sense, but when I'm writing

let nextState =
    state
    |> updateFlappy
    |> updateTime;

I want it to break. Hmm, food for thought.

@jordwalke
Copy link
Member

I don't feel strongly. @bordoley, our resident |> connoisseur may have opinions.

@bordoley
Copy link

@IwanKaramazow why do you want line breaks in your nextState case?

Looking through my own code, I think the IfNeed basis would accurately layout most code the same way I would by hand.

@IwanKaramazow
Copy link
Contributor Author

IwanKaramazow commented May 14, 2017

In the nextState case it's just for readability, and there's a big chance I might extend it. It would eventually break. Would you put it on one line? The IfNeed basis might be harder since the Auto breaking of labels here originally didn't work & spawned the issue 😄
Back to the drawing board...

@jordwalke
Copy link
Member

jordwalke commented May 14, 2017

The original issue here which was caused by IfNeed was due to the fact that they were multiple nested labels. I had originally had something kind of like the "concrete parse tree" representation that we just discussed, but I removed it in favor of this implementation (labels) to first focus on correctness, with the intent of eventually evolving it back into a kind of "concrete parse tree".
If you had such a concrete parse tree, it's very easy to then take it and construct lists of consecutive identical infix identifiers, and then format them using list IfNeed instead of label IfNeed which should eliminate many of the original problems.

@chenglou
Copy link
Member

@IwanKaramazow ready? =D

@IwanKaramazow
Copy link
Contributor Author

It works, I just need to rename some functions so it makes more sense. I can do that this weekend.
@jordwalke should I wait & do this on top of Fred's new diff?

@chenglou
Copy link
Member

chenglou commented Jun 29, 2017

I think this should get in before the breaking change, since current users can benefit. Heck this change + some if formatting change almost warrant their own release!

@kyldvs
Copy link

kyldvs commented Jul 6, 2017

Would love to see this get merged, this is my biggest issue with the formatter at the moment :)

@jordwalke
Copy link
Member

I want to make sure @let-def is aware of this one.

Copy link
Member

@jordwalke jordwalke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The output looks good, and I think we just need to clear up the purpose of the two kinds of infix in the concrete syntax data structure (intermediate representation).

}
) ^ "yo";
call "hi"
^ (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this diff ends up fixing a bunch of other little weird issues too.

x + (
something.contents = y
); /* Because of the #NotActuallyAConflict above */
x + (something.contents = y); /* Because of the #NotActuallyAConflict above */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like this one.

@@ -125,6 +125,7 @@ and ruleCategory =
that it's easier just to always wrap them in parens. *)
| PotentiallyLowPrecedence of layoutNode
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@let-def Not urgent, but in case you stumble upon this: this ruleCategory type is the closest thing to an intermediate representation that we already have to work with. You could decide to evolve it to be more sophisticated, or start fresh eventually.

@@ -125,6 +125,7 @@ and ruleCategory =
that it's easier just to always wrap them in parens. *)
| PotentiallyLowPrecedence of layoutNode
| Simple of layoutNode
| InfixApplicationConcrete of ruleInfoData * string * ruleCategory * ruleCategory
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed in chat, that this would likely take the place of SpecificInfixPrecedence. Do you still believe this is the case? Is still seems there's a large overlap in their purposes.

@@ -579,9 +584,21 @@ let special_infix_strings =
let updateToken = "="
let requireIndentFor = [updateToken; ":="]

(*
* list containing all infix operators which exhibit a 'left-fold' printing behaviour
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment may be out of date, is it?

* function to determine if the left child of an expression's application arguments
* is another Pexp_apply with the given ident
*)
let isLeftInfixChain ident = function
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may no longer be used, is it?

@@ -3364,39 +3437,60 @@ class printer ()= object(self:'self)
token "+", in `x + a * b` should not reduce until after the a * b gets
a chance to reduce. This function would determine the minimum parens to
ensure that. *)
method ensureContainingRule ~withPrecedence ~reducesAfterRight =
method ensureContainingRule ~withPrecedence ~reducesAfterRight () =
Copy link
Member

@jordwalke jordwalke Jul 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Off topic:

Is this just applying the best practice of always including the final ()?

I do find it a best practice because it allows you to later add optional arguments, without updating the call sites. In fact, I find it such a good idea that I think Reason should add them by default! Could you imagine?!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it has become something I always write. It makes it so much easier when revisiting/extending the code at a later point. No point in making life harder.

Copy link
Member

@jordwalke jordwalke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more comment.

match self#unparseExprRecurse reducesAfterRight with
| (SpecificInfixPrecedence ({reducePrecedence; shiftPrecedence}, rightRecurse)) ->
if higherPrecedenceThan shiftPrecedence withPrecedence then rightRecurse
if higherPrecedenceThan shiftPrecedence withPrecedence then begin
Simple rightRecurse
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a misnomer to wrap in Simple. This function ensureContainingRule ensures that reducesAfterRight will reduce before the containing rule with withPrecedence reduces. I realize that's pretty confusing and I have to read that sentence several times to make sense of it.

I think a way I could have wrir way to write the function definition to make it more clear would have been:

method ensureContainingRule ~withPrecedence:precedence ~reducesAfterRight:rightExpr () =

But the point of the function is to ensure that rightExpr will reduce at the proper time when it is reparsed, possibly wrapping it in parenthesis if needed. But that doesn't mean it is necessarily simple. For example, in 4 + 5*3, 5*3 will reduce before the + operation, yet 5*3 is not "simple". Simple means it is clearly one token (such as (anything) or [anything] or identifier).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll revisit this tomorrow.

@chenglou
Copy link
Member

This got out of sync since we've re-merged back reason-parser. Would be great to have it working again! Number one request for formatting right now =)

@IwanKaramazow
Copy link
Contributor Author

Rebased on master, now continuing

@chenglou
Copy link
Member

This should be something we get in before the potential function syntax refactor. cc @let-def just in case

@TheSpyder
Copy link

What ended up happening with infix operators like >>=? I can't quite follow what happens to them in the change list.

While I agree it can't be "all left associative infix operators", any infix operator that's more than one character is probably worth applying this to.

@IwanKaramazow
Copy link
Contributor Author

@TheSpyder
>>= should be printed a lot better. Check the result of some classic cohttp code:

let server = {
  let callback _conn req body => {
    let uri =
      req
      |> Request.uri
      |> Uri.to_string
      |> Code.string_of_uri
      |> Server.respond
      |> Request.uri;
    let meth =
      req
      |> Request.meth
      |> Code.string_of_method;
    let headers =
      req |> Request.headers |> Header.to_string;
    body
    |> Cohttp_lwt_body.to_string
    >|= (
          fun body =>
            Printf.sprintf
              "okokok" uri meth headers body
        )
    >>= (
          fun body =>
            Server.respond_string
              ::status ::body ()
        )
  };
  Server.create
    ::mode (Server.make ::callback ())
};

@hcarty
Copy link
Contributor

hcarty commented Aug 5, 2017

The parens are unfortunate but it sounds like they're pretty difficult to remove.

@IwanKaramazow This is so much better that the current output - thank you!

@TheSpyder
Copy link

TheSpyder commented Aug 10, 2017

Looking very good so far.

This is getting a bit off topic for the pipe operators, so I'll log a new task if you'd prefer, but in the case of infix operators could we avoid the indent inside parens? Or perhaps just when the brackets contain a single function?

e.g.

>|= (fun body =>
      Printf.sprintf
        "okokok" uri meth headers body
    )

@IwanKaramazow IwanKaramazow changed the title Always break chained pipe infix operators Pretty printing of expression applications with infix operators Aug 10, 2017
@IwanKaramazow
Copy link
Contributor Author

@TheSpyder I addressed the weird indentation. 😁

req |> Request.headers |> Header.to_string;
body
|> Cohttp_lwt_body.to_string
>|= (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does your latest commit fix the fun body being on a new line?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just asking, I think this PR is amazing and we should merge it before tackling on other fun problems <3

Copy link
Contributor Author

@IwanKaramazow IwanKaramazow Aug 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests have a ridiculously short print-width of 50.
If you have something like 80 you'll get:

body
|> Cohttp_lwt_body.to_string
>|= (fun body => Printf.sprintf "okokok" uri meth headers body)
>>= (fun body => Server.respond_string ::status ::body ())

What has been fixed is the closing ) indenting on the same height as the beginning of >|= and >>=. Since it's a label now the fun body will also indent more to the left if it needs to break :)

@@ -42,7 +42,7 @@ let print =
let print_width =
let docv = "COLS" in
let doc = "wrapping width for printing the AST" in
Arg.(value & opt (int) (100) & info ["w"; "print-width"] ~docv ~doc)
Arg.(value & opt (int) (90) & info ["w"; "print-width"] ~docv ~doc)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this 90 now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reason_pprint_ast.ml contains defaultSettings 90.
Not sure why I automatically changed it here to 90 lol.
I can change it back, your call.

@chenglou
Copy link
Member

Soooo, which one should we merge first, @let-def's function diff, or this one? =P

@hcarty
Copy link
Contributor

hcarty commented Aug 11, 2017

@chenglou This one! Fixes some significant paper cuts before opening up a world of fun from overhauling the entire function and constructor syntax. It will also (hopefully) keep the conversation around the big @let-def function change focused on just that change.

@IwanKaramazow
Copy link
Contributor Author

IwanKaramazow commented Aug 11, 2017

@chenglou just checked, this PR & the js-syntax-branch shouldn't cause any crazy merging conflicts. Had to touch some parts of the code let-def didn't touch ⛷
So it wouldn't matter which one gets in first.

@chenglou
Copy link
Member

Alright I've thought about this. We're gonna do right by the community and make sure you don't feel conned into using the function application PR just for better pipes lol.

So I'll merge this PR, make a release, before we get crazy with the function diff. cc @let-def, I'll re-rebase your diff on top of this one. This way everyone's happy.

As always, thanks @IwanKaramazow!

@let-def
Copy link
Contributor

let-def commented Aug 11, 2017

Thanks @chenglou!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants