Pretty printing of expression applications with infix operators #1259

IwanKaramazow · 2017-05-12T17:10:17Z

Improve printing of chained |> (#1066, #1045)

Result

let x = foo |> z;

let x =
  foo
  |> f
  |> g;

let x =
  foo
  |> somelongfunctionname "foo"
  |> anotherlongfunctionname "bar" 1
  |> somelongfunction
  |> bazasdasdad;

let code =
  JSCodegen.Code.(
    create
    |> lines
         Requires.(
           create
           |> import_type local::"Set" source::"Set"
           |> import_type local::"Map" source::"Map"
           |> import_type local::"Immutable" source::"immutable"
           |> require local::"invariant" source::"invariant"
           |> require local::"Image" source::"Image.react"
           |> side_effect source::"monkey_patches"
           |> render_lines
         )
    |> new_line
    |> new_line
    |> new_line
    |> new_line
    |> render
  );

let code = JSCodegen.Code.(create |> render);

Overview

/* (1) */
let x = foo |> f; 

/* (2) */
let x = 
  foo
  |> f
  |> g;

We have two cases here: the first shouldn't break, the second should always break.
Note that |> is left-associative, so a chain parses into a structure that nests deeper on the left side.
Printing it is essentially a left fold.

let x = foo |> f |> g;

/* is going to be parsed as: */
((foo |> f) |> g)

This is important because at a certain moment in the unparsing recursion,
we're going to see foo |> f. This is exactly the same as the first case!
If we just recursively print everything where the last one is treated as the first case,
we would get:

let x =
  foo |> f
  |> g;

How do we distinguish between the two cases?

(* list containing all infix operators with the 'left-fold' printing behaviour *)
let leftRecInfixOperators = ["|>"]

While unparsing an expression, check whether a printedIdent is a member of the above list.
If so, check whether the left child of the application arguments contains another
application with the same printedIdent (isLeftInfixChain).

Is (foo |> f) another Pexp_apply with a |> in ((foo |> f) |> g) ?
Yes, it is.
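
The check described above can be sketched on a toy AST. This is only an illustration: the `expr` type is a hypothetical stand-in for the real Parsetree nodes (Pexp_apply), and the snake_case names are assumptions rather than the code in reason_pprint_ast.ml.

```ocaml
(* Toy AST: a hypothetical stand-in for Parsetree's Pexp_apply, not the real type. *)
type expr =
  | Ident of string
  | Infix of string * expr * expr  (* operator, left operand, right operand *)

(* Operators that get the 'left-fold' printing behaviour. *)
let left_rec_infix_operators = ["|>"]

(* True when the expression applies a listed operator and its left operand
   is itself an application of that same operator. *)
let is_left_infix_chain = function
  | Infix (op, Infix (op', _, _), _) ->
    List.mem op left_rec_infix_operators && op = op'
  | _ -> false

(* foo |> f *)
let single = Infix ("|>", Ident "foo", Ident "f")

(* (foo |> f) |> g *)
let chained = Infix ("|>", single, Ident "g")
```

Here `is_left_infix_chain chained` holds while `is_left_infix_chain single` does not, which is exactly the distinction between case (2) and case (1).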

If so, mark the unparsing with the parameter ~infixChain.
~infixChain indicates we're chaining an application with an identifier ("|>" here).
Based on the value of ~infixChain, we know we always have to break.
The innermost (foo |> f) of ((foo |> f) |> g) will break and result in the following:

let x =
  foo
  |> f
  |> g;

/* we won't get 
let x =
  foo |> f
  |> g;
*/
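
A minimal sketch of how such a flag could drive the recursion, again on a hypothetical toy AST. The `infix_chain` argument plays the role of ~infixChain; the function names and the line-list output are assumptions made for illustration, and parenthesization is ignored entirely.

```ocaml
(* Toy AST: hypothetical, not the real Parsetree. *)
type expr =
  | Ident of string
  | Infix of string * expr * expr  (* operator, left operand, right operand *)

(* Print an expression on a single line (no parens handling in this sketch). *)
let rec one_line = function
  | Ident name -> name
  | Infix (op, l, r) -> one_line l ^ " " ^ op ^ " " ^ one_line r

(* Render an expression as a list of lines. [infix_chain] is true when the
   caller is itself a link of the same |> chain. *)
let rec lines_of ?(infix_chain = false) e =
  match e with
  | Infix ("|>", left, right) ->
    let link = "|> " ^ one_line right in
    let left_is_pipe =
      match left with Infix ("|>", _, _) -> true | _ -> false in
    if infix_chain || left_is_pipe then
      (* Part of a chain: every link goes on its own line. *)
      lines_of ~infix_chain:true left @ [link]
    else
      (* A lone application stays on one line. *)
      [one_line left ^ " " ^ link]
  | other -> [one_line other]
```

Without the flag, the innermost `foo |> f` would take the single-line branch and produce the unwanted `foo |> f` / `|> g` output.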

Future work

This PR works as advertised for |>, but it would be nice if we could standardize
printing for all infix operators. >>= & co have the same problem, but print as a right fold instead
of a left fold (>>= has some too-many-parens problems too).
Can we drop leftRecInfixOperators in the future? Is there a way to detect those operators?
Note that we can't just take all infix operators with left associativity here, because of e.g. "1 + 2 + 3"
Printing all left associative infix applications as |> would result in:

let x =
  1
  + 2
  + 3

With |>, it always breaks as soon as there's more than one |> chained.
In the case of +, we want to break IfNeed style.
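
The difference between the two policies can be sketched as follows. The `break` type, the `layout` function, and the explicit `width` parameter are all assumptions for illustration; IfNeed is the name used in this thread, while `Always` is a label chosen here for the |> behaviour.

```ocaml
(* Hypothetical break policies for a chain of infix applications. *)
type break = Always | IfNeed

(* Lay out [head] followed by [links] (e.g. ["+ 2"; "+ 3"]) within [width]
   columns, returning the resulting lines. *)
let layout ~break ~width head links =
  let flat = String.concat " " (head :: links) in
  match break, links with
  | Always, _ :: _ :: _ -> head :: links          (* two or more links: always break *)
  | _ when String.length flat <= width -> [flat]  (* fits: keep on one line *)
  | _ -> head :: links                            (* too long: break every link *)
```

With a wide enough margin, `layout ~break:IfNeed ~width:80 "1" ["+ 2"; "+ 3"]` stays on one line, whereas the same call with `~break:Always` on a two-link |> chain breaks regardless of width.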

hcarty · 2017-05-12T17:50:52Z

formatTest/unit_tests/expected_output/infix.re

+         Requires.(
+           create
+           |> import_type
+                local::"Set" source::"Set"


Why do these labeled arguments break onto a new line?

Because of the printing width of the tests.
See Example section in my PR description for the default printing width.

I see - thank you for the explanation. The discrepancy between the PR description and the expected output is what threw me.

jordwalke · 2017-05-12T21:57:06Z

/* is going to be printed as: */

I think you mean "parsed" as?

jordwalke · 2017-05-12T22:05:05Z

What did you mean by this?

"Note that we can't just take all infix operators with left associativity here, because of e.g. "1 + 2 + 3"

By the way, this is a very good explanation of the problem/challenges.

jordwalke · 2017-05-12T22:59:03Z

reason-parser/src/reason_pprint_ast.ml

-  method ensureExpression expr ~reducesOnToken =
-    match self#unparseExprRecurse expr with
+  method ensureExpression ?(infixChain=false) ~reducesOnToken expr =
+    match self#unparseExprRecurse ~infixChain expr with
    | SpecificInfixPrecedence ({reducePrecedence; shiftPrecedence}, leftRecurse) ->


It would likely be easier to implement this change, as well as all the other customizations you mentioned, if unparsing returned an intermediate representation of the concrete AST. That is, it would return a tree where all the precedence has already been resolved. Imagine if you had something like this:

InfixApplicationConcrete ("|>", InfixApplicationConcrete ("|>", SimpleConcrete itm, SimpleConcrete itm), SimpleConcrete itm)

Where printing is trivial, and all parenthesis/precedence has already been handled.
Then you can easily do things like

computeConsecutiveLeftInfixChains (concreteAst)

which would return a convenient list for you to format in a specific way.

This division separates all the technical details of precedence and correctness from the opinionated styling of formatting.
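
As a rough sketch of that idea: the constructor names below come from the comment above, but the exact shape of the type, the snake_case function name, and its return type are assumptions made for illustration.

```ocaml
(* Hypothetical concrete tree: precedence and parens are assumed to be
   resolved already, so each leaf is just printable text. *)
type concrete =
  | SimpleConcrete of string
  | InfixApplicationConcrete of string * concrete * concrete

(* Collect the operands of the outermost run of left-nested applications of
   one operator, in source order: ((a |> b) |> c) gives ("|>", [a; b; c]). *)
let compute_consecutive_left_infix_chain tree =
  match tree with
  | SimpleConcrete _ -> None
  | InfixApplicationConcrete (op, _, _) ->
    let rec go acc = function
      | InfixApplicationConcrete (op', left, right) when op' = op ->
        (* Same operator: keep the right operand and descend to the left. *)
        go (right :: acc) left
      | leaf -> leaf :: acc
    in
    Some (op, go [] tree)
```

The styling stage then only has to decide how to lay out a flat list of operands, without knowing anything about precedence.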

Maybe the current FunctionApplication/Simple/SpecificInfixPrecedence could become that Concrete AST I mentioned? It's pretty close to it.

Thoughts?

I think this is a separation we want to move towards anyways - and we could use this diff as a step in that direction.

This is a very good idea. An intermediate representation could solve all the nuances of printing different infix applications. ~infixChain felt very shaky from the start, I just couldn't think of something better at the moment 😄
I'll go ahead and implement it.

Great! I think we could/should start extending that pattern through the rest of the printer. Imagine having a completely separate stage to manage all the stylistic/opinionated sections - separate from a stage that determines correct grouping according to precedence. It will be much easier for people to contribute fixes to stylistic formatting.

jordwalke

comment

IwanKaramazow · 2017-05-13T06:21:44Z

"Note that we can't just take all infix operators with left associativity here, because of e.g. "1 + 2 + 3"

I thought of printing all left associative infix applications as |>, but clearly this won't work with +, for example.

let x =
  1
  + 2
  + 3

With |>, it always breaks as soon as there's more than one |> chained.
In the case of +, we want to break IfNeed style.

jordwalke · 2017-05-13T09:18:18Z

For typical use cases of |> would it be bad to format on an IfNeed basis? I genuinely don't know.

I don't mind + wrapping like this:

let x =
  1
  + 2
  + 3

as long as it's on an IfNeed basis.

Also, I believe that >>= is actually left associative right? If so, then it is one case where we don't want the infix to dock to the left, despite being left associative.

IwanKaramazow · 2017-05-13T09:33:22Z

Should I change it to an IfNeed basis?

let uri = req |> Request.uri |> Uri.to_string;
let meth = req |> Request.meth |> Code.string_of_method;
let headers = req |> Request.headers |> Header.to_string;

In the above it absolutely makes sense, but when I'm writing

let nextState =
    state
    |> updateFlappy
    |> updateTime;

I want it to break. Hmm, food for thought.

jordwalke · 2017-05-13T09:34:48Z

I don't feel strongly. @bordoley, our resident |> connoisseur may have opinions.

bordoley · 2017-05-13T15:02:35Z

@IwanKaramazow why do you want line breaks in your nextState case?

Looking through my own code, I think the IfNeed basis would accurately layout most code the same way I would by hand.

IwanKaramazow · 2017-05-14T05:46:19Z

In the nextState case it's just for readability, and there's a big chance I might extend it. It would eventually break. Would you put it on one line? The IfNeed basis might be harder since the Auto breaking of labels here originally didn't work & spawned the issue 😄
Back to the drawing board...

jordwalke · 2017-05-14T07:54:57Z

The original issue here, which was caused by IfNeed, was due to the fact that there were multiple nested labels. I had originally had something kind of like the "concrete parse tree" representation that we just discussed, but I removed it in favor of this implementation (labels) to first focus on correctness, with the intent of eventually evolving it back into a kind of "concrete parse tree".
If you had such a concrete parse tree, it's very easy to then take it and construct lists of consecutive identical infix identifiers, and then format them using list IfNeed instead of label IfNeed which should eliminate many of the original problems.

chenglou · 2017-06-27T18:56:30Z

@IwanKaramazow ready? =D

IwanKaramazow · 2017-06-29T07:09:37Z

It works, I just need to rename some functions so it makes more sense. I can do that this weekend.
@jordwalke should I wait & do this on top of Fred's new diff?

chenglou · 2017-06-29T07:10:41Z

I think this should get in before the breaking change, since current users can benefit. Heck this change + some if formatting change almost warrant their own release!

kyldvs · 2017-07-06T19:40:17Z

Would love to see this get merged, this is my biggest issue with the formatter at the moment :)

jordwalke · 2017-07-07T07:22:13Z

I want to make sure @let-def is aware of this one.

jordwalke

The output looks good, and I think we just need to clear up the purpose of the two kinds of infix in the concrete syntax data structure (intermediate representation).

jordwalke · 2017-07-07T07:36:01Z

formatTest/unit_tests/expected_output/infix.re

-    }
-  ) ^ "yo";
+  call "hi"
+  ^ (


Yeah, I think this diff ends up fixing a bunch of other little weird issues too.

jordwalke · 2017-07-07T07:36:13Z

formatTest/unit_tests/expected_output/infix.re

-    x + (
-      something.contents = y
-    ); /* Because of the #NotActuallyAConflict above */
+    x + (something.contents = y); /* Because of the #NotActuallyAConflict above */


Like this one.

jordwalke · 2017-07-07T08:33:46Z

reason-parser/src/reason_pprint_ast.ml

@@ -125,6 +125,7 @@ and ruleCategory =
     that it's easier just to always wrap them in parens.  *)
  | PotentiallyLowPrecedence of layoutNode


@let-def Not urgent, but in case you stumble upon this: this ruleCategory type is the closest thing to an intermediate representation that we already have to work with. You could decide to evolve it to be more sophisticated, or start fresh eventually.

jordwalke · 2017-07-07T08:39:30Z

reason-parser/src/reason_pprint_ast.ml

@@ -125,6 +125,7 @@ and ruleCategory =
     that it's easier just to always wrap them in parens.  *)
  | PotentiallyLowPrecedence of layoutNode
  | Simple of layoutNode
+  | InfixApplicationConcrete of ruleInfoData * string * ruleCategory * ruleCategory 


We discussed in chat that this would likely take the place of SpecificInfixPrecedence. Do you still believe this is the case? It still seems there's a large overlap in their purposes.

jordwalke · 2017-07-07T08:44:57Z

reason-parser/src/reason_pprint_ast.ml

@@ -579,9 +584,21 @@ let special_infix_strings =
 let updateToken = "="
 let requireIndentFor = [updateToken; ":="]

+(*
+ * list containing all infix operators which exhibit a 'left-fold' printing behaviour


This comment may be out of date, is it?

jordwalke · 2017-07-07T08:45:31Z

reason-parser/src/reason_pprint_ast.ml

+ * function to determine if the left child of an expression's application arguments
+ * is another Pexp_apply with the given ident
+ *)
+let isLeftInfixChain ident = function


This may no longer be used, is it?

jordwalke · 2017-07-07T08:50:24Z

reason-parser/src/reason_pprint_ast.ml

@@ -3364,39 +3437,60 @@ class printer  ()= object(self:'self)
     token "+", in `x + a * b` should not reduce until after the a * b gets
     a chance to reduce. This function would determine the minimum parens to
     ensure that. *)
-  method ensureContainingRule ~withPrecedence ~reducesAfterRight =
+  method ensureContainingRule ~withPrecedence ~reducesAfterRight () =


Off topic:

Is this just applying the best practice of always including the final ()?

I do find it a best practice because it allows you to later add optional arguments, without updating the call sites. In fact, I find it such a good idea that I think Reason should add them by default! Could you imagine?!

Yes, it has become something I always write. It makes it so much easier when revisiting/extending the code at a later point. No point in making life harder.
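
A small illustration of why the trailing () helps, using hypothetical functions unrelated to the printer: an optional argument can be added later and existing call sites keep compiling unchanged.

```ocaml
(* Hypothetical helper, first version: two labelled arguments plus a final unit. *)
let describe_v1 ~op ~operand () = op ^ " " ^ operand

(* Later version: an optional argument is added. The trailing unit is the
   positional argument that lets the compiler treat [indent] as omitted. *)
let describe_v2 ?(indent = 0) ~op ~operand () =
  String.make indent ' ' ^ op ^ " " ^ operand

(* A call site written against the first version still works as-is. *)
let old_style_call = describe_v2 ~op:"|>" ~operand:"f" ()

(* New call sites can opt in to the extra argument. *)
let new_style_call = describe_v2 ~indent:2 ~op:"|>" ~operand:"f" ()
```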

jordwalke

One more comment.

jordwalke · 2017-07-07T09:30:52Z

reason-parser/src/reason_pprint_ast.ml

    match self#unparseExprRecurse reducesAfterRight with
    | (SpecificInfixPrecedence ({reducePrecedence; shiftPrecedence}, rightRecurse)) ->
-      if higherPrecedenceThan shiftPrecedence withPrecedence then rightRecurse
+      if higherPrecedenceThan shiftPrecedence withPrecedence then begin 
+        Simple rightRecurse


I think it's a misnomer to wrap this in Simple. This function ensureContainingRule ensures that reducesAfterRight will reduce before the containing rule with withPrecedence reduces. I realize that's pretty confusing and I have to read that sentence several times to make sense of it.

I think a clearer way to write the function definition would have been:

method ensureContainingRule ~withPrecedence:precedence ~reducesAfterRight:rightExpr () =

But the point of the function is to ensure that rightExpr will reduce at the proper time when it is reparsed, possibly wrapping it in parenthesis if needed. But that doesn't mean it is necessarily simple. For example, in 4 + 5*3, 5*3 will reduce before the + operation, yet 5*3 is not "simple". Simple means it is clearly one token (such as (anything) or [anything] or identifier).
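
To make the distinction concrete, here is a toy sketch, not the printer's actual logic: the `expr` type, the precedence table, and the function names are all hypothetical, and every operator is assumed to be left associative. It shows that an operand can be printed without parens because it reduces first, while still not being "simple".

```ocaml
(* Toy expression type, assumed for illustration only. *)
type expr =
  | Num of int
  | Bin of string * expr * expr  (* operator, left operand, right operand *)

(* Toy precedence table; higher binds tighter. *)
let precedence = function "*" | "/" -> 2 | "+" | "-" -> 1 | _ -> 0

(* "Simple" in the sense used above: clearly a single token. *)
let is_simple = function Num _ -> true | Bin _ -> false

let rec print = function
  | Num n -> string_of_int n
  | Bin (op, l, r) ->
    let p = precedence op in
    ensure_left ~with_precedence:p l ^ " " ^ op ^ " "
    ^ ensure_right ~with_precedence:p r

(* Right operand: parenthesize unless it binds strictly tighter, so that it
   reduces before the containing rule when the output is reparsed. *)
and ensure_right ~with_precedence e =
  match e with
  | Bin (op, _, _) when precedence op <= with_precedence -> "(" ^ print e ^ ")"
  | _ -> print e

(* Left operand: equal precedence is fine for left associative operators. *)
and ensure_left ~with_precedence e =
  match e with
  | Bin (op, _, _) when precedence op < with_precedence -> "(" ^ print e ^ ")"
  | _ -> print e
```

In this sketch `4 + 5 * 3` prints without parens even though `is_simple` is false for `5 * 3`, which is the case the comment above describes.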

I'll revisit this tomorrow.

chenglou · 2017-07-20T05:19:57Z

This got out of sync since we've re-merged back reason-parser. Would be great to have it working again! Number one request for formatting right now =)

IwanKaramazow · 2017-07-30T08:59:36Z

Rebased on master, now continuing

chenglou · 2017-07-30T09:01:14Z

This should be something we get in before the potential function syntax refactor. cc @let-def just in case

TheSpyder · 2017-08-02T03:23:55Z

What ended up happening with infix operators like >>=? I can't quite follow what happens to them in the change list.

While I agree it can't be "all left associative infix operators", any infix operator that's more than one character is probably worth applying this to.

IwanKaramazow · 2017-08-05T16:02:37Z

@TheSpyder
>>= should be printed a lot better. Check the result of some classic cohttp code:

let server = {
  let callback _conn req body => {
    let uri =
      req
      |> Request.uri
      |> Uri.to_string
      |> Code.string_of_uri
      |> Server.respond
      |> Request.uri;
    let meth =
      req
      |> Request.meth
      |> Code.string_of_method;
    let headers =
      req |> Request.headers |> Header.to_string;
    body
    |> Cohttp_lwt_body.to_string
    >|= (
          fun body =>
            Printf.sprintf
              "okokok" uri meth headers body
        )
    >>= (
          fun body =>
            Server.respond_string
              ::status ::body ()
        )
  };
  Server.create
    ::mode (Server.make ::callback ())
};

hcarty · 2017-08-05T16:20:04Z

The parens are unfortunate but it sounds like they're pretty difficult to remove.

@IwanKaramazow This is so much better than the current output - thank you!

TheSpyder · 2017-08-10T01:43:26Z

Looking very good so far.

This is getting a bit off topic for the pipe operators, so I'll log a new task if you'd prefer, but in the case of infix operators could we avoid the indent inside parens? Or perhaps just when the brackets contain a single function?

e.g.

>|= (fun body =>
      Printf.sprintf
        "okokok" uri meth headers body
    )

>>= (
  fun x => a + b + etc.
)
^
| This closing paren indents on the same height as the start of >>=
Looks a lot better

IwanKaramazow · 2017-08-10T19:01:16Z

@TheSpyder I addressed the weird indentation. 😁

bsansouci · 2017-08-10T19:06:03Z

formatTest/unit_tests/expected_output/infix.re

+      req |> Request.headers |> Header.to_string;
+    body
+    |> Cohttp_lwt_body.to_string
+    >|= (


Does your latest commit fix the fun body being on a new line?

Just asking, I think this PR is amazing and we should merge it before tackling other fun problems <3

The tests have a ridiculously short print-width of 50.
If you have something like 80 you'll get:

body |> Cohttp_lwt_body.to_string >|= (fun body => Printf.sprintf "okokok" uri meth headers body) >>= (fun body => Server.respond_string ::status ::body ())

What has been fixed is the closing ) indenting on the same height as the beginning of >|= and >>=. Since it's a label now the fun body will also indent more to the left if it needs to break :)

chenglou · 2017-08-10T22:27:23Z

src/refmt_args.ml

@@ -42,7 +42,7 @@ let print =
 let print_width =
  let docv = "COLS" in
  let doc = "wrapping width for printing the AST" in
-  Arg.(value & opt (int) (100) & info ["w"; "print-width"] ~docv ~doc)
+  Arg.(value & opt (int) (90) & info ["w"; "print-width"] ~docv ~doc)


Why is this 90 now?

reason_pprint_ast.ml contains defaultSettings 90.
Not sure why I automatically changed it here to 90 lol.
I can change it back, your call.

chenglou · 2017-08-10T22:28:07Z

Soooo, which one should we merge first, @let-def's function diff, or this one? =P

hcarty · 2017-08-11T00:22:07Z

@chenglou This one! Fixes some significant paper cuts before opening up a world of fun from overhauling the entire function and constructor syntax. It will also (hopefully) keep the conversation around the big @let-def function change focused on just that change.

IwanKaramazow · 2017-08-11T07:02:31Z

@chenglou just checked, this PR & the js-syntax-branch shouldn't cause any crazy merging conflicts. Had to touch some parts of the code let-def didn't touch ⛷
So it wouldn't matter which one gets in first.

chenglou · 2017-08-11T07:40:18Z

Alright I've thought about this. We're gonna do right by the community and make sure you don't feel conned into using the function application PR just for better pipes lol.

So I'll merge this PR, make a release, before we get crazy with the function diff. cc @let-def, I'll re-rebase your diff on top of this one. This way everyone's happy.

As always, thanks @IwanKaramazow!

let-def · 2017-08-11T10:07:30Z

Thanks @chenglou!

facebook-github-bot added the CLA Signed label May 12, 2017

IwanKaramazow mentioned this pull request May 12, 2017

refmt long chain of simple |> doesn't break at all #1066

Closed

hcarty reviewed May 12, 2017

jordwalke reviewed May 12, 2017

rickyvetter mentioned this pull request Jun 28, 2017

Pipe operator formatting reasonml/reason-tools#82

Closed

IwanKaramazow force-pushed the PipeBreaking branch from b66a7c0 to 4e33acd on July 5, 2017 03:20

jordwalke reviewed Jul 7, 2017

hcarty mentioned this pull request Jul 29, 2017

[Formatter] Different infix operators might need to be printed differently #218

Closed

Iwan added 2 commits July 30, 2017 10:55

Always break chained pipe infix operators

678e70e

Start with implementation of more general infix printing

61b4a79

IwanKaramazow force-pushed the PipeBreaking branch from 4e33acd to 09dcbb8 on July 30, 2017 08:56

Document Simple ruleCategory

9716c63

Iwan added 2 commits August 5, 2017 18:10

removed unused function

e0c4be9

More documentation for ensureContainingRule

e45e1d0

Iwan added 2 commits August 9, 2017 20:51

Moment of inspiration, refactor into something that makes more sense

1af79b4

re-enable rtop test

3cbc415

Iwan added 5 commits August 10, 2017 10:04

Use same default width for refmt as defined in pprint_ast

d579caf

Use label for smoother closing paren indentation.

8745e61

>>= (
  fun x => a + b + etc.
)
^
| This closing paren indents on the same height as the start of >>=
Looks a lot better

Refactor common formatting in to helper fn from formatComputedInfixChain

739bbba

Documentation

d98a825

Drop weird usage of List.rev

bd2089d

IwanKaramazow changed the title ~~Always break chained pipe infix operators~~ Pretty printing of expression applications with infix operators Aug 10, 2017

bsansouci reviewed Aug 10, 2017

chenglou reviewed Aug 10, 2017

Revert default print-width to 100

a9ca925

chenglou merged commit ebc250e into reasonml:master Aug 11, 2017

This was referenced Aug 11, 2017

refmt issues with chained |> #1045

Closed

refmt doesn't break all logical expressions in a long chain of them #1065

Closed
