Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce ocamlformat #21

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from
Draft

Introduce ocamlformat #21

wants to merge 8 commits into from

Conversation

matijapretnar
Copy link

I am suggesting here to use ocamlformat on our codebase for the following reasons:

  • cleaner diffs & easier merges
  • faster development (no need to think hard in order to follow the existing style of the code you are modifying)
  • consistent style in general
  • probably a bunch of other reasons that people repeatedly advertise about auto formatters
    There are of course disadvantages, but I'll leave them to those who are against.

I tried finding a configuration that would match our current style as much as possible, which was not easy since we were not consistent in the first place :) The main changes I've made with regard to the default ocamlformat configuration are:

  • list sum variants on separate lines even if they all fit in a single one
  • similar for top level match/function
  • do not put spaces around lists, records, etc., so [1; 2; 3] instead of the default [ 1; 2; 3 ]
  • keep begin/end that delimit nested matches (instead of using parentheses by default)
  • do not further indent cases that do not fit in the same line as the pattern
    I am more than happy to revert any of those options to default.

You can see the resulting diff here: 84e0d97. The one change I am not entirely happy with is the formatting of comments describing sum variants (which BTW should be documentation comments, ie. (** ... *) instead of (* ... *).), which are no longer aligned, but I hope this is not a show-stopper.

The PR is currently marked as draft. If we decide to go ahead with it, I'll also add CI and precommit hooks. Otherwise, it's really easy to format the code, you just run dune fmt. In fact, it's so fast, we can run it on each make. Also, if we do it, I would suggest doing it sooner rather than later while we still have a small number of branches.

Copy link
Collaborator

@rossberg rossberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I primarily checked only one larger file (render.ml). Most changes, I could probably live with, but there are several showstoppers for me:

  • removal of multiple empty lines,
  • odd and unhelpful layout of parens spanning multiple lines,
  • change of layout of string concatenations, which is a readability killer in the rendering module,
  • what it does to some if's,
  • breaking of single-line datatypes, matches, expression sequences.

Also, this tool sometimes violates the "indent by multiples of 2" rule, e.g., when reformatting out (a; b). In fact, in one case below the indentation is negative, which looks like a straight-up bug to me.

A general problem I have with that auto-formatters is that they always follow a naive eager strategy and do not care for coherent layout across consecutive definitions/matches/branches/etc.

@@ -3,36 +3,37 @@ open Source
open El.Ast
open Config


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to prevent removal of empty lines? I intentionally use the convention of using two empty lines before the start of another code "section" in a file, in order to emphasise the structure.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I see, there is no way of doing this. There is no configuration option, and having doc comments (** ... *) or doc comment section headers (** {1 ...} *) doesn't change anything.

| TypingRel
| ReductionRel

type env = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can type definitions with record r.h.s.'s be laid out in a manner that is consistent with value definitions of the same shape, such as the one below?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, ed9bcb8 formats both as

type record =
  { fld1 : typ1;
    fld2 : typ2 }

and

let r =
    { fld1 = v1;
      fld2 = v2 }

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now

type record =
  { fld1 : typ1;
    fld2 : typ2
  }

and

let r =
  { fld1 = v1;
    fld2 = v2
  }

due to 90b7a69

Comment on lines 15 to 17
type rel_sort =
| TypingRel
| ReductionRel
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit annoying.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this refer to having a separate line for each constructor? We can write type declarations that fit in a single line compactly (for both sums and records). This is actually the default. Done in d134bc5.

| Some exps -> exps
| None -> []
in
map := Map.add id (hintexp :: exps) !map)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd much prefer if the closing paren was below the opening one.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I see, there is no option to do that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found it! 90b7a69

Comment on lines 123 to 128
let map_nl_list f xs =
List.map
(function
| Nl -> Nl
| Elem x -> Elem (f x))
xs
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also is rather annoying, turning a one-liner into 6-liner.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is again the default which I have overriden (I went with the approach that seemed more common in the existing code). Functions that fit on a single line now do so in 5ec0fe6.

| Brace -> "\\{", "\\}"
| Paren -> ("(", ")")
| Brack -> ("[", "]")
| Brace -> ("\\{", "\\}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an option to preserve absence/presence of tuple parens?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, one can only configure (separately for patterns and for expressions) if parentheses are present always or only when the tuple spreads over multiple lines.

Comment on lines 465 to 467
"\\{\\; " ^ "\\begin{array}[t]{@{}l@{}}\n"
^ concat_map_nl ",\\; " "\\\\\n " (render_typfield env) tfs
^ " \\;\\}" ^ "\\end{array}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change, like similar ones, is highly undesirable, as it makes the already hard to read code even less readable.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8037caf breaks such list of expressions such that each case is in a separate line.

| CommaE (e1, e2)
| CompE (e1, e2)
| InfixE (e1, _, e2)
| FuseE (e1, e2) ->
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a wrap option here instead of going fully vertical?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already done by 5ec0fe6, though it doesn't work in this particular case since the line is 130 characters wide. We can increase the margin if we want.

t1.it = t2.it ||
match expand env t1, expand env t2 with
t1.it = t2.it
||
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is because of a match, and couldn't find a way to change it.


error at
(phrase ^ " contains duplicate " ^ item ^ "(s) `"
^ String.concat "`, `" dups ^ "`")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what's going on, but the indentation here seems buggy.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was fixed by 8037caf

@matijapretnar
Copy link
Author

I have now changed what I could. The only major open issue is removal of multiple empty lines. Perhaps this can be remedied by having larger block comments such as

(************
 Section
 ************)

Copy link
Collaborator

@rossberg rossberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this more. It's much improved, but I'm afraid my main gripes are still the uncalled-for changes:

  • Empty lines: it's not just sections, I routinely use empty lines for grouping definitions. But the formatter bluntly removes all empty lines between similarly-formatted declarations, with absolutely no regards for logical program structure.

  • Coherence: I very much want related consecutive definitions to use consistent layout, even if one of them just so happens to fit on a single line.

  • Binary operators: the naive way of line-breaking binops is a pain, at least in the renderer. Now everything is vertical in most cases, but that is only marginally better in terms of logical structuring.

Of course, the fundamental problem is that handling these cases correctly requires understanding of the meaning of the code. A formatter can't understand the purpose of anything. So I wish it would at least leave these things alone.

Is it a possibility to set up the formatter on a by-directory basis? Why not let directory "owners" choose individually whether they want to trade style and readability for submission and automation?

@matijapretnar
Copy link
Author

Perhaps that is the best option. Looking at the options, you can list the files in .ocamlformat-ignore. I think it also works for directories using either /* or /**.

Alan-Liang pushed a commit to Alan-Liang/spectec that referenced this pull request Sep 1, 2024
Alan-Liang pushed a commit to Alan-Liang/spectec that referenced this pull request Sep 2, 2024
Fix a few typos in Explainer.md
@rossberg rossberg closed this in d1bcd3d Nov 11, 2024
@rossberg
Copy link
Collaborator

This issue was auto-closed by Github because its smart AI doesn't understand the concept of multiple repositories and its separate issue name spaces. Reopening.

@rossberg rossberg reopened this Nov 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants