Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiline comments in Ocaml are badly indented #111

Closed
GuillaumeGen opened this issue Dec 7, 2022 · 10 comments · Fixed by #526
Closed

Multiline comments in Ocaml are badly indented #111

GuillaumeGen opened this issue Dec 7, 2022 · 10 comments · Fixed by #526
Assignees
Labels
language: ocaml OCaml formatting issues P1 critical: next release type: feature request

Comments

@GuillaumeGen
Copy link
Contributor

When an Ocaml file contains multiline comments, those are badly displayed because the first line is the only indented one.

So

  (* The identity function does not have the declared type at all,
      it is just for formatting verification purpose.
  *)
  fun x -> x

can be turned into

    (* The identity function does not have the declared type at all,
      it is just for formatting verification purpose.
  *)
    fun x -> x

The reason for it, is that topiary evaluated that the function body should be indented by 2 levels whereas I indented it by only one, hence it indented the beginning of the comment with the function body, but since the whole comment is a leaf, the other lines kept the number of spaces I originally put in it.

Describe the solution you'd like
Ideally, I would love the text in the comment to be indented consistently.

If not, possible, turning multiline comments into single line ones, for which this problem does not occur, could be a workaround.

If both are not possible (and modifying a leaf is not the purpose of tree-sitter queries, so both are probably not possible), then keeping the indentation the user gave to comments would probably be less ugly than the current result.

@GuillaumeGen GuillaumeGen changed the title Multiline comments in Ocaml are Multiline comments in Ocaml are badly indented Dec 7, 2022
@aspiwack
Copy link
Member

aspiwack commented Dec 8, 2022

This sounds like something that the engine should be doing transparently.

@nbacquey nbacquey added this to the OCaml support for v0.1 milestone Jan 24, 2023
@ErinvanderVeen ErinvanderVeen removed this from the v0.1 milestone Jan 30, 2023
@Xophmeister Xophmeister added the language: ocaml OCaml formatting issues label Jan 30, 2023
@nbacquey
Copy link
Member

nbacquey commented Apr 6, 2023

Here is a minimal reproducer:

let _ = fun _ ->
  (* foo
  *)
  ()

gets formatted as:

let _ = fun _ ->
    (* foo
  *)
    ()

@ErinvanderVeen ErinvanderVeen added the P1 critical: next release label Jun 14, 2023
@torhovland
Copy link
Member

Would it be correct to say that if the first line of a multi-line comment is indented n times, then all lines of the comment must be indented n times?

Does this apply to all multi-line leaves? I suppose not, because the code may contain literal strings that can't be indented, like this:

        let s = "This
is a multi-line string
literal";

@torhovland torhovland self-assigned this Jun 19, 2023
@torhovland
Copy link
Member

A few possible approaches:

  • Use a new predicate to mark certain nodes as indented leaves. Whenever processing indenting, check if the block contains indented leaves. If so, indent each line.
  • Use a new predicate to mark certain nodes as indented leaves. When processing the predicate, add each line as a leaf. Indenting will then work as intended (no pun intended (not this one either)).

@torhovland
Copy link
Member

Idempotency is complicating things a bit. We don't want to indent twice. So I am taking all lines that may or may not begin with the correct indent level of spacing, and replacing that with the correct indent level of spacing. Which means that this:

(* A multi-line comment
that explains
  - something
  - and something else
*)

will all get indented as such:

  (* A multi-line comment
  that explains
  - something
  - and something else
  *)

@torhovland
Copy link
Member

torhovland commented Jun 20, 2023

Another example:

enum ExpandTwoLevels {
    Leaf {
    /*
     * Multi-line
     * comment
     */
        content: String,
    },
}

Consider lines 2-4 of that comment. They all begin with 5 spaces. How do we treat them? We know they are not put there as indentation by Topiary, because that would have been 8 spaces. So they are one of these:

  • wrong indentation put there by the author (should ideally be discarded)
  • meaningful spacing put there by the author (should ideally be kept)
  • maybe even a mix, because you could say that 4 of them is wrong indentation, while the fifth one is special indentation to line up the asterisks.

We don't know which it is, so the best we can do is to keep it. Which means that after indentation they get 8 + 5 spaces:

enum ExpandTwoLevels {
    Leaf {
        /*
             * Multi-line
             * comment
             */
        content: String,
    },
}

@torhovland
Copy link
Member

Should I instead decide that since the comment starts on column 4, I should clean out up to 4 spaces from the beginning of all following lines, before indenting?

@GuillaumeGen
Copy link
Contributor Author

Yes, I think that this is the best solution, if the comments starts after n spaces, then all the comment lines starting with m>=n spaces should start by m-n+indentation_level spaces, and lines starting with less than nspaces could probably be kept as is (probably, to be discussed).
Would remain the question of code likes:

let f = /* Here I start to comment,
         * But not after spaces.
         * How to deal with it?
         */
    1

@torhovland
Copy link
Member

torhovland commented Jun 21, 2023

Hi, @GuillaumeGen 🙂

Looks like what we really need is to compare which column the comment originally starts on (a) with the column we put it on after formatting (b). Then we know the indent level (b-a), and can add the same indenting to the remaining lines if positive, or remove up to a-b spaces from the beginning of each line if not.

But we don't know b until pretty-printing, so this will all have to happen there.

This will work well with idempotency, because once the comment is formatted, b-a=0.

@torhovland
Copy link
Member

I'm pretty happy with the above PR. I think it solves all the cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
language: ocaml OCaml formatting issues P1 critical: next release type: feature request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants