Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the Odoc parser instead of Octavius #571

Closed
Julow opened this issue Jan 28, 2019 · 8 comments
Closed

Use the Odoc parser instead of Octavius #571

Julow opened this issue Jan 28, 2019 · 8 comments
Assignees
Milestone

Comments

@Julow
Copy link
Collaborator

Julow commented Jan 28, 2019

Octavius is the documentation comments parser. It will be deprecated soon, in favor of Odoc.

TODO

  • Rewrite the Fmt_odoc module
    The types are different, obviously, but very similar

  • Vendor odoc sources
    Odoc's parser and model internal libraries are not released yet on opam

  • Whitespaces informations are lost
    In the AST, sentenses words are splitted and consecutive whitespaces, including newlines are represented as \Space`. Newlines can be retrieved by comparing start/end locations.
    A bit of work is required to preserve ASCII-art and fancy headers

  • Small difficulties

    • Heading labels ({0:label Heading}) are not optional in the AST and so always printed
    • Deciding between heavy and light syntax for lists should be done with a better heuristic than print-parse-compare
    • Paths_types (identifiers, references) is very complex
      We only need to convert them to strings, there is (not exported) functions to do that in Odoc's html/comment.ml
@Julow
Copy link
Collaborator Author

Julow commented Jan 28, 2019

I started working on this a little, on this branch

@gpetiot gpetiot mentioned this issue Jan 28, 2019
17 tasks
@avsm
Copy link
Contributor

avsm commented Jan 28, 2019

ccing @jonludlam @aantron so they see this integration work ongoing. In particular, try to structure the odoc vendoring so that it's standalone and in a single directory so that it can be updated easily. I have a prototype called duniverse which could be cleaned up to automate the self-contained vendoring extraction into ocamlformat for this purpose.

We may also need a little bit of Dune help from @rgrinberg in order to parameterise ignoring the vendoring in some cases, but I'm not sure that'll be needed if odoc is purely a binary and not an OCaml library as well.

@jberdine
Copy link
Collaborator

Re whitespace, ascii-art, etc., I think that we need to be careful and scope down the problem. In general trying to preserve information that is not in the AST is complicated to implement and hard to make robust. I don't think that ocamlformat should attempt to format docstrings more closely to the source or make them more readable, etc. than odoc does. That is, if any particular docstring is not displayed as desired by odoc, then ocamlformat should not attempt to do differently. If such formatting is needed, then handling it will be needed in odoc as well, and special casing it by referring back to the concrete source in ocamlformat will just be duplicate work done in a not very reusable or principled way.

@jberdine
Copy link
Collaborator

Re deciding between heavy vs light syntax, I agree that ideally something other than print-parse-compare would be done. But there might be too much ambiguity in the odoc grammar. Previously there were problems e.g. with a list using - for items where one item included a - character that, with the right choice of margin, would end up at the beginning of a line and then get misinterpreted as starting an item. Another approach would be welcome, I'm just raising awareness of potential difficulties.

@Julow
Copy link
Collaborator Author

Julow commented Jan 28, 2019

I agree that modifying odoc may be easier and more robust. Odoc's AST could include whitespace strings. Same for the shape of lists, so that the user can deal with choosing the syntax if things goes wrong.

@jberdine
Copy link
Collaborator

Yes, but note that for the most part we do not want to "preserve" whitespace (or other non-semantic information), normalizing it is one of the objectives of auto-formatting it. Preserving such information ought to be a last resort, when faced with concrete examples that cannot be adjusted to be reasonably handled otherwise. I think that the first goal/version should not try to preserve such things.

@aantron
Copy link

aantron commented Jan 28, 2019

Modifying the odoc AST to preserve whitespace strings seems reasonable. I opened ocaml/odoc#285 about it.

In general, you're welcome to open issues on odoc about making it more useful, with a brief (1-sentence) description of what motivates each change :) I see a few more in the issue above.

@gpetiot
Copy link
Collaborator

gpetiot commented Aug 5, 2019

Closed with #721

@gpetiot gpetiot closed this as completed Aug 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants