[RFC 0072] Switch to CommonMark for documentation #72
This RFC proposes to convert the documentation for the Nix, Nixpkgs and NixOS projects from Docbook to CommonMark Markdown. It proposes requirements for web facing documentation. It does not mandate a choice of toolchain to generate a website with documentation. The choice is left to the implementers and maintainers.
rfcs/0000-commonmark-docs.md
Outdated
1. good quality documentation search engine,
1. syntax highlighting of all code,
1. separate page per chapter, instead of the current monolithic page
for each manual,
1. table of content for each chapter available in a side bar to easily
jump through long content,
The “good quality search engine” part becomes less of a problem when you put everything on one page, because the browser’s search bar is a tool that works everywhere, without internet connection.
Do we have too much documentation to fit everything on one page? Or what is the reasoning behind splitting it up?
That's a good point. The reasoning for the split-up is that this is what other projects with extensive documentation do as well. Presumably to make the documentation less intimidating (imagine coming to a project for the first time and realizing you're going to need to wade through 40k+ words of documentation...), but I'm not a tech writer so I'm not sure whether there are strong reasons.
Navigating from section to section currently isn't easy. But that can be solved with a sidebar TOC whether we're in single-page or multi-page.
I do think that the wall of text that are the current manuals is intimidating for first-time users. A collapsible TOC on the side would greatly help, plus it also degrades gracefully to the current experience when one has no JS.
I do think a one-page-man is more annoying (and resource-intensive) on mobile devices, though.
Emacsy projects usually have both split and unified manpages; that could be another option.
Our single page manual is no longer indexed by Google.
The reasoning behind the split-up is that Google ranks one-page SEO farms quite low.
Additionally, URLs have to be search engine friendly so that a random first-time Nix contributor's blog is not ranked higher than the official documentation.
In which instances are the manuals not shown?
I assume if you search for content that exists in the manual, the results won't be found by google (or rated very low).
> The “good quality search engine” part becomes less of a problem when you put everything on one page

I disagree. When I was new to Nix(OS) and didn't know the terminology too well, I often had trouble finding things in the manual because I used overly general terms (e.g. "build derivation"), which produced a lot of unrelated results.
Also, on a search engine you can just combine terms that are relevant for the search; when performing a content search in the manual using a browser, you can't search for something like "nix rust derivation".
Don't get me wrong though, I really love the offline manual (including the PDF variants). Ideally, I'd love to see split-up documentation on nixos.org and keep a single-page manual available offline (e.g. via `nixos-help`, as is currently the case).
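To illustrate the difference between a browser's exact-phrase find and a search that combines terms, here is a minimal Python sketch. It assumes the manual is split on ATX headings and uses plain case-insensitive substring matching; the `MANUAL` text and the function names are hypothetical and not taken from any existing toolchain.

```python
from __future__ import annotations

import re

# ATX heading such as "## Title" (assumption: headings mark section boundaries).
HEADING = re.compile(r"^#{1,6}[ \t]+(.*\S)[ \t]*$")


def split_sections(markdown: str) -> dict[str, str]:
    """Split a Markdown document into {heading: body}."""
    sections: dict[str, str] = {}
    current = "(preamble)"
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            current = match.group(1)
            sections.setdefault(current, "")
        else:
            sections[current] = sections.get(current, "") + line + "\n"
    return sections


def search(sections: dict[str, str], query: str) -> list[str]:
    """Return headings of sections that contain every query term as a substring."""
    terms = query.lower().split()
    hits = []
    for heading, body in sections.items():
        text = (heading + "\n" + body).lower()
        if all(term in text for term in terms):
            hits.append(heading)
    return hits


# Hypothetical three-section manual used only for this illustration.
MANUAL = """\
# Rust
Build a Rust crate with Nix by writing a derivation that calls a builder.

# Python
Python packages use a different builder function.

# Derivations
A derivation describes a build in Nix.
"""

# The phrase "nix rust derivation" never appears verbatim, so a browser's
# find-in-page would not match it, but the term-wise search finds one section.
print(search(split_sections(MANUAL), "nix rust derivation"))
```

A real documentation search engine would add ranking and stemming on top of this; the sketch only shows why combining terms needs more than a literal string match.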
> In which instances are the manuals not shown?
> I assume if you search for content that exists in the manual, the results won't be found by google (or rated very low).

I googled "declarative package management" and the nixpkgs manual is the 3rd result.
@grahamc @domenkozar is this a case of the ranking being too low? It seems like the page is definitely indexed though.
One problem with the one-page manual is that Google won't do proper deeplinking. Search for `makeWrapper`, for example, and it will send you to https://nixos.org/nixpkgs/manual/, with no anchor applied.
This is less of a problem when the pages themselves are shorter and more on topic for the search request.
The manual pages are also becoming more and more difficult to render on lower powered devices. It seems that throwing text on the screen using modern browsers is a hard task.
This reduces the accessibility of the manuals for users who don't have access to top-tier machines, which I figure a good chunk of Nixpkgs developers have.
While this is becoming less and less true in the developed world, bad internet that slows to a crawl is still a reality. Splitting into distinct pages will also help accessibility in these instances.
rfcs/0000-commonmark-docs.md
Outdated
hegemony of RST in Python, because not all their users are Python
programmers. They prefer not to count on writers learning Docbook or
Asciidoc, because many documentation patches are from casual
contributors. The *lingua franca* across subcommunities for both
That's exactly where the marketing and reality of Markdown differ. There are many standards of Markdown and most tooling implements its own format extensions, meaning you're not using Markdown but a flavor.
For that reason it's important to talk about the tooling used, as each documentation tool will use a different flavor and thus define the "vendor lock-in".
If not, does the RFC propose to ban extensions of the format?
That's why the title of this RFC has CommonMark in it, not Markdown. CommonMark is a well specified dialect of Markdown that is also a de facto standard (understood by GitHub, Sphinx, Gatsby, Pandoc and many others). We propose to use CommonMark for now: nothing added, nothing less.
That's quite nice in theory, but in practice the need for extensions quickly arises. My bet is we'll use one even as we port the existing documentation.
For example, there are no footnotes or tables in CommonMark. What do you propose to do with the existing ones?
TOC is another example of such a feature.
Only looking at the rust book, it's not just CommonMark as it uses extensions for includes: https://github.com/rust-lang/book/blob/master/src/ch18-03-pattern-syntax.md
> Only looking at the rust book, it's not just CommonMark as it uses extensions for includes:

That's true. They use a preprocessor on top of Markdown. So do the projects I mention that use Hugo or Jekyll as their static site generator. These toolchains add special macros for includes. The Gatsby and React projects don't use that, and I believe we can get by just fine without a preprocessor given the current content.
Macros for textual includes could conceivably become useful down the road, but they will have little impact on users, who can still just write the CommonMark they know.
> TOC is another example of such feature.

TOCs are a must in my book. I think we're in alignment there. Whether they need to be programmable from within the document markup is a different matter. The ReactJS documentation (and our demos) have TOCs auto-generated from the document structure. They are prominently displayed in the sidebar. Gatsby's documentation also has that, although this is one other place where they use an MDX component: they go the extra mile and generate section-specific "site maps" at the bottom of a few of their documentation pages.
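As a rough illustration of a TOC that is derived from the document structure rather than written in the markup, here is a minimal Python sketch. It assumes ATX headings and one common slug convention (lowercase, punctuation dropped, hyphen-joined); real toolchains differ in how they build anchors, and the function names and sample document are hypothetical.

```python
from __future__ import annotations

import re

HEADING = re.compile(r"^(#{1,6})[ \t]+(.*\S)[ \t]*$")
FENCE = "`" * 3  # fenced code block marker; headings inside fences are ignored


def slugify(title: str) -> str:
    """Lowercase, drop punctuation, join words with hyphens (an assumed convention)."""
    slug = re.sub(r"[^\w\s-]", "", title.lower())
    return re.sub(r"[\s_]+", "-", slug).strip("-")


def build_toc(markdown: str) -> list[tuple[int, str, str]]:
    """Return (level, title, anchor) for every ATX heading outside fenced code."""
    toc = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if match:
            title = match.group(2)
            toc.append((len(match.group(1)), title, slugify(title)))
    return toc


def render_toc(toc: list[tuple[int, str, str]]) -> str:
    """Render the TOC as a nested Markdown list of links."""
    return "\n".join(
        f"{'  ' * (level - 1)}- [{title}](#{anchor})" for level, title, anchor in toc
    )


# Hypothetical chapter outline used only for this illustration.
DOC = """\
# Nixpkgs Manual
## Quick Start
## Standard Environment
### Phases
"""

print(render_toc(build_toc(DOC)))
```

The authors write nothing TOC-specific here; the sidebar content falls out of the heading hierarchy, which is the approach described above.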
I think CommonMark allows us to get most of the documentation writing done without many issues. Once one touches upon its limits, things get harder, and one has to fall back to another language, e.g. HTML.
In my opinion this RFC should define what we fall back to. For example, do we fall back to HTML, or to, say, reStructuredText? See e.g. how one can create a table with Sphinx + CommonMark:
```eval_rst
+------------+------------+-----------+
| Header 1   | Header 2   | Header 3  |
+============+============+===========+
| body row 1 | column 2   | column 3  |
+------------+------------+-----------+
| body row 2 | Cells may span columns.|
+------------+------------+-----------+
```
Clearly this now becomes toolchain-dependent. In this case falling back to reStructuredText instead of HTML still allows the creation of PDFs.
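For context, the `eval_rst` fence above is only interpreted when the Sphinx project is configured for it. A minimal `conf.py` sketch in Python, assuming the recommonmark package with its `AutoStructify` transform and its `enable_eval_rst` option; the project name is hypothetical:

```python
# conf.py: a minimal Sphinx configuration sketch (project name is hypothetical).
project = "nixpkgs-manual-demo"

# Let Sphinx parse CommonMark sources through the recommonmark extension.
extensions = ["recommonmark"]
source_suffix = {".rst": "restructuredtext", ".md": "markdown"}


def setup(app):
    # Imported here so the file can still be loaded where recommonmark is absent.
    from recommonmark.transform import AutoStructify

    # enable_eval_rst turns fenced blocks tagged eval_rst into embedded reStructuredText.
    app.add_config_value("recommonmark_config", {"enable_eval_rst": True}, True)
    app.add_transform(AutoStructify)
```

This makes the toolchain dependence concrete: the same Markdown file renders the table only under a toolchain that has been told what `eval_rst` means.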
As soon as you start using HTML, you lose portability of any kind (which may not be a requirement, but then we need to pick the tooling and commit to it).
That's true, but in practice:
- HTML is seldom needed except for a few key things (tables and callouts),
- different parts of the documentation are written for different media types.
The man pages are written for `man`. Tables in man pages are not a must-have (and even if they were, they are still supported by a number of man generation tools, like go-md2man, Pandoc and possibly Sphinx). Some users may be interested in PDF output rather than HTML output, but that's a matter of having decent CSS for the `print` media type. Likewise, I don't think EPUB support should be the deciding factor either, and we likely don't want to support that officially. But for those users who really want it, Pandoc understands CommonMark (including HTML tables) as an input very well.
> Clearly this now becomes toolchain-dependent. In this case falling back to reStructuredText instead of HTML still allows the creation of pdf's.

@FRidh what you're saying is that there exist corner cases, but those corner cases don't mean we're SOL - just that what happens then is not specified by any spec. I agree. AFAICS there are two or more tool choices available for all such corner cases (in particular tables) and all media types (in particular man pages). So long as this holds true, we're in a good place: it means we don't need to commit as part of this process to a specific implementation - just agree on the requirements for any such implementation, safe in the knowledge that satisfying the requirements won't be onerous.
There was enough bikeshedding in #64 that I'd rather make progress by agreeing on the format and the requirements for any toolchain, so that implementers don't have to spend heaps of time working towards a dead end because the community can't agree on the format (or indeed the requirements). (cc @domenkozar)
rfcs/0000-commonmark-docs.md
Outdated
format, which in any case *can* be extended with the wise use of
`<span>`-like HTML tags and custom tags that expand to HTML. That
appears to be seldom necessary in practice (see e.g. the five
documentation examples in [Motivation][motivation]).
Drawback: no support for PDF, epub or manual pages.
We don't need PDF or epub. Manpages are apparently supported by Sphinx.
So the tooling matters here again, otherwise we will have to ditch the `services.nixosManual.showManual` option.
Why? That option only requires an HTML rendering of the manual.
Ah right, I forgot that uses HTML instead of the manual page.
Co-authored-by: Profpatsch <[email protected]>
Co-authored-by: Frederik Rietdijk <[email protected]>
I worry a little (both in #64 and here) that the natural order to make these decisions directs a lot of attention away from something that seems like one of the most-useful razors: Across the ecosystem, what high-ROI tasks/work would people like to automate that will require access to some information in/from/about the docs? Will any given decision enable or impede doing so?
Co-authored-by: davidak <[email protected]>
Co-authored-by: davidak <[email protected]>
Is the form of the output in scope for this RFC? I am talking about the design, looks-wise, and also some of the function, which could be a magnet for bikeshedding. If it isn't, the RFC should clearly state so to prevent bikeshedding. I also propose that it states how the actual final form of the docs will be selected. (Another RFC? Collaboration between website team and docs team? Other proposal?) If it is part of the discussion for this RFC, I think it will needlessly bog it down. Now, the form of the output only lightly depends on the selected toolchain, in my experience. This RFC could (should?) still decide on a specific toolchain without ruling on the final looks and behaviour of the output. I see that in Future work there is
So I believe I understand correctly that the look and behaviour of the output are not in scope for this RFC.
In #64 we deemed some things as essential for a documentation format, one point of which was the ability to create references and being able to link to them from anywhere. This is very useful for option definitions being referenced in the manual, or by other options. Also very useful for link anchors. As far as I can see, the CommonMark standard does not have any such thing. There are link references, but those don't work outside of the current document. So, if this RFC were accepted, we would not be able to ever have inter-document references. Is that correct? If the answer is "Yes, but we can use this extension in $toolchain which can do this", then I reply with the fact that this then isn't CommonMark anymore, and we're then tied to $toolchain. So unless we can be sure that CommonMark has absolutely everything we will ever need, this RFC should instead be about choosing the tooling that can provide everything we need. You also mention that
How many of these projects are actually using spec-compliant CommonMark? Since you mentioned Markdown in that sentence, this leads me to believe that most projects aren't using CommonMark, but rather CommonMark/Markdown + tooling-specific extensions, suggesting that pure CommonMark wasn't good enough for all of these projects. So I really think we should include tooling in this decision, with all the extensions it provides that we might want to rely on.
I have to agree with @infinisil that the lack of proper inter-document cross-references is a deal-breaker. I have also never worked on a project that used pure Markdown; there were always extensions because Markdown is just too minimal for any serious documentation project.
Design: no. Function: yes. Quoting from the RFC:
In the "Detailed design" section, we list a number of requirements, which all have to do with what you call function above (but not design). While I really really want to avoid discussing implementation to avoid further bikeshedding, we should make sure that the choice of format doesn't paint us into a corner where important functionality would be impossible or hard to achieve. I listed a few requirements, but I may have overlooked some other important ones. I believe that implementation should not be discussed as part of this RFC, for the following reasons:
Fair point. This is currently left implicit in the RFC as: whatever is the usual process for getting a change landed on the website (create an issue, open a PR, get it reviewed, merge). A collaboration between website and docs team (with external contributions) sounds like the right approach to me.
Agreeing with the above, I propose we extend the RFC to state that we use Sphinx as the tool, with the recommonmark plugin as the CommonMark backend.
(The Gatsby mockup looks way nicer though, but I don't see any reason we could not achieve the same with Sphinx)
Folks say that there is no such thing as plain CommonMark, but fall short of stating what additional markup they believe is absolutely essential. Furthermore, it would be good to back up any such claim with evidence. The RFC does include evidence to the contrary. As highlighted again in a previous comment, all 5 large projects cited in the RFC use plain CommonMark for markup except Kubernetes, which uses one extension for tables and one for definition lists. Gatsby uses MDX to include newsletter signup forms, embed videos from third-party training websites, and build partial documentation sitemaps. Tools like Hugo, Jekyll and mdBook provide macros for textual file inclusion on top of what is otherwise regular CommonMark. Lastly, while this RFC does not propose to do so, what exactly would be so bad if we end up using one or two extensions here and there? Some extensions, like tables and definition lists, are already well supported by many documentation processing tools, as well as translation tools like Pandoc, and even if that weren't the case, translating the odd table or definition list by hand should we move to NewMarkupLangPro in the future takes far less time than this RFC discussion will likely take. We could amend this RFC to require non-HTML tables. That would still not force a single implementation. Tables are supported by all of Gatsby, Hugo, Jekyll and mdBook, the toolchains used by the 5 projects cited in the RFC. Same goes for footnotes.
@FRidh thanks for pointing out that Sphinx supports these too, via
I asked myself the same question and would love to know a good answer, but such an investigation ended up taking far too long. Focusing on just the 5 projects I cite in the RFC that are both in the Top 100 and also large and diverse projects (quite apart from their popularity), it appears that some (like React) use no extensions and no textual inclusion macros, Kubernetes uses 2 extensions (tables, definition lists) and macros, VSCode uses the table syntax from GFM but is otherwise plain CommonMark, and Gatsby uses a handful of custom MDX components, mostly to embed content.
I'm not sure what requirements you have in mind (if you write them out I can include them in the RFC), but what happens in practice is that cross-references are done with (root-)relative links and anchors. An example is the Gatsby project, which has extensive cross-referencing. Look at e.g. this page: https://www.gatsbyjs.org/docs/write-pages/. The sources (accessible using a button at the bottom) use a lot of links with anchors to entities defined elsewhere. It's up to the toolchain to check for dead links.
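To make the "toolchain checks for dead links" step concrete, here is a minimal Python sketch of such a check over root-relative links with anchors. It assumes anchors are derived from heading text with one common slug convention, which varies between toolchains; the page paths, page contents and function names are hypothetical.

```python
from __future__ import annotations

import re

# Inline Markdown link: [text](target)
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
HEADING = re.compile(r"^#{1,6}[ \t]+(.*\S)[ \t]*$", re.MULTILINE)


def slugify(title: str) -> str:
    """Lowercase, drop punctuation, join words with hyphens (an assumed convention)."""
    slug = re.sub(r"[^\w\s-]", "", title.lower())
    return re.sub(r"[\s_]+", "-", slug).strip("-")


def anchors(markdown: str) -> set[str]:
    """Anchors a page exposes, derived from its headings."""
    return {slugify(title) for title in HEADING.findall(markdown)}


def dead_links(pages: dict[str, str]) -> list[tuple[str, str]]:
    """Return (source page, target) pairs whose page or anchor does not exist.

    `pages` maps root-relative paths such as "/docs/build/" to Markdown source.
    """
    known = {path: anchors(source) for path, source in pages.items()}
    broken = []
    for path, source in pages.items():
        for target in LINK.findall(source):
            if not target.startswith(("/", "#")):
                continue  # external and page-relative links are out of scope here
            page, _, fragment = target.partition("#")
            page = page or path  # a bare "#anchor" points into the current page
            if page not in known or (fragment and fragment not in known[page]):
                broken.append((path, target))
    return broken


# Hypothetical two-page site used only for this illustration.
PAGES = {
    "/docs/write-pages/": (
        "# Write Pages\n"
        "See [builds](/docs/build/#caching) and [missing](/docs/build/#nope).\n"
    ),
    "/docs/build/": "# Build\n## Caching\nBack to [top](#build).\n",
}

print(dead_links(PAGES))
```

Nothing in this check depends on a particular generator, which is the sense in which cross-referencing can stay plain links in the source while validation is left to whatever toolchain is chosen.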
Since we did a few rounds of this documentation discussion already, we all know that the set that includes everybody's preferences is empty. At the same time, I think that we can all agree that any of the proposed formats are better than DocBook. In order to move forward, it will require some of you to let go of some of the technical details that are dear to you. I support and would like to nominate myself as a shepherd if possible.
> I think that we can all agree that any of the proposed formats are better than DocBook.

Somehow I do not get the same impression from the discussion. It looks more like some of the people who wrote the most text for the manual have doubts whether any of the other formats are good enough, but consider Docbook acceptable. They might be willing to see whether the marketing team actually can organise a significant improvement of documentation if that specific thing changes, but this is not the same as agreeing that some format is definitely better.
(As for me, last time I tried to write a bit of documentation, I failed to understand where in the current documentation it should even fit and gave up; I do not have any real opinion on these formats, although of course I like the clear escaping rules brought with XML, and dislike some of the error messages about wrong element nesting)
Interesting, it is my impression that we generally agreed that, while technically all of the proposed formats are a step backwards, it is worth it for the overall project. And that at that point it also doesn't matter much which language/tool we pick considering the status quo. It is therefore also my opinion that we should not investigate much further and just go ahead with any language/toolchain combo. Indeed CommonMark is a good fit, and as pointed out by @mboes there are different tools/extensions available for the other IMO important features such as tables and references. Then let's pick one and move on.
This is a far more interesting discussion that we ought to spend our time on, contents! Let's be pragmatic here. Thus, to continue my previous post, would anyone be opposed to the Sphinx + CommonMark combo? If so, why?
Well, the problem is CommonMark spec does not have the notion of anchors. Some implementations add anchors to headings automatically, but implicit heading anchors
So the requirement would be to be able to specify explicit anchors, both to sections (through headings) and to other types of content (paragraphs, list items, tables, code listings…). There appears to be a CommonMark extension for that but, again, composability is an issue. MDX might fare better.
I think the syntax of Sphinx is not very user friendly (but I haven't used it in a long time), so combining it with CommonMark seems not the best fit. That it is not popular in the top 100 may be due to that. So I would rather use one of the newer, popular tools. Hugo is extremely popular in the space of static site generators and is actively developed. We can also use it for the website. We would be very flexible with its huge ecosystem. (It supports interlinks https://gohugo.io/content-management/cross-references/ but has no native PDF export) (I haven't used Gatsby, Jekyll or mdBook) Mixing RST with Markdown might be confusing since they are very similar. But I guess Sphinx is better suited for documentation and "Nix domain for describing functions" sounds great! So I would not be opposed.
To throw another possibility into the mix I'm very intrigued by: MyST
The FCP is over. I'm not aware of any objections.
My only concern here is: Is there an Emacs mode for it? :)
There is no link to the source for the example https://nixpkgs-manual-sphinx-markedown-example.netlify.app.
Merging. Thanks all for your participation!
Since Nix uses mdbook now for documentation generation, is this the plan for the other manuals as well?
We made the decision during 20.09 management that the release process shouldn't be in the nixos manual. There should still be a section "NixOS Releases", but with information that's useful to the user. Having the process there was really weird. It has since been split out to https://github.com/NixOS/release-wiki and I used mdbook with GitHub actions and pages.
It's a possibility. As part of next steps, we need to get the different interested parties together and talking about the tooling. Part of interested parties are:
As previously stated, this is only about the input format, not the tooling.
I found a link in a comment. Do read the comment.
So the example in this RFC was rst, not md...
Could someone with admin powers (or @mboes) fix the link in the PR description? :)
As far as I could discover, mdbook lacks support for the amount of nesting that is needed for Nixpkgs, whose manual is significantly larger than the Nix one.
Nesting (or folding as mdbook calls it) is configurable with
Thanks for the link, I couldn't find it myself before. But it seems to be about the way the content is presented, and it doesn't say whether there is a restriction on the number of section levels.
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/markdown-vs-asciidoctor/15583/3
Here's a migration plan for the module system
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/ideas-to-make-it-easier-to-contribute-to-the-documentation/20312/2
* Switch to CommonMark for documentation
This RFC proposes to convert the documentation for the Nix, Nixpkgs and NixOS projects from Docbook to CommonMark Markdown. It proposes requirements for web facing documentation. It does not mandate a choice of toolchain to generate a website with documentation. The choice is left to the implementers and maintainers.
* Update rfcs/0000-commonmark-docs.md
Co-authored-by: Profpatsch <[email protected]>
* Update rfcs/0000-commonmark-docs.md
Co-authored-by: Frederik Rietdijk <[email protected]>
* Update rfcs/0000-commonmark-docs.md
Co-authored-by: davidak <[email protected]>
* Update rfcs/0000-commonmark-docs.md
Co-authored-by: davidak <[email protected]>
* Update rfcs/0000-commonmark-docs.md
Co-authored-by: Ryan Mulligan <[email protected]>
* Add shepherd metadata
* Add shepherd metadata
* Allow CommonMark plus small number of extensions
The exact set of extensions used is left to the documentation team.
* Final adjustments
- Nix now already uses markdown
- Fix grammar mistake
- Remove requirement of extensions having to be supported by at least 3 toolchains
- Add requirement for an extension for references
- Rename the file to its designated name
Co-authored-by: Profpatsch <[email protected]>
Co-authored-by: Frederik Rietdijk <[email protected]>
Co-authored-by: davidak <[email protected]>
Co-authored-by: zimbatm <[email protected]>
Co-authored-by: Ryan Mulligan <[email protected]>
Co-authored-by: Linus Heckemann <[email protected]>
Co-authored-by: Silvan Mosberger <[email protected]>
This RFC proposes to convert the documentation for the Nix, Nixpkgs and NixOS projects from Docbook to CommonMark Markdown. It proposes requirements for the appearance of web facing documentation generated from Markdown. It does not mandate a choice of toolchain to generate a website with documentation. The choice is left to the implementers and maintainers.
rendered