Switch to CommonMark #1371
We've been planning to do this for some time, but we were hoping for an official extension syntax from CommonMark as the impetus for doing so, rather than the de facto extensions where everybody defines their own Markdown dialect, as markdown-it does. Right now, markdown-it can either implement CommonMark or have extensions, but the two are mutually exclusive.
The existence of the CommonMark spec doesn't preclude extensions. Currently, the CommonMark people are working on finalizing the spec for the "basic" stuff and they don't have time to think about extensions, but that doesn't mean that the spec forbids extensions. After the "basic" CommonMark spec is completed (it is unclear to me when this is going to happen ... beginning of 2016?), there might be some work on "official" extensions (or not). If an "official" math extension is made, I'm quite sure it will be the same syntax that we are already using and that pandoc has been using for a very long time (and probably many other implementations, too). Long story short: we don't have to feel dirty just because we use a (not yet and probably never "officially authorized") extension to CommonMark. About an "official extension syntax": Do you mean the proposed block syntax with colons? Something like:
Sure, such a thing might be officially supported at some point (in a few years), but I urge you not to force Jupyter Notebook users to use this syntax for block equations. This would be a huge step backwards for the ease of use of the Notebook (not to mention it would break all existing notebooks that use block equations). Coming back to markdown-it: they claim "100% CommonMark support"; the ease of creating extensions doesn't change anything about that. Just because it's harder to make extensions for commonmark.js doesn't mean one of them is more or less CommonMark compliant. But I'm not saying that markdown-it is the best. If you prefer, we can also use the reference implementation commonmark.js (or something entirely different), but then we'll have to use the same hacky work-around as we are using now to get the math extension working. OTOH, it seems that commonmark.js is currently more active than markdown-it. I don't really care which library gets selected; I'm just suggesting to change to CommonMark without further delay (and of course keeping the math extension that's currently in use).
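For readers unfamiliar with the kind of work-around mentioned above, here is a minimal sketch of the general idea: math spans are swapped for placeholders before the Markdown renderer runs and restored afterwards, so the renderer never interprets characters such as `*` or `_` inside an equation. This is an illustration only, not the Notebook's actual code. The names `protect_math`, `restore_math`, `render_with_math` and `toy_render`, the `@@N@@` placeholder format and the simplified regular expression are all assumptions made for the example.

```python
import re
from typing import Callable, List, Tuple

# Assumption: math is delimited by $$...$$ (display) or $...$ (inline, single line).
# Real delimiter handling (escaped dollars, \begin{...} environments) is more involved.
MATH_RE = re.compile(r"\$\$.+?\$\$|\$[^$\n]+?\$", re.DOTALL)


def protect_math(text: str) -> Tuple[str, List[str]]:
    """Replace each math span with a placeholder and return the stashed spans."""
    stash: List[str] = []

    def _stash(match: "re.Match[str]") -> str:
        stash.append(match.group(0))
        return f"@@{len(stash) - 1}@@"

    return MATH_RE.sub(_stash, text), stash


def restore_math(html: str, stash: List[str]) -> str:
    """Put the stashed math spans back in place of their placeholders."""
    # Limitation of this sketch: user text that already contains "@@N@@" would collide.
    return re.sub(r"@@(\d+)@@", lambda m: stash[int(m.group(1))], html)


def render_with_math(text: str, render: Callable[[str], str]) -> str:
    """Run any Markdown renderer on text whose math has been protected."""
    protected, stash = protect_math(text)
    return restore_math(render(protected), stash)


def toy_render(text: str) -> str:
    """Hypothetical stand-in for a real renderer: only handles *emphasis*."""
    return re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)


print(render_with_math("*a* and $x*y*z$", toy_render))
```

Because the renderer is passed in as a plain function, the same wrapper would work with any CommonMark implementation, which is why the work-around does not depend on the library offering a syntax-extension API.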
The whole purpose of producing a common specification, though, is that Markdown can be interoperable, and not dependent on whatever ad-hoc extensions are implemented in particular libraries and not properly specified. So again, we should wait for CommonMark to specify some kind of syntax for extensions to use, analogous to roles and directives in rst. I know that's annoying, but if it increases the pressure on CommonMark to do that, everyone wins eventually. We will probably never force people to switch to the new extension syntax for equations, because that would break existing content. But everything else should go through one extension syntax, even if that's a bit ugly, rather than trying to pick special symbols for each thing to make it concise.
I'm not suggesting adding new syntax! I know we were talking about new syntax in #1292, but that's a different issue! I'm "just" suggesting to switch to CommonMark. And the sooner we do that, the better. Deliberately delaying that doesn't help anyone. Do you not want to switch to CommonMark?
In principle, I'm fine with switching to CommonMark, but I don't see much advantage to it at this point. As Min mentioned, if there were an official syntax for extension points, that would be a compelling argument to change.
Thank you for raising this, @mgeier. I'm a fan of CommonMark and have been adapting projects to use CommonMark-compatible parsers and renderers, in particular commonmark.js or remark. In the notebook we've been using a Markdown implementation that is bundled with "extensions". For instance, GitHub Flavored Markdown is supported in the notebook by marked. As much as I agree with switching to CommonMark, there are several costs (the magnitude of which is unknown):
That's likely the source of most of the hesitation about moving forward. People are in agreement that someday we should. It's the maintenance burden we perceive, relative to finishing other work on the project. If you start the investigation @mgeier, with a PR in earnest, we'll be better able to evaluate the switch. I do agree that making our own markdown cells adhere to a well-specced format is paramount to saying that we have a well-specced notebook format. Same goes for embedded LaTeX/MathJax.
Thanks, now we can finally talk about the actual issue instead of bashing CommonMark extensions! I think most of the concerns raised above reduce to these two points:
Although the magnitude of both is unknown at this time (as @rgbkrk mentioned), I think it's possible to make some predictions:
For me, this is an absolutely compelling argument to switch to CommonMark as soon as possible! Are there any objections in general?
The flip side is that when you change something that's going to break some things, it's much easier if there's a compelling advantage that comes with the changes. The 'eat your vegetables' approach (do this because it's the right thing to do) doesn't make us many friends.
The main argument now is that if CommonMark extensions do happen and we adopt them, then we would be breaking notebooks twice instead of once. To be clear, adopting markdown-it + extensions is not adopting CommonMark, it is adopting yet another markdown dialect defined by a single implementation. That's not to say that I'm opposed to adopting markdown-it, but doing so should mean that we are giving up on the prospect of actual CommonMark extensions in the vaguely near future. Of course, if we stick with CommonMark-derived implementations (as we probably should), the domain of those incompatibilities ought to be confined to the areas affected by extensions, and not basic things like spaces in/around headings, like we've seen in the past.
@takluyver Do you want to switch to CommonMark or not? I think switching to CommonMark is a compelling enough advantage on its own; we don't need anything else to "sell" it. Also, most users won't even see a difference. @minrk wrote:
I don't get it. And of course there will be breakage if the CommonMark spec changes, which hopefully will happen extremely seldom. But really, all this won't be such a big deal. Only some extreme cases will break when switching to CommonMark. And I'm quite sure that if there is ever a LaTeX-math extension, it will look very similar to what we are using now (i.e. what is used in pandoc).
TBH, I don't care what JavaScript library will be used, I don't have experience with any of them. I was just mentioning markdown-it because I found it by a quick web search and it seemed to be the only JavaScript library available that provides an API for syntax extensions (and many existing extensions using this API), all other libraries provide "only" an API for AST transformations and custom writers. But since the ugly hack for math blocks is already in place, that's probably not such a big problem. The only thing that may be problematic is the implementation of tables, which are not (and won't be) part of "core" CommonMark. markdown-it claims to follow the CommonMark spec, so if they are not lying, it would be adopting CommonMark. The existence or non-existence of extensions doesn't change anything about that.
As I said above, potential "official" extensions for LaTeX math and tables will very likely be very close to what we're using now. They will probably be less restrictive, but this wouldn't be a backwards-compatibility problem.
I'm not quite sure what you mean by that.
@mgeier Historically, the Jupyter team has been very committed to providing installed users, even those that may be working with edge use cases, backward compatibility and minimal breakage when making changes to the code base. Breakage now, or in the future, is still breakage for the end user and their organization. As for the implementation costs, there are a number of variables (developer time, support for users experiencing breakage, resource availability within the larger project scope, etc.) that make it difficult to prove with a high degree of certainty that implementing changes now is preferable to waiting. Overall, I believe that @minrk and @takluyver may simply be saying that their preference is to err on the side of caution when it comes to impacting our end users.
Just a little update: I haven't found an extensible CommonMark implementation for Python, which I would need for Until then, we can also wait with changing the JavaScript implementation.
Another argument for switching from marked: the last commit to master was about a year ago.
@mgeier Just an update that there is now an MIT-licensed "fast, extensible and spec-compliant Markdown parser in pure Python" called mistletoe
Thanks for the hint, @gazzar! It indeed looks like it has an API for syntax extensions. Now I don't have an excuse anymore ...
I don't know if this has been discussed yet, but I think it's high time to switch to CommonMark.
It looks like there are (at least) 2 JavaScript implementations:
The latter seems to allow syntax extensions, and there is even already a MathJax extension: https://github.com/classeur/markdown-it-mathjax