-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add generic syntax for extensions/plugins #99
Comments
I love how the content is specified inside codespans and codeblocks; so even if the syntax is not supported, you still get a nice codeblock as fallback:
I'd like to suggest something: for the directive syntax, could make it use the fenced code block syntax? Something like:
This would be trivial to parse, and again if the plugin is not supported you get a simple codeblock:
Side note: It'd actually be a good idea to have math in its own extension, between |
Yeah, fenced code blocks could be used for block-level directives, but kinda feels like an abuse. And maybe not as nice as the reStructuredText syntax for directives that are not as "content-oriented" as code blocks. See for example this source documentation using autodoc: https://raw.githubusercontent.com/mitsuhiko/flask/master/docs/api.rst Yes, math might be worth it’s own syntax. But rendering math is non-obvious, and so maybe out of scope for Hoedown? |
Well frequently requested extensions (as long as they remained maintained) are never out of scope for Hoedown. |
To clarify the proposed directive syntax: both header and content are optional.
|
Just like fenced code blocks. The callback can highlight the code, or leave that to the client or postprocessor. The HTML renderer does the latter, but it can be overriden. In the case of math, the most appropiate for HTML would be to output it without changes, so that MathJax can display it. |
You're right. So, if we use this syntax, could we enforce a blank line between the directive header and the content? i.e. this would be invalid:
And should be written as:
That way, we get a nice codeblock if the extension's not present. |
I don’t think it is necessary to have the syntax be compatible a Markdown parser that doesn’t know about directives. The Hoedown parser would always be the same, it’s only different renderers that would do different things with directives. The built-in one could render them the same as code blocks. |
I think it'd make a good effect. For instance, a Roles are already compatible, and directives only need a blank line before the content (which you always use on that file you've linked). |
You can always use a blank line if you want, I don’t know if it has to be required. reStructuredText separates the indented text into parameters (before the first blank line) and content (after): .. autoclass:: Request
:members:
.. attribute:: form
A :class:`~werkzeug.datastructures.MultiDict` with the parsed form data from `POST`
or `PUT` requests. Please keep in mind that file uploads will not
end up here, but instead in the :attr:`files` attribute. But I don’t think we need that at the Hoedown parser level. Specific can always parse the content with their own syntax. |
So, rst requires a blank line to be present to separate the content from {directive, header, parameters}? |
I don’t know, the general documentation for directive syntax doesn’t mention it, so it might be specific to some directives: http://docutils.sourceforge.net/docs/ref/rst/directives.html
|
Hmm, I understand... |
But there's something I still don't understand: how should the parser treat the content inside directives? |
It should pass the raw (dedented) source to the renderer callback. Additionally, the callback should have the ability to parse that content as Markdown (including nested directives), but it doesn’t have to. |
Well, that ability would require exposing the nuts and guts of the parser (aka |
That, or the renderer should itself keep another |
There are apparently multiple possible approaches to parsing nested things for a renderer callback, but I don’t know enough about Hoedown’s internals to have an opinion on which is preferable. |
?? AFAIK no nested parsing can / is currently done at the renderer. |
Yes, that’s what I mean. Support for this would need to be added somehow. |
And that would mean exposing internal parser methods publicly. I'm not sure about that. |
I don’t know how this would be achieved. |
If contents inside a directive block need to be parsed, we would need a parsing function to override, not a rendering one—and then you’ll have issues like #94, and the following would need to be answered:
|
Yes, in the sense that you need access to Hoedown's internals.
You're not really adding new syntaxes to the parser, you have a fixed, generic syntax and you're parsing the content inside that syntax -- be it Markdown or not. Pretty much like you'd parse a codeblock's text to highlight it. So no, technically you don't need to extend the parser to make it parse roles, you need to override the renderer to parse your custom roles content. Not the best place to do it, but...
I think that would be enough. If closer integration is required, an extension is the right way to do it.
They are internals. They aren't expected to be called by the user, directly. Doing that otherwise than in specific cases is error prone unless user knows how Hoedown works internally. Yes, they are modular and IMO are ready to be exposed, but since they're internals, the user must know how to use them.
Really agree on that. And I'm afraid we'd be killing the simplicty of Hoedown's API in the process. |
Specifically, calling a Hoedown internal is only acceptable if:
If these conditions apply, calling Calling a hoedown internal in any other case will result in undefined behaviour. (wrong output in the best case, crashing in the worst case) |
I've found a problem while trying to implement this. Therefore things like that: void rndr_directive(hoedown_buffer *ob, const hoedown_buffer *name, const hoedown_buffer *header, const hoedown_buffer *text, void *opaque) {
BUFPUTSL(ob, "<div class=\"directive-");
hoedown_buffer_put(ob, name->data, name->size);
BUFPUTSL(ob, "\">");
parse_block(ob, document, text->data, text->size);
BUFPUTSL(ob, "</div>");
} Would fail since you cannot modify |
Why does the parser need to mutate the input data? If there’s no good reason for it, I propose just change all parsing functions to take hoedown_buffer work = { NULL, 0, 0, 0, NULL, NULL, NULL };
work.data = data + some_offset;
work.size = some_size;
doc->md.some_rendering_callback(ob, &work, doc->md.opaque); which can be easily refactored into hoedown_buffer *work = hoedown_buffer_new(64);
hoedown_buffer_put(work, data + some_offset, some_size);
doc->md.some_rendering_callback(ob, work, doc->md.opaque);
hoedown_buffer_free(work); And I think this makes the code more readable. I would work on this this weekend if there’re no issues. Shouldn’t be too much trouble. |
I completely agree. Just spent an hour changing all parsers, triggers and prefix checkers to accept It seems this was an optimization for performance (instead of copying, just move the data in place) but I think it's unacceptable. This will be changed. Everything else is fine.
This would be fixed with a cast. We know the callback won't modify the data, since it's passed a Doing |
But wouldn’t you still be able to modify contents of void (*hrule)(hoedown_buffer *ob, void *opaque)
{
ob->data[5] = '\0';
// ...
} I mean, obviously the bundled HTML renderer doesn’t do this, and custom renderers shouldn’t do it. But the point is that Another way to ensure data safety is to specifically declare an immutable buffer type: struct hoedown_buffer {
const uint8_t *data;
size_t size;
size_t asize;
size_t unit;
hoedown_realloc_callback data_realloc;
hoedown_free_callback data_free;
hoedown_free_callback buffer_free;
}; and use it instead of |
Yes, I know you can modify the data pointed to by the buffer, but if you use any of Even if you don't pay attention to the warnings (or don't enable them), the assertion for |
This might be relevant: http://talk.commonmark.org/t/generic-directives-plugins-syntax/444/39 |
The syntax is the origin message here was only a proposal. If the CommonMark folks settle on some syntax, it’d be preferable to align with that. |
Hmm... I think we can leave this aside for a bit, while we center on finishing v3 and adopting CommonMark for v4. |
Rust uses Hoedown for its documentation. We would like to add a bunch of new features (cross-references, math, …) that would need to be represented somehow in Markdown source. Rather than inventing new punctuation-based syntax for each of them, I believe (as discussed here) that we should only have one additional inline-level syntax and one block-level that each include the name of the "plugin". The built-in HTML renderer would not implement any plugin, it’s up to rustdoc and other users to do that in a custom renderer.
So, proposal. I’m not attached to the syntax or the naming, feel free to change them as you like. I just picked something similar to reStructuredText.
New inline (span) syntax for a "role":
:role-name:
content. Like in code spans, backticks can be repeated and whitespace is trimmed: `:foo:
` ``` (the content here is a single backtick.)New block-level syntax for a "directive":
Renderers would have two new additional callbacks.
The default behavior when the callback is NULL or returns NULL/0 would be as usual: directives are skipped and roles are printed verbatim. The built-in HTML renderer would not provide these callbacks.
Does this sound like a good idea?
The text was updated successfully, but these errors were encountered: