Enable custom plugins #32

Open
lucperkins opened this issue Dec 5, 2022 · 9 comments

Comments

@lucperkins

I'm a huge fan of this library, especially the MDX support 🎉, but I'd like to be able to provide my own extensions (such as super-fancy code blocks with all the bells and whistles and support for fun things like Mermaid diagrams). As far as I can tell, there isn't currently a way to do that. If I'm wrong and there is a way to do that, please let me know and I'll close this 😄 But if this isn't currently possible, I'd like to know if you consider that a non-goal for the project. If it is a non-goal, I'm happy to close this and explore other options. But if you're open to that, I'm happy to help.

@wooorm
Owner

wooorm commented Dec 5, 2022

Thanks!

Extensions for me (and as this extends micromark/mdast/remark/etc, I’d like to stick with those terms) mean extending the syntax of markdown with custom things. E.g., JSX, math with dollars, etc.
Plugins mean transforming the AST.

The first is, as far as I understand, impossible in Rust. The extensibility needed to support extensions is not there in Rust. None of the other Rust markdown parsers that I know of support them either.

The second can be supported. Here’s a PR on the MDX repo: wooorm/mdxjs-rs#8. Something like that can be made on top of this (hence it’s there).
Luckily, your examples fall into this plugin category.

I want to develop such features based on needs of people that know the Rust ecosystem, instead of baking it in immediately. So I’d like to hear some ideas and have discussions first :)

@lucperkins changed the title from "Enable custom extensions" to "Enable custom plugins" on Dec 5, 2022
@lucperkins
Author

lucperkins commented Dec 5, 2022

@wooorm Oh yes, I definitely mean plugins built on top of a provided AST 😄 I've updated the issue title to reflect that. Extensions are officially above my pay grade 🤣

I think it'd be nice to be able to specify plugins without too much extra plumbing. Maybe something like this:

let mut options = Options::default();
options.custom_plugins = vec![
    fancy_code_blocks,
    mermaid_diagrams,
];
let result = to_html_with_options("Big long fancy doc...", &options)?;

Or another option, of course, would be to see if #8 lands and I can just use mdxjs directly (or shamelessly copy that code). But in general I'd definitely prefer to be able to provide plugins for vanilla Markdown rather than via MDX.

comrak, for example, provides a system for modifying the AST and then converting the plugin-modified AST to HTML. And that's basically the mechanism I'd love to see in this lib.
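
A minimal sketch of that "modify the AST, then render it" mechanism, using only the standard library. Everything here is an assumption made for illustration: the toy `Node` enum is not the markdown-rs `mdast::Node`, and the `Plugin` type, `mermaid_diagrams`, and `to_html_with_plugins` are hypothetical names, not an existing API in this crate or in comrak.

```rust
// Toy AST for illustration only; this is NOT the markdown-rs `mdast::Node`.
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Root(Vec<Node>),
    Paragraph(String),
    Code { lang: Option<String>, value: String },
    Html(String), // raw HTML, emitted verbatim
}

// Hypothetical plugin signature: a function that mutates the tree in place.
type Plugin = fn(&mut Node);

// Escape the characters that are unsafe in HTML text and attribute values.
fn escape(text: &str) -> String {
    text.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        .replace('"', "&quot;")
}

// Hypothetical plugin: rewrite `mermaid` code blocks into a raw HTML container.
fn mermaid_diagrams(node: &mut Node) {
    let replacement = match node {
        Node::Root(children) => {
            for child in children.iter_mut() {
                mermaid_diagrams(child);
            }
            None
        }
        Node::Code { lang: Some(lang), value } if lang.as_str() == "mermaid" => Some(
            Node::Html(format!("<pre class=\"mermaid\">{}</pre>", escape(value.as_str()))),
        ),
        _ => None,
    };
    if let Some(new_node) = replacement {
        *node = new_node;
    }
}

// Serialize the (possibly plugin-modified) tree to HTML.
fn render(node: &Node, out: &mut String) {
    match node {
        Node::Root(children) => {
            for child in children {
                render(child, out);
            }
        }
        Node::Paragraph(text) => {
            out.push_str(&format!("<p>{}</p>\n", escape(text)));
        }
        Node::Code { lang, value } => {
            match lang {
                Some(lang) => {
                    out.push_str(&format!("<pre><code class=\"language-{}\">", escape(lang)))
                }
                None => out.push_str("<pre><code>"),
            }
            out.push_str(&escape(value));
            out.push_str("</code></pre>\n");
        }
        Node::Html(raw) => {
            out.push_str(raw);
            out.push('\n');
        }
    }
}

// Run every plugin over the tree in order, then render the result.
fn to_html_with_plugins(mut tree: Node, plugins: &[Plugin]) -> String {
    for plugin in plugins {
        plugin(&mut tree);
    }
    let mut out = String::new();
    render(&tree, &mut out);
    out
}

fn main() {
    let tree = Node::Root(vec![
        Node::Paragraph("Big long fancy doc...".to_string()),
        Node::Code {
            lang: Some("mermaid".to_string()),
            value: "graph TD; A-->B".to_string(),
        },
    ]);
    let plugins: [Plugin; 1] = [mermaid_diagrams];
    print!("{}", to_html_with_plugins(tree, &plugins));
}
```

In this sketch a plugin is just a plain function over a mutable tree, so composing plugins is a loop. A real design would also have to decide on error handling and on whether plugins may inject raw HTML at all.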

@wooorm
Owner

wooorm commented Dec 6, 2022

> rather than via MDX

MDX also allows the vanilla markdown format. However, it still compiles to a string of JS, not a string of HTML.

> that's basically the mechanism I'd love to see in this lib.

This project is currently positioned a bit lower-level than that. You can get a string of HTML out directly (no ASTs), or you can get an AST (which you can then do whatever you like with yourself).

@digitalmoksha

> Extensions for me...mean extending the syntax of markdown with custom things. E.g., JSX, math with dollars, etc.
>
> The first is as far as I understand impossible in Rust. The extensibility needed to support extensions is not there in Rust. None of the other rust markdown parsers that I know support them either.

I am totally new to Rust, so I could very well be wrong. But it seems like https://github.com/rlidwka/markdown-it.rs/tree/master/src/plugins supports adding extensions to the language, as opposed to just modifying the AST.

@lucperkins
Author

@digitalmoksha Oh wow, this may just be perfect for my use case! Thanks!

@chriskrycho

chriskrycho commented Apr 21, 2023

@wooorm I haven't played with trying to add this to markdown-rs yet (and don't know if/when I'll have time to), but a design that I find quite powerful is the one exposed by pulldown-cmark. That crate's public API for generating HTML expressly operates against the stream of syntax "events" it emits. The internals of how that works are extremely similar to the way that parser::parse(input, options) works in this library, but in pulldown-cmark, the event stream is exposed in the public API and is the input to its public APIs for generating HTML, and the events carry the data which makes up the node. As a result you can walk that iterator and produce a new collection of iterable Events from it. This is super useful if, for example, you want to do inline syntax highlighting using Syntect. You can do that kind of thing by transforming the stream of Events and then handing off the transformed stream to push_html(output_buffer, events_iterator)!
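
A rough sketch of that event-stream pattern, using only the standard library. This is a toy model, not pulldown-cmark's actual API: the `Event` enum, `highlight_code`, and this `push_html` are simplified stand-ins, and `fake_highlight` is a hypothetical placeholder for what a real highlighter such as Syntect would do.

```rust
// Toy event type, loosely modeled on the idea of parser events that carry data.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    StartCodeBlock(String), // carries the language name
    EndCodeBlock,
    Text(String),
    Html(String), // raw HTML, emitted verbatim
}

// Escape the characters that are unsafe in HTML text and attribute values.
fn escape(text: &str) -> String {
    text.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        .replace('"', "&quot;")
}

// Hypothetical stand-in for a real syntax highlighter.
fn fake_highlight(lang: &str, code: &str) -> String {
    format!("<span class=\"hl hl-{}\">{}</span>", escape(lang), escape(code))
}

// Transform the stream: text inside a code block becomes pre-highlighted HTML.
// All other events pass through unchanged.
fn highlight_code<I>(events: I) -> impl Iterator<Item = Event>
where
    I: Iterator<Item = Event>,
{
    let mut current_lang: Option<String> = None;
    events.map(move |event| match event {
        Event::StartCodeBlock(lang) => {
            current_lang = Some(lang.clone());
            Event::StartCodeBlock(lang)
        }
        Event::EndCodeBlock => {
            current_lang = None;
            Event::EndCodeBlock
        }
        Event::Text(text) => match &current_lang {
            Some(lang) => Event::Html(fake_highlight(lang, &text)),
            None => Event::Text(text),
        },
        other => other,
    })
}

// Render any stream of events to HTML, transformed or not.
fn push_html<I: Iterator<Item = Event>>(out: &mut String, events: I) {
    for event in events {
        match event {
            Event::StartCodeBlock(lang) => {
                out.push_str(&format!("<pre><code class=\"language-{}\">", escape(&lang)));
            }
            Event::EndCodeBlock => out.push_str("</code></pre>\n"),
            Event::Text(text) => out.push_str(&escape(&text)),
            Event::Html(raw) => out.push_str(&raw),
        }
    }
}

fn main() {
    let events = vec![
        Event::StartCodeBlock("rust".to_string()),
        Event::Text("let x = 1 < 2;".to_string()),
        Event::EndCodeBlock,
    ];
    let mut out = String::new();
    push_html(&mut out, highlight_code(events.into_iter()));
    print!("{}", out);
}
```

Because the renderer accepts any iterator of events, the transform is lazy and never builds a tree. The cost is that the transform has to track its own state (here, the current code block language) across events.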

I recognize that one large difference is that this crate's to_html does not operate on its AST: unlike in pulldown-cmark, the Event types here are different from the AST. However, the result is that if someone wants to do that kind of thing, they have to (a) materialize the AST, as the normal to_html does not, and then (b) reimplement to_html on top of it. (Alternatively, they could operate only on the final, already-processed HTML, but that's a much worse path performance-wise than operating on a Markdown AST because it means you have to parse the HTML!)

I totally get why you don't have the to_html() function use the AST: Why pay for the cost of reifying the AST if you can just skip it and render HTML directly? But if you want to use markdown-rs as a library, not having that exposed makes it much less usable.


Caveat to all of this: I like this library's approach so much that I may end up implementing what I want using its mdast and doing just that—it's far more approachable than trying to implement "standard" footnotes in pulldown-cmark, which I have not managed to find a way to do without rewriting huge swaths of the lex-and-parse stuff (…which I really do not want to do)!

@wooorm
Owner

wooorm commented Apr 22, 2023

I think the way to go about it is doing the same as what we do in JS:

  1. https://github.com/wooorm/mdxjs-rs/blob/e90ad3d49ba067f043f83c90e0e144c1f0493ae6/src/mdast_util_to_hast.rs#L82
  2. implement hast_util_to_html, like https://github.com/wooorm/mdxjs-rs/blob/e90ad3d49ba067f043f83c90e0e144c1f0493ae6/src/hast_util_to_swc.rs#L81, but instead a copy of https://github.com/syntax-tree/hast-util-to-html/blob/3c9469abbb1ddd576e93387e37434c0f4d1db6ef/lib/index.js#L24

And to have plugins operate on either mdast or on hast!
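
The two steps above, plus plugins at either stage, could look roughly like the following standard-library-only sketch. The `Mdast` and `Hast` enums are heavily simplified toy types, and `to_hast`, `to_html`, `compile`, and both plugins are hypothetical names for illustration; they are not the code behind the links above.

```rust
// Toy markdown AST (stand-in for mdast).
#[derive(Debug, Clone, PartialEq)]
enum Mdast {
    Root(Vec<Mdast>),
    Heading { depth: u8, children: Vec<Mdast> },
    Paragraph(Vec<Mdast>),
    Text(String),
}

// Toy HTML AST (stand-in for hast).
#[derive(Debug, Clone, PartialEq)]
enum Hast {
    Root(Vec<Hast>),
    Element { tag: String, props: Vec<(String, String)>, children: Vec<Hast> },
    Text(String),
}

fn escape(text: &str) -> String {
    text.replace('&', "&amp;").replace('<', "&lt;").replace('>', "&gt;").replace('"', "&quot;")
}

fn element(tag: &str, children: &[Mdast]) -> Hast {
    Hast::Element {
        tag: tag.to_string(),
        props: Vec::new(),
        children: children.iter().map(to_hast).collect(),
    }
}

// Step 1: map the markdown tree to an HTML tree (this step allocates a new tree).
fn to_hast(node: &Mdast) -> Hast {
    match node {
        Mdast::Root(children) => Hast::Root(children.iter().map(to_hast).collect()),
        Mdast::Heading { depth, children } => element(&format!("h{}", depth), children),
        Mdast::Paragraph(children) => element("p", children),
        Mdast::Text(value) => Hast::Text(value.clone()),
    }
}

// Step 2: serialize the HTML tree to a string.
fn to_html(node: &Hast, out: &mut String) {
    match node {
        Hast::Root(children) => {
            for child in children {
                to_html(child, out);
            }
        }
        Hast::Element { tag, props, children } => {
            out.push('<');
            out.push_str(tag);
            for (name, value) in props {
                out.push_str(&format!(" {}=\"{}\"", name, escape(value)));
            }
            out.push('>');
            for child in children {
                to_html(child, out);
            }
            out.push_str(&format!("</{}>", tag));
        }
        Hast::Text(value) => out.push_str(&escape(value)),
    }
}

// Hypothetical mdast plugin: demote every heading by one level (capped at h6).
fn demote_headings(node: &mut Mdast) {
    match node {
        Mdast::Root(children) | Mdast::Paragraph(children) => {
            for child in children.iter_mut() {
                demote_headings(child);
            }
        }
        Mdast::Heading { depth, children } => {
            *depth = (*depth + 1).min(6);
            for child in children.iter_mut() {
                demote_headings(child);
            }
        }
        Mdast::Text(_) => {}
    }
}

// Hypothetical hast plugin: add class="prose" to every <p> element.
fn add_prose_class(node: &mut Hast) {
    match node {
        Hast::Root(children) => {
            for child in children.iter_mut() {
                add_prose_class(child);
            }
        }
        Hast::Element { tag, props, children } => {
            if tag.as_str() == "p" {
                props.push(("class".to_string(), "prose".to_string()));
            }
            for child in children.iter_mut() {
                add_prose_class(child);
            }
        }
        Hast::Text(_) => {}
    }
}

// Full pipeline: mdast plugins, then mdast -> hast, then hast plugins, then HTML.
fn compile(mut tree: Mdast, md_plugins: &[fn(&mut Mdast)], h_plugins: &[fn(&mut Hast)]) -> String {
    for plugin in md_plugins {
        plugin(&mut tree);
    }
    let mut hast = to_hast(&tree);
    for plugin in h_plugins {
        plugin(&mut hast);
    }
    let mut out = String::new();
    to_html(&hast, &mut out);
    out
}

fn main() {
    let doc = Mdast::Root(vec![Mdast::Paragraph(vec![Mdast::Text("a & b".to_string())])]);
    let h_plugins: [fn(&mut Hast); 1] = [add_prose_class];
    println!("{}", compile(doc, &[], &h_plugins));
}
```

Note that in this sketch both kinds of plugin mutate their tree in place; only the mdast to hast mapping builds a new tree.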

@chriskrycho

That definitely seems reasonable as a design, though I will note that it also comes with non-trivial performance overhead (lots of extra copies and allocations for the AST→AST transforms)! It's quite possible that this is already in the "it's more than fast enough" bucket such that it doesn't especially matter, though.

@wooorm
Owner

wooorm commented May 3, 2023

> non-trivial performance overhead

This seems more strongly worded than what I’d think.

I mean, events are also objects that are mapped. But they are terrible to work with. ASTs are great to work with.
There is one mapping that requires copies: markdown AST -> HTML AST.
But other than that, AST transforms do not need extra copies?

Having spent 10 years on ASTs for markdown in the JavaScript world, I’m pretty convinced that ASTs and plugins are the way to go about it.

See also wooorm/mdxjs-rs#27
