New base markdown parser #324

pngwn · 2021-10-16T13:15:30Z

As per this discussion, mdsvex v1 will have a custom flavour of markdown that aims to be simpler and comes with a few additional features that are pretty much expected at this point. This is a tracking issue for this work.

This does not include the svelte parser; the aim is to keep the svelte and markdown parsers separate and compose them in another step. This may prove too challenging, because svelte syntax can appear almost anywhere in an mdsvex file, but I'll see how that pans out.

The parsing strategy listed in the markdown spec is not really appropriate in this instance, because HTML (and Svelte) syntax will be parsed into a full AST and does not match all markdown semantics.

Broadly speaking, there are a few different 'contexts' that you can be in when parsing a markdown file.

document can contain leaf_block or container_block.
leaf_block cannot contain other blocks but can contain inline.
container_block can contain leaf_block or additional container_block; container_block is recursive.
inline cannot contain blocks, but some inline nodes can nest other inline nodes.

 → Document →
   → Leaf Block ↑
     → Inline ⟳
   → Container Block ⟳
     → Leaf Block ↑
     → Container Block ⟳
       → ...
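The containment rules above can be sketched as a small lookup table plus a recursive check. This is only an illustration: the type names (`DocumentNode`, `LeafBlockNode`, and so on), the `kind` and `type` fields, and the `isValid` helper are all assumptions made for the example, not the mdsvex AST.

```typescript
// Hypothetical node shapes for illustration only; not the mdsvex AST spec.
type InlineNode = { kind: "inline"; type: string; children: InlineNode[] };
type LeafBlockNode = { kind: "leaf_block"; type: string; children: InlineNode[] };
type ContainerBlockNode = {
  kind: "container_block";
  type: string;
  children: Array<LeafBlockNode | ContainerBlockNode>;
};
type DocumentNode = {
  kind: "document";
  children: Array<LeafBlockNode | ContainerBlockNode>;
};
type AnyNode = DocumentNode | ContainerBlockNode | LeafBlockNode | InlineNode;

// Which contexts may appear directly inside which other context.
const allowedChildren: Record<AnyNode["kind"], ReadonlyArray<AnyNode["kind"]>> = {
  document: ["leaf_block", "container_block"],
  container_block: ["leaf_block", "container_block"], // recursive
  leaf_block: ["inline"],
  inline: ["inline"], // some inline nodes nest other inline nodes
};

// Recursively checks that every child sits in a context that allows it.
function isValid(node: AnyNode): boolean {
  const allowed = allowedChildren[node.kind];
  return (node.children as AnyNode[]).every(
    (child) => allowed.indexOf(child.kind) !== -1 && isValid(child)
  );
}

// A blockquote (container) holding a paragraph (leaf) holding nested inlines.
const sample: DocumentNode = {
  kind: "document",
  children: [
    {
      kind: "container_block",
      type: "blockquote",
      children: [
        {
          kind: "leaf_block",
          type: "paragraph",
          children: [
            {
              kind: "inline",
              type: "emphasis",
              children: [{ kind: "inline", type: "text", children: [] }],
            },
          ],
        },
      ],
    },
  ],
};

const sampleIsValid = isValid(sample);
```

The static types already rule out most bad nestings at compile time; the runtime table is there for trees that arrive untyped.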

Specific nodes have additional rules on top of these very basic ones.

There is probably stuff I've missed here; this will be updated as we go.

Leaf blocks

Inline

Container Blocks

Bugs requiring tests to be closed by this:

boehs · 2021-10-19T02:52:50Z

What is the plan for the actual parser layout? I think the dream for me is that, instead of hard-coded parsing like in most parsers, each inline, leaf and container is somehow "registered", like a plugin. This makes it amazingly easy to add new ones.

I don't know how this would look, but basically an abundance of modularity.

pngwn · 2021-10-19T12:48:42Z

The parser design goes more in the opposite direction: custom syntax extensions won't be allowed. This is partly for performance and partly for simplicity. Markdown is bad enough on its own (although this flavour is simplified); add HTML + svelte into the mix and it gets even more complex, and that is before there are any syntax extensions. While it would be easy to register them, every time you add more syntax you increase the chance of collisions.

If there is anything compelling then it could be added to the core as a new node, but I think most cases can be handled via the nodes that are already there using custom transforms + compilers. This is just the parser and will output an AST (will be specced and versioned). The process will broadly speaking be:

parse(src) -> AST
transform(ast, plugins) -> AST
compile(ast, handlers) -> SvelteComponent / HTML / whatever the custom handlers render

Most mdsvex features will be handled via plugins (there may be one or two exceptions).
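As a rough sketch of how those three steps could compose: everything below is assumed for illustration, including the `AstNode`, `Plugin` and `Handlers` types, the toy paragraph-only `parse`, and the handler signature. The real AST is described above as something that will be specced and versioned separately.

```typescript
// Hypothetical types; the actual AST shape is not defined in this thread.
interface AstNode {
  type: string;
  value?: string;
  children?: AstNode[];
}
type Plugin = (ast: AstNode) => AstNode;
type Handlers = Record<string, (node: AstNode, renderChildren: () => string) => string>;

// Toy parser: every blank-line-separated chunk becomes a paragraph of text.
function parse(src: string): AstNode {
  const paragraphs = src
    .split(/\n{2,}/)
    .map((chunk) => chunk.trim())
    .filter((chunk) => chunk.length > 0)
    .map((value): AstNode => ({ type: "paragraph", children: [{ type: "text", value }] }));
  return { type: "document", children: paragraphs };
}

// Runs each plugin in order; every plugin maps an AST to a new AST.
function transform(ast: AstNode, plugins: Plugin[]): AstNode {
  return plugins.reduce((tree, plugin) => plugin(tree), ast);
}

// Walks the tree and lets the handler for each node type decide the output.
function compile(ast: AstNode, handlers: Handlers): string {
  const render = (node: AstNode): string => {
    const handler = handlers[node.type];
    if (!handler) throw new Error(`No handler for node type "${node.type}"`);
    return handler(node, () => (node.children || []).map(render).join(""));
  };
  return render(ast);
}

// Example plugin: upper-cases every text node without touching the input tree.
const shout: Plugin = (node) => {
  const next: AstNode = { type: node.type };
  if (node.value !== undefined) {
    next.value = node.type === "text" ? node.value.toUpperCase() : node.value;
  }
  if (node.children) next.children = node.children.map(shout);
  return next;
};

// Example handlers rendering to HTML (no escaping; sketch only).
const htmlHandlers: Handlers = {
  document: (_node, children) => children(),
  paragraph: (_node, children) => `<p>${children()}</p>`,
  text: (node) => node.value || "",
};

const html = compile(transform(parse("hello\n\nworld"), [shout]), htmlHandlers);
// html === "<p>HELLO</p><p>WORLD</p>"
```

Swapping `htmlHandlers` for a different handler set changes the output target without touching `parse` or the plugins, which is the separation the three steps aim for.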

Is there any specific functionality you had in mind when thinking about modularity?

This is just for the markdown parser. There will be 3 parsers: markdown, svelte, mdsvex (svelte + markdown).

boehs · 2021-10-19T16:56:24Z

my problem with markdown parsers is

ideally each syntax piece has its own file

pngwn · 2021-10-19T16:59:37Z

Why does that matter? As long as the maintainers are comfortable maintaining it then it isn't an issue.

I also have a preference for large files over many files. It has no bearing on the API and users won't know the difference. In any case, this parser will be no different: most things will be inlined for performance reasons.

Chaostheorie · 2022-06-02T07:49:14Z

Just my 2cts, but what stands in the way of a generated parser (like lezer) for generating the AST from the markdown/svelte source code?
While the grammar may end up being complex, it should be a good middle ground between a modular and a fully self-written approach. It may also allow you (given the generator is popular) to use an existing markdown grammar and just extend the svelte parts as you see fit.

Edit: It seems I underestimated markdown's parsability, though the lezer implementation may still be a good starting point.

pngwn moved this to In Progress in mdsvex Oct 16, 2021

pngwn added this to the 1.0 milestone Oct 16, 2021

pngwn moved this from In Progress to Todo in mdsvex Jan 19, 2022

rmunn mentioned this issue Mar 3, 2022

Code block demo/sandbox functionality? #432

Open

pngwn added this to mdsvex Feb 22, 2024

pngwn changed the title ~~New markdown parser~~ New base markdown parser Feb 22, 2024

pngwn mentioned this issue Feb 23, 2024

parse svelte constructs #568

Closed

27 tasks

pngwn moved this to Todo in mdsvex Feb 23, 2024

pngwn closed this as completed Aug 17, 2024

github-project-automation bot moved this from Todo to Done in mdsvex Aug 17, 2024
