Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Native markdown overhaul #1481

Closed

Conversation

Omikhleia
Copy link
Member

@Omikhleia Omikhleia commented Jul 16, 2022

Closes #1336

A work-in-progress to at least get the existing "native markdown" route working decently.

First 3 commits are mainly to make it work again...

  1. Upgrade our local copy of lunamark to master as of July. 17, 2022
    Rationale: the old version was close to 200 locals in a function, meaning we could hardly modify it... And one bug or two were also fixed apparently)
  2. Hack lunamark so that it can output an AST again
    Rationale: that master version removed does not have the ast writer (from our earlier fork, see note below) and enforces text output. It's a one-liner to fix, and allows us to work with the AST directly again...
  3. Update our markdown class so it works minimally again...
    ... and even with a few improvements, e.g.
    • Links (to URL)
    • SVG, hrule, etc.
    • Lists since we now have them
  4. Actually split the markdown class to a markdown package, usable with other classes

Subsequent commits are additional features (Pandoc-like extensions) with corresponding changes in lunamark:

  • Strikethrough ~~deleted~~
  • Div and Span, with SILE then supporting:
    • Underlining [text]{.underline},
    • Small caps [text]{.smallcaps}
    • Language change, inline [text]{lang=fr} or as fenced colon block, ::: {lang=fr} ...)
  • Image attributes ![](image.png){ height=..., width=... } (keys passed through to SILE, so our units work there)
  • Raw inlines and row blocks, {=xxx} after a simple inline code (with backticks) or as infostring on a fenced block code (backticks or tildes), with SILE support for SILE-code and Lua, as {=sile}, {=sile-lua}
  • As an extra goodie, fenced block code can now also use an extended attribute infostring (i.e. besides the usual, say, python, one can use {.python key=value})
  • Subscripts and superscripts, e.g. respectively H~2~O, 2^10^
  • Task lists (from GitHub-Flavored markdown) as first-class citizens
  • Fancy lists (à la Pandoc)
  • Pipe tables
    • From a partial backport of wikito/markdown see Content slicing, table support, and miscellaneous fixes jgm/lunamark#31 = it's functional, but the lunamark documentation, appropriate credits and license clearing all need to be addressed.
    • Mapped to my "ptable" package = The branch currently has an "all-in-one" single-commit backport of it, so I could proceed further and test it. This will have to be removed when ptable eventually makes it...

Eventually, when judged stable enough the changes to lunamark shall be proposed to upstream to its repository. UPDATE: PR submitted.

At this points, I am taking a short break. Future tasks ought to be:

@alerque
Copy link
Member

alerque commented Jul 16, 2022

2. Hack lunamark so that it can output an AST again

From a real quick glance, it looks like the function we're overriding to hack this in is a publicly exposed one. Can we use lunamark as a regular dependency and just patch that function internally instead of having a vendored copy again?

@Omikhleia
Copy link
Member Author

Can we use lunamark as a regular dependency and just patch that function internally instead of having a vendored copy again?

We could. I'd want to propose the fixes and improvements I am preparing to lunamark, though, eventually, and see how the owner reacts. Another possibility would be to maintain our own fork of it (outside), if the owner is not interested.

@alerque
Copy link
Member

alerque commented Jul 17, 2022

Can we use lunamark as a regular dependency and just patch that function internally instead of having a vendored copy again?

We could. I'd want to propose the fixes and improvements I am preparing to lunamark, though, eventually, and see how the owner reacts. Another possibility would be to maintain our own fork of it (outside), if the owner is not interested.

If we can get away with just monkey patching a function after we load the upstream library lets start with that. Of course proposing the fixes upstream is fine too, but until such a time as they are accepted and released we still need a deploy path that works for us. Forking is a potential if the upstream was really unmaintained, but as long as we can't reasonably be the canonical fork, using a fork is problematic for packaging and distribution. At that point we would have to vendor it like it is now.

@Omikhleia Omikhleia force-pushed the native-markdown-improvements branch from 9e6e31c to 8fc7d4a Compare July 17, 2022 11:14
@Omikhleia
Copy link
Member Author

Actually I should have read the comments more thoroughfully, our markdown class stated in its initial comment

You will need my lunamark fork from https://github.com/simoncozens/lunamark for the AST writer.

So we already had kind of a fork. Oh well, let's proceed anyway, and sort the mess later 🌵 (but this note is added as a reminder)

@Omikhleia
Copy link
Member Author

Description updated, with a task list for subsequent steps.
I'm having a break on this, until things move on both the lunamark and SILE sides.

@Omikhleia
Copy link
Member Author

Omikhleia commented Jul 19, 2022

Lol ?

image

EDIT: So for some reason I would have to investigate, the broken markdown class now conflicts.

Am I enough annoyed?

@alerque
Copy link
Member

alerque commented Jul 19, 2022

I'll fix the merge conflicts, don't worry about those.

@alerque
Copy link
Member

alerque commented Jul 20, 2022

I just handled the merge conflicts for you. Note that since this brings in the new modular inputter setup in from master, the merge splits up what you had all in the markdown class into two places: a markdown inputter and the markdown class.

As you started to setup, using the markdown reader also sets the document class to be markdown by default. We could work on exactly how that works, but for now it seemed like a good place to start.

You can call this on a markdown file by requesting the inputer be loaded at run time:

$ sile -r inputter.markdown myfile.md

The output should go straight to PDF as usual.

Don't worry too much about the exact details of your commits if you keep working on this. I see some of these need some refactoring or are even temporary, but I don't mind doing the rebase work to get them cleaned up if needed.

@alerque
Copy link
Member

alerque commented Jul 20, 2022

I handled the merge conflicts locally, but for some reason I can't push them here. I don't think you enabled the "maintainers can push to this branch" option on this PR, perhaps you'd like to. In case not I stuck a copy of the branch with my merges on the end here and you could pull in my changes from there to your local branch.

@Omikhleia
Copy link
Member Author

I don't think you enabled the "maintainers can push to this branch"

I actually just disabled it on all my PRs ^^

As of this one (a draft), the mardown class will probably have to go away at some point, in favor of a package that may be used in any class (not necessarily deriving from the current book class). I am not yet there, though.

@alerque
Copy link
Member

alerque commented Jul 20, 2022

I actually just disabled it on all my PRs

However you like. The merge is pretty involved though so I do suggest you at least merge my PR and then move on from there.

the mardown class will probably have to go away at some point, in favor of a package that may be used in any class

That sounds like a reasonable refactor, and roughly what I already did with the pandoc package that handles SIL content generated by Pandoc itself (including span attributes, div blocks with classes, etc.) I've been using that in production a long time. I'm actually kind of curious what part of that output wasn't sufficient for your original use case, but it's not a big deal. I don't have an issue seeing more than one working solution.

One caveat: we'll probably want a class anyway to trigger the right package(s) to load, but it can be pretty slim and many of the content processing functions can move to a package.

I think it would also be possible to setup a class with a classoption that determined the parent: i.e. we could have a markdown class that optionally inherited from plain or book depending on a value passed in with -O. I could look into that if you think it would be useful.

@Omikhleia
Copy link
Member Author

Omikhleia commented Jul 23, 2022

N.B.

  • Started to squash some commits (notably those on our lunamark vendored copy, now that this is handled elsewhere too)
  • Ditched one of the subsequent steps (header attributes, labelrefs-based cross-references). Not mandatory to get something better than we previously had, and this is too problematic (needing book class support), so I am not going to address it here.

@Omikhleia Omikhleia changed the title Native markdown improvements Native markdown overhaul Jul 23, 2022
@Omikhleia Omikhleia force-pushed the native-markdown-improvements branch from 4d739be to cbe64e9 Compare July 23, 2022 23:49
@Omikhleia Omikhleia force-pushed the native-markdown-improvements branch 3 times, most recently from 6e56fb2 to dbc6e55 Compare July 27, 2022 18:03
@Omikhleia Omikhleia force-pushed the native-markdown-improvements branch from dbc6e55 to eaeabad Compare August 6, 2022 17:57
@Omikhleia
Copy link
Member Author

Update:

  • Rebased on current master (b339e27) and updated to 0.14 API
  • Inputter and packages split
  • Overall:
    • Test fails (at least) because of the quick "ptable" backport and similar hiccups, so I haven't bothered yet.
    • Still, it runs smoothly, even via CLI (e.g. SILEX -u inputters.markdown -u packages.autodoc documentation/c13-markdown.md)

In the meantime, lunamark has moved merging PR 36, so work can resume after my return from vacations.

@Omikhleia Omikhleia force-pushed the native-markdown-improvements branch from 756f2fd to 104adae Compare August 7, 2022 18:15
@Omikhleia Omikhleia force-pushed the native-markdown-improvements branch from 104adae to a154871 Compare August 12, 2022 21:21
@Omikhleia
Copy link
Member Author

Update:

  • Rebased on current master (58da8ad)
  • Chored lunamark "vendored" version to master as of Aug. 12, 2022 - which now has all the minimum features we need here, yay.

@Omikhleia Omikhleia force-pushed the native-markdown-improvements branch from c3c28fa to 275f376 Compare August 13, 2022 22:16
@Omikhleia
Copy link
Member Author

Update:

  • The markdown inputter generates a more compact AST + minor robustness fix.
  • Framework for testing the inputter with expectations vs. reconstructed "human-readable" SIL-like reconstruction of the AST.

I think this is getting quite close to completion now (note). Besides more test cases, all we now need is the ptable and textsubsuper packages to be available in master (so as to remove the shims here, rebase the PR orderly and do some proper refactor before review). @alerque What do you think would be a decent time-frame for that to occur?

(note) There are of course still areas that could be extended, but they'd all require more work on lunamark (e.g. link attributes, line blocks) and/or adequate SILE building-blocks (e.g. external formatters).

@Omikhleia Omikhleia force-pushed the native-markdown-improvements branch from 275f376 to c652ad8 Compare August 14, 2022 07:00
@Omikhleia Omikhleia force-pushed the native-markdown-improvements branch from c652ad8 to b2329cc Compare August 14, 2022 07:07
@Omikhleia
Copy link
Member Author

Omikhleia commented Aug 17, 2022

I am refactoring this as an "external" luarocks package.

@Omikhleia Omikhleia closed this Aug 17, 2022
@Omikhleia
Copy link
Member Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

markdown class is broken
2 participants