Integrate with jgm/pandoc #25

Closed
2 of 3 tasks
Tracked by #120
Witiko opened this issue Feb 18, 2018 · 6 comments
Labels: feature request, latex (related to the LaTeX interface and implementation), tug 2021 (related to the TUG 2021 conference)

@Witiko
Owner

Witiko commented Feb 18, 2018

Currently, a user of the Markdown package is restricted in their choice of syntax extensions to the ones provided by the Lua parser implemented in markdown.lua. To provide experimental ground for implementing new syntax extensions, support for the internal abstract syntax tree (AST) format of the jgm/pandoc converter will be added.

Currently, jgm/pandoc can be used to provide conversion from various input formats to Markdown:

pandoc -f docx -t markdown input.docx -o input.md

The Markdown package can then convert the Markdown document to its TeX abstract syntax tree (TeX AST) format:

texlua /path/to/markdown-cli.lua input.md input.tex

This representation can then be typeset. This is useful, but limited to the Markdown syntax extensions supported by our Lua parser.


The plan is to provide a jgm/pandoc Lua writer (see jgm/pandoc issues 4341 and 1541 for further information) that will directly convert the jgm/pandoc AST to the TeX AST, circumventing the Lua parser altogether:

pandoc -f docx -t /path/to/markdown-pandoc_writer.lua input.docx -o input.tex

Adding initial support for a new syntax extension already supported by jgm/pandoc will then be as easy as adding a new procedure to the writer and defining the corresponding \markdownRenderer… macros. Full support can be added later by extending our Lua parser.
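To illustrate what "adding a new procedure to the writer" could look like, here is a minimal sketch of a classic-style jgm/pandoc custom Lua writer, in which each AST element is handled by a global function named after it. This is not the planned markdown-pandoc_writer.lua. The mapping of elements to \markdownRenderer… macros, the line breaks around the separator and document macros, and the omission of TeX escaping are all assumptions made for illustration.

```lua
-- Hypothetical sketch of a classic-style pandoc custom writer that emits
-- \markdownRenderer... macros. The macro choices and the exact output layout
-- are assumptions; special TeX characters are NOT escaped here.

-- Inline elements
function Str(s) return s end
function Space() return " " end
function SoftBreak() return " " end
function Emph(s) return "\\markdownRendererEmphasis{" .. s .. "}" end
function Strong(s) return "\\markdownRendererStrongEmphasis{" .. s .. "}" end

-- Block elements
function Plain(s) return s end
function Para(s) return s end

-- Separator inserted by pandoc between consecutive blocks
function Blocksep()
  return "\n\\markdownRendererInterblockSeparator\n"
end

-- Heading levels 1 to 6 are mapped to the HeadingOne ... HeadingSix macros
local heading_names = { "One", "Two", "Three", "Four", "Five", "Six" }

function Header(level, s, attr)
  return "\\markdownRendererHeading" .. heading_names[level] .. "{" .. s .. "}"
end

-- Wraps the rendered body of the whole document
function Doc(body, metadata, variables)
  return "\\markdownRendererDocumentBegin\n" .. body ..
         "\n\\markdownRendererDocumentEnd"
end
```

Under this scheme, initial support for another element would amount to adding one more function with the element's name and defining the matching \markdownRenderer… macro on the TeX side.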


TODOs:

@Witiko
Owner Author

Witiko commented Feb 18, 2018

Adding a jgm/pandoc reader for reconstructing a jgm/pandoc AST from a TeX AST would also be beneficial. This reader would be more of a plumbing tool for restoring a document from the intermediary TeX AST files in cases where the original sources are unavailable. However, there does not seem to be any Lua API for readers, so we would either need to abuse the AST and create a Lua filter, or create a Haskell reader and contribute it to jgm/pandoc, as discussed in jgm/pandoc#1541 (comment). Similarly to a Haskell reader, we could also contribute a Haskell writer that would replace the Lua writer in the long run and would be maintained as a part of jgm/pandoc.

Witiko changed the title from "Add a Pandoc writer" to "Integrate with Pandoc" Feb 18, 2018
Witiko changed the title from "Integrate with Pandoc" to "Integrate with jgm/pandoc" Feb 18, 2018
Witiko removed this from the 2.7.0 milestone May 5, 2019
Witiko assigned Witiko and unassigned Witiko Aug 8, 2021
Witiko added this to the 2.11.0 milestone Aug 8, 2021
Witiko added the latex and tug 2021 labels Aug 8, 2021
@Witiko
Owner Author

Witiko commented Aug 10, 2021

An exhaustive specification of the elements of Pandoc's AST format is available on Hackage.
The full list of Lua functions reserved for Lua writers is available in jgm/pandoc's src/Text/Pandoc/Writers/Custom.hs.

@Witiko
Owner Author

Witiko commented Sep 14, 2021

[...] create a Haskell reader and contribute it to jgm/pandoc, as discussed in jgm/pandoc#1541 (comment). Similarly to a Haskell reader, we could also contribute a Haskell writer that would replace the Lua writer in the long run and would be maintained as a part of jgm/pandoc.

@drehak I have created a development environment for Pandoc using Docker at witiko/pandoc-devenv. We can use it to develop Haskell readers and writers for Pandoc without littering our base OS with Haskell. Those who want to litter their base OS can take inspiration from our Dockerfile.

@Witiko Witiko modified the milestones: 2.11.0, 2.12.0, 2.13.0 Oct 1, 2021
@Witiko
Owner Author

Witiko commented Dec 30, 2021

A preliminary analysis for the implementation has been authored by @drehak and published in the CSTUG Bulletin 2021/1-4 (landing page, PDF).

@Witiko
Owner Author

Witiko commented Feb 27, 2022

@drehak We should aim to close this issue before milestone 2.15.0 (due on March 31), since the defense of your student project will likely take place before then and also because we'd like to publicize your proof of concept in a journal article for TUGboat 43:1 (also due on March 31, see #120).

@Witiko
Owner Author

Witiko commented Apr 5, 2022

Tentative roadmap for @xrehak's bachelor's thesis

Future work
