[Enhancement] RAW format #6800

Closed
2 tasks
sebi5361 opened this issue Nov 2, 2020 · 8 comments

Comments

sebi5361 commented Nov 2, 2020

Enhancement: Adding the raw format

Sometimes a file has to be preprocessed before it is pandoc-compatible.
Adding a raw format would allow using the pandoc CLI syntax to run such preprocessing like a standard pandoc filter.

Illustration

My Markdown file contains @sinx@, which is a common way to delimit AsciiMath notation.
I need to rewrite it to $:a sinx$ so that I can use yuwash/asciimathml-pandocfilter afterwards.
As @ is a special character in pandoc's Markdown, this substitution needs to be done beforehand.
Currently I am using a pre-pandoc parser, but with the added raw format a separate preprocessing step might no longer be needed.
The raw format would prevent pandoc from parsing its input to its internal dialect, thus allowing filters not based on pandoc grammar necessarily.
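
A minimal sketch of the substitution described above, assuming each AsciiMath span sits on a single line and that @ is not used for anything else in the file (citations such as @key or email addresses would be mangled). The function name asciimath_preprocess is hypothetical and only stands in for the pre-pandoc parser, whose actual implementation is not shown in this thread.

```shell
# Hypothetical stand-in for ./my_pre_pandoc_parser (not the author's code).
# Reads Markdown on stdin and rewrites every @...@ span to $:a ...$.
asciimath_preprocess() {
  # [^@]+ keeps each match between one pair of @ signs;
  # sed works line by line, so spans cannot cross line breaks.
  sed -E 's/@([^@]+)@/$:a \1$/g'
}

# Possible usage (requires pandoc, not run here):
#   asciimath_preprocess < file.md | pandoc -f markdown ...
```

Because the function only reads stdin and writes stdout, it slots into the pipe form shown under Commands without any change to pandoc itself.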

Commands

Typing
cat file.md | pandoc -f raw -F my_pre_pandoc_parser_becoming_a_pandoc_standard_filter_now | pandoc...
rather than
cat file.md | ./my_pre_pandoc_parser | pandoc...


  • Is this a good idea?
  • Can it be done easily?
Collaborator

mb21 commented Nov 2, 2020

I think this already exists and it's called -f native ..?

Or maybe I misunderstand.. from your example seems that the proposed raw format wouldn't do anything? What's wrong with this?

cat file.md | ./my_pre_pandoc_parser | pandoc...

a filter is not a preprocessor...

Contributor

alerque commented Nov 2, 2020

The raw format would prevent pandoc from parsing its input to its internal dialect, thus allowing filters not based on pandoc grammar necessarily.

What you are asking for is simply nonsensical. The filter system operates on the native internal AST format. What you want is a preprocessor, not a filter. So just preprocess it. If you want syntax sugar, try something like pandoc <(my_preprocessor file.md) ...

If your preprocessor converts to Pandoc's AST format then you can read it in as JSON or native syntax.

Author

sebi5361 commented Nov 2, 2020

Indeed, what I was asking for is nonsensical once one understands the philosophy behind pandoc.
I have nothing against the syntax of using a preprocessor; it was indeed more about syntax sugar.
Thanks so much for the clarification 😃
I am closing this thread.

sebi5361 closed this as completed Nov 2, 2020
Collaborator

tarleb commented Nov 2, 2020

I think @sebi5361's proposal is not too far from some ideas mentioned in #6393.

Collaborator

tarleb commented Nov 2, 2020

BTW, on Mac and Linux you can simulate a raw reader like so:

pandoc <(printf '```{=raw}') input.file <(printf '```')

This will produce a single RawBlock element which can then be processed further.
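
For shells without process substitution, roughly the same wrapping could be sketched as a small function that writes the fences around the file and pipes the result to pandoc. The name wrap_raw is hypothetical, and the sketch assumes the input contains no line made of three or more backticks, since such a line would close the fence early.

```shell
# Hypothetical helper: print FILE wrapped in a {=raw} fenced block.
wrap_raw() {
  # Build the fence from octal escapes (\140 is a backtick).
  fence=$(printf '\140\140\140')
  printf '%s{=raw}\n' "$fence"
  cat -- "$1"
  # The extra newline guards against files lacking a trailing newline.
  printf '\n%s\n' "$fence"
}

# Possible usage (requires pandoc, not run here):
#   wrap_raw input.file | pandoc -f markdown ...
```

As with the process-substitution form, a later step still has to decide what to do with the resulting raw element; the wrapping alone does not make pandoc parse the content as Markdown.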

Contributor

alerque commented Nov 2, 2020

@tarleb I think @sebi5361 is expecting the input to still be processed as Markdown by Pandoc.

Collaborator

tarleb commented Nov 2, 2020

Thanks @alerque; I should have read more closely. 👍

Author

sebi5361 commented Nov 2, 2020

Very close indeed to what @tarleb mentioned.
Good to know that content can be wrapped in a {=raw} block to get a RawBlock.
Thanks to all of you for your responsiveness and explanations 😀
