Better engine than pandoc/gfm? #5
html2text's output isn't great either:
[fluffy](http://beesbuzz.biz/):
[Reblob!](http://beesbuzz.biz/blog/5385-Reblob):
> [Reblob!](http://publ.beesbuzz.biz/blog/179-Reblob):
>
>> It’s been a while since I’ve worked on IndieWeb stuff, but I finally got
around to releasing an _extremely preliminary_ version of
[reblob](http://publ.beesbuzz.biz/tools/1423-reblob), a little commandline
thingus to make this stuff easier. Eventually I’ll also have a server-based
version here, at least as an example.
>
> Of course this is the first entry I’ve written actually _using_ it. Lots of
rough edges but whatever!
which renders as:
Found this through your tweet. There might be a way to use one of pandoc's many customization options to fix this. E.g., you could try to remove soft line-breaks by using a pandoc filter:

    function SoftBreak ()
      return pandoc.Space() -- replace soft linebreak with a space
    end

Use by calling pandoc with
@tarleb Not particularly; the way that pandoc works through pypandoc makes that incredibly unwieldy. There's also no reason to do that in a pandoc filter: see the branch https://github.com/PlaidWeb/reblob/tree/feature/5-trim-end-whitespace for a simple fix on the Python side. Even with that, there's a lot of stuff pypandoc does poorly that can't be easily addressed by setting markdown plugins either. The Mastodon version of the thread goes into more detail about that.
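For readers following along, a Python-side cleanup of this general kind could look like the sketch below. This is not the code on the linked branch; the function name `trim_trailing_whitespace` is hypothetical, and it assumes the converter's Markdown output is available as a single string.

```python
def trim_trailing_whitespace(markdown_text: str) -> str:
    """Strip trailing whitespace from each line and drop trailing blank lines.

    Hypothetical helper for illustration only. Note that this also removes
    Markdown hard line breaks that are written as two trailing spaces.
    """
    lines = [line.rstrip() for line in markdown_text.splitlines()]
    # Remove blank lines left over at the very end of the document.
    while lines and not lines[-1]:
        lines.pop()
    # End non-empty output with exactly one newline.
    return "\n".join(lines) + "\n" if lines else ""
```

It would run on the converted text before it is written out, for example `cleaned = trim_trailing_whitespace(converted)`, where `converted` stands for whatever string the conversion step returned.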
There are also a bunch of other reasons I want to get off pandoc: the Python bindings to it make a lot of assumptions about the environment that won't work for one of my intended future use cases, and it's just, like, not very well-controlled in general. I can also think of a fairly straightforward way to convert HTML to Markdown in a way that will also allow me to put in Publ-markdown extensions. I was hoping reblob would also be able to support things like reStructuredText for folks who use that on their blog engine, though.
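To make the "fairly straightforward" HTML-to-Markdown idea concrete, here is one possible shape for it, built only on the standard library's `html.parser`. It is a sketch under assumptions, not reblob's actual implementation: the class `HtmlToMarkdown` and the function `html_to_markdown` are hypothetical names, only `p`, `a`, `em`/`i`, `strong`/`b` and `blockquote` are handled, and Markdown special characters in the text are not escaped.

```python
from html.parser import HTMLParser


class HtmlToMarkdown(HTMLParser):
    """Convert a small HTML subset to Markdown (illustrative sketch only)."""

    # Markdown delimiters for the inline tags this sketch understands.
    INLINE = {"em": "_", "i": "_", "strong": "**", "b": "**"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []   # finished paragraphs as (quote_depth, text)
        self.current = []  # text fragments of the paragraph in progress
        self.depth = 0     # current blockquote nesting level
        self.links = []    # stack of href values for open <a> tags

    def flush(self):
        """Finish the paragraph in progress, collapsing runs of whitespace."""
        text = " ".join("".join(self.current).split())
        if text:
            self.blocks.append((self.depth, text))
        self.current = []

    def handle_starttag(self, tag, attrs):
        if tag in self.INLINE:
            self.current.append(self.INLINE[tag])
        elif tag == "a":
            self.links.append(dict(attrs).get("href") or "")
            self.current.append("[")
        elif tag in ("p", "blockquote"):
            self.flush()
            if tag == "blockquote":
                self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.INLINE:
            self.current.append(self.INLINE[tag])
        elif tag == "a" and self.links:
            self.current.append("](%s)" % self.links.pop())
        elif tag in ("p", "blockquote"):
            self.flush()
            if tag == "blockquote":
                self.depth = max(0, self.depth - 1)

    def handle_data(self, data):
        self.current.append(data)

    def result(self):
        """Join paragraphs, one per line, with quote-aware separator lines."""
        self.flush()
        lines = []
        previous_depth = None
        for depth, text in self.blocks:
            if previous_depth is not None:
                # Keep the shared quote level open between paragraphs.
                lines.append(">" * min(previous_depth, depth))
            lines.append(">" * depth + (" " if depth else "") + text)
            previous_depth = depth
        return "\n".join(lines) + "\n" if lines else ""


def html_to_markdown(html: str) -> str:
    parser = HtmlToMarkdown()
    parser.feed(html)
    parser.close()
    return parser.result()
```

Because each paragraph is emitted as a single unwrapped line, there are no soft line breaks to clean up afterwards, and the tag handlers are the natural place to hook in engine-specific output. Supporting other formats such as reStructuredText would need a separate emitter.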
Pandoc's gfm backend produces markdown like:
which formats like
(from this entry).
html2text might be better, but that loses the ability to support other output formats. There might also be some better Pandoc configurations that could be used.