Math (MathML and/or TeX math) #59

Open
SimonSapin opened this issue Mar 21, 2013 · 27 comments
Labels
feature New feature that should be supported

Comments

@SimonSapin
Member

Some way to include math equations in WeasyPrint documents would be nice, preferably in vector form.

Possible leads include:

@yvess

yvess commented Oct 30, 2013

I'm also looking for this feature, or at least for inline SVG support, as mentioned in #75.

PhantomJS (http://phantomjs.org/) could be used to preprocess the HTML with JavaScript (needed for MathJax).
You could then get the output with the SVG in it.
But this would need proper inline SVG support.

I also tried the HTML+CSS renderer for MathML, but the output doesn't look good.
Compare
http://dl.yas.ch/mathml-nojs-css.html
with
http://dl.yas.ch/math-css.pdf (generated with WeasyPrint).

Still, this could be a starting point.

@stroobandt

The lack of math support has also been a showstopper for me. Personally, I use MathJax to display math on my web pages. This was suggested and works great with Pandoc, the tool I use to generate XHTML from Markdown.
As mentioned above, I could live with SVG math graphics but not with PNG, GIF or the like.

@SimonSapin
Member Author

@serge-stroobandt Thank you for informing us of this life-threatening showstopper. I will be looking forward to your contribution.

@SimonSapin
Member Author

Apparently Wikipedia wants to run MathJax server-side using PhantomJS. See MathJax’s wiki.

@SimonSapin
Member Author

The new kid on the block is https://github.com/Khan/KaTeX. It apparently supports "server-side" rendering in Node.js.

@cben

cben commented Jan 8, 2015

  • Pandoc has several ways to output math; without any options like --mathjax it tries pure HTML + unicode (falling back to tex source where it fails): http://johnmacfarlane.net/pandoc/demo/mathDefault.html
    While not very pretty, it may be an immediate workaround...
  • KaTeX can produce static HTML + CSS + web fonts. These come in 4 formats including .ttf; I suppose these can be used by Pango. Note that KaTeX's math support is much poorer than MathJax — notably no array/matrix support, and many symbols missing.
  • MathJax 2.5 also added a browser-independent HTML + CSS output ("CommonHTML") but so far it's quite ugly, only usable as a quick preview until MathJax renders it better. (They plan to improve it, among other things by using web fonts.)
  • Server-side MathJax has evolved into https://github.com/mathjax/MathJax-node, producing MathML or SVG. (If it matters, the SVG embeds any needed chars as <defs>.) It should be easy to use such SVGs as inline data URLs.
    I just tried MathJax-node on a simple formula and converted the result to PDF with cairosvg. It worked, though the fonts are too bold for my taste (MathJax's SVG output generally looks bolder than its HTML output):
    https://pdf.yt/d/7DiwGM5Dxljv-M2w
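
The "inline data URL" idea from the last bullet can be sketched as follows. This is a minimal illustration, not part of MathJax-node or WeasyPrint: the helper name `svg_to_img_tag` and the sample SVG string are hypothetical, and the sketch assumes the renderer hands back the SVG as a plain string.

```python
import base64
from html import escape


def svg_to_img_tag(svg_markup: str, alt: str = "") -> str:
    """Return an <img> tag that embeds the SVG as a base64 data URL."""
    encoded = base64.b64encode(svg_markup.encode("utf-8")).decode("ascii")
    # Escape the alt text so that quotes or angle brackets cannot break the tag.
    return (
        f'<img src="data:image/svg+xml;base64,{encoded}" '
        f'alt="{escape(alt, quote=True)}">'
    )


# Hypothetical stand-in for the SVG string a math renderer would emit.
sample_svg = '<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10"></svg>'
tag = svg_to_img_tag(sample_svg, alt="x/y")
print(tag)
```

Because the image travels inside the HTML itself, the resulting document needs no extra files or network access when it is converted to PDF.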

P.S. If all you need is to convert Markdown into a pretty PDF with math, it raises the question of why to go through HTML rather than LaTeX...

@stroobandt

@cben

"Why go through HTML and not LaTeX...?"

That is a good question with a non-trivial answer.
The short answer is that neither LaTeX nor ConTeXt is intended for unattended typesetting.

Allow me to elaborate on this. Like yourself, I also use Pandoc to generate XHTML web pages from a Markdown source document. This works great! To make the web pages more appealing, I illuminated my document with left floating miniatures with text wrapping around these. This renders fine with XHTML and CSS.

However, both LaTeX and ConTeXt fail miserably at dealing with these miniatures in the vicinity of page breaks. The Q&A site tex.stackexchange.com is riddled with my failed attempts, and no expert advice helped.

This is why I switched to unattended typesetting in CSS using the proprietary Prince XML software.
The input is Pandoc-generated XHTML and the PDF output gets page breaking always right, both for Letter and A4 paper format. An unexpected additional advantage is the sheer speed of rendering with Prince XML in comparison to both LaTeX and ConTeXt.

However, one problem remains: math support. MathJax cannot be used with Prince XML because its JavaScript support is incomplete. Up to now, I have been using Prince XML with MathML, but the quality of the output is lousy.

Thanks to your SVG comment, cben, I revisited the problem and came up with a proposal for SvgTex support in Pandoc. This would allow injecting SVG math into the Pandoc-generated HTML, making the HTML completely standalone for math visualisation. It would also solve the Prince XML math issue.

Please, express your support in favour of this proposal over at the Pandoc forum!

Thanks!

@stroobandt

I wrote a Haskell Pandoc JSON filter for SvgTex myself. The code is up at
https://groups.google.com/forum/#!msg/pandoc-discuss/MJggAXUmOII/PjkR8ILr_58J

dhimmel added a commit to dhimmel/manubot-rootstock that referenced this issue Nov 1, 2017
WeasyPrint was not renderring MathJax, so use --webtex to convert
equations to SVGs for PDF  output. See
Kozea/WeasyPrint#59
dhimmel added a commit to manubot/rootstock that referenced this issue Nov 1, 2017
Pandoc upgraded to v2.0.0.1

Swap anaconda packages to conda-forge

Update build.sh for pandoc 2.0

WeasyPrint was not renderring MathJax, so use --webtex to convert
equations to SVGs for PDF output. See
Kozea/WeasyPrint#59
dhimmel added a commit to manubot/rootstock that referenced this issue Nov 1, 2017
This build is based on
ec186c2.

This commit was created by the following Travis CI build and job:
https://travis-ci.org/greenelab/manubot-rootstock/builds/295841617
https://travis-ci.org/greenelab/manubot-rootstock/jobs/295841618

[ci skip]

The full commit message that triggered this build is copied below:

Update environment and documentation (#81)

Pandoc upgraded to v2.0.0.1

Swap anaconda packages to conda-forge

Update build.sh for pandoc 2.0

WeasyPrint was not renderring MathJax, so use --webtex to convert
equations to SVGs for PDF output. See
Kozea/WeasyPrint#59
@mb21

mb21 commented Nov 7, 2017

Has anyone tested Lasem?

@liZe
Member

liZe commented Nov 7, 2017

Has anyone tested Lasem?

It's exactly what we need, thanks for the link. So sad it's a C library that's not widely included in Windows GTK+ bundles or in Linux default packages.

@SimonSapin
Member Author

Maybe compiling and distributing binary wheels is a viable approach: https://github.com/getsentry/milksnake

@pothos

pothos commented Jun 19, 2018

Math support via pandoc works with a mathjax → SVG img tag filter:
https://github.com/lierdakil/mathjax-pandoc-filter

pandoc --filter ~/node_modules/.bin/mathjax-pandoc-filter -Mmathjax.centerDisplayMath -Mmathjax.noInlineSVG -f markdown+smart -t html5 -o test.html test.md
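
For anyone scripting this, the command above can be wrapped in a few lines of Python and followed by the WeasyPrint step. This is only a sketch: the function names are hypothetical, the pandoc flags are copied from the command above, and it assumes that pandoc, the filter, and the `weasyprint` package are all installed.

```python
import subprocess
from pathlib import Path


def pandoc_command(source: Path, target: Path, filter_path: Path) -> list[str]:
    """Build the pandoc invocation shown above as an argument list."""
    return [
        "pandoc",
        "--filter", str(filter_path),
        "-Mmathjax.centerDisplayMath",
        "-Mmathjax.noInlineSVG",
        "-f", "markdown+smart",
        "-t", "html5",
        "-o", str(target),
        str(source),
    ]


def markdown_to_pdf(source: Path, pdf: Path, filter_path: Path) -> None:
    """Run pandoc, then hand the resulting HTML to WeasyPrint."""
    html_path = pdf.with_suffix(".html")
    subprocess.run(pandoc_command(source, html_path, filter_path), check=True)
    # Imported lazily so that building the command works without WeasyPrint.
    from weasyprint import HTML
    HTML(filename=str(html_path)).write_pdf(str(pdf))
```

A call such as `markdown_to_pdf(Path("test.md"), Path("test.pdf"), Path.home() / "node_modules/.bin/mathjax-pandoc-filter")` would then mirror the shell command, with the PDF conversion added.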

@ousia

ousia commented Oct 22, 2018

@stroobandt,

I don’t know what you mean by unattended typesetting. ConTeXt deals with XML natively and typesets XML sources. In fact, I generate XHTML output with pandoc and typeset these XHTML files with ConTeXt directly (no pandoc-to-ConTeXt conversion).

Sorry for the question, but the issue is relevant to me. It seems almost no one knows that ConTeXt may be similar (or even superior) to Prince XML.

@mb21

mb21 commented Oct 22, 2018

@ousia But ConTeXt doesn't interpret CSS, right?

@ousia

ousia commented Oct 22, 2018

@mb21, ConTeXt doesn’t interpret CSS, only elements and attributes (pandoc does exactly the same).

Instead of CSS (which Prince XML uses), ConTeXt needs an environment file. It might be considered similar to an XSL document.

A detailed explanation may be found at http://www.pragma-ade.com/general/manuals/xml-mkiv.

@ousia

ousia commented Oct 23, 2018

@mb21, I forgot https://www.speedata.de/en/product/ is able to generate PDF files from XML sources and it may be configured with CSS files (at least, partially).

Unfortunately, it doesn’t support MathML (see speedata/publisher#107).

@stroobandt

@ousia Below is an answer I once posted on tex.stackexchange.com to a question about the typesetting limitations of LaTeX. However, this limitation equally applies to ConTeXt.
There are even more answers to this very same question.

In a nutshell, unattended typesetting for me is hitting F5 on a very straightforward Markdown document and being presented shortly thereafter with perfectly laid-out PDF documents in both A4 and Letter format.

I tried very hard achieving this with both LaTeX and ConTeXt, but results were slow to produce and not satisfactory for all but the most basic documents.
However, PrinceXML allowed me to do so.

For example, have a look at the PDF versions of this Markdown document on my hobby website.


With markup converters like Pandoc it is now possible to generate LaTeX documents without ever touching any LaTeX code.

However, obtaining aesthetic page breaks for slightly complex documents, for example taking into account figures, widows and orphans may still require manual intervention in the LaTeX code.

Quoting Frank Mittelbach:

This issue describes the fundamental
problem in TeX’s approach: the program builds optimized paragraph
shapes without any knowledge about their final placement on a page.
The result is a “galley” from which columns are cut to a specified
vertical size. A consequence of this is that one can’t have the shape
of a paragraph depend on its final position on the page when using
TeX’s page builder algorithm.

In summary, it seems we are not quite yet at that utopian point where one can blindly write content without ever having to worry about what the output will look like in LaTeX. Anyhow, LaTeX is not really intended for unattended typesetting.

For this reason, I now resort to automatic CSS typesetting with PrinceXML for any content that is longer than a letter.
The PDF printouts on my web site are generated this way without any user intervention. This was not possible with LaTeX 2ε for the reasons mentioned above, even though I tried hard!

If you think of it, HTML+CSS is exactly intended for that: unattended typesetting on screens of unpredictable dimensions. A printed page is merely another media viewport.

On print-css.rocks, one can follow the latest developments in unattended CSS paged media typesetting.

@ousia

ousia commented Dec 17, 2018

In a nutshell, unattended typesetting for me is hitting F5 on a very straightforward Markdown document and being presented shortly thereafter with perfectly laid-out PDF documents in both A4 and Letter format.

Many thanks for your detailed reply, @stroobandt.

I press F9 to generate an XHTML document with pandoc, which is automatically typeset with ConTeXt (see https://github.com/ousia/from-pandoc-to-context/tree/master/doc).

The command that the shortcut triggers is similar to:

pandoc -t html -o file.xml file.md && context --environment=file.tex file.xml

From what I see in your documents, I think that “floats are a pain in TeX” might be a more accurate description. I use almost no float myself.

User intervention may not be required, but I’d say that sed shouldn’t be needed when typesetting from HTML+CSS.

I’m afraid there might be something wrong with your document, since numbered lists have an issue (https://hamwaves.com/cl-ocfd/en/cl-ocfd.a4.pdf#page=39).

@stroobandt

@ousia Hey, thanks for pointing out that issue with the numbered list. I corrected its CSS now.

A couple of months ago, I also made a magazine by processing Pandoc Markdown. I started out with ConTeXt. However, at one point I had to switch over to CSS and PrinceXML, simply because background images and more stuff were quickly getting too complicated to do using ConTeXt.

Using CSS for this is a breeze, and compiling the document with PrinceXML is much faster.

magazine preview

From what I see in your documents, I think that “floats are a pain in TeX” might be a more accurate description. I use almost no float myself.

Not only that. There is the speed of production, which I already mentioned. LaTeX & ConTeXt also have issues with widows and orphans in fully automated, unattended production.

Furthermore, with CSS one has way more control over where to allow page breaks. For example: not allowing page breaks right after a subtitle but one or two paragraphs down. This holds true not only for titles but any combination of paragraphs, figures, tables, formulas, etc.
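
As a rough illustration of this kind of page-break control, the sketch below embeds a small print stylesheet in an HTML document. The rules are standard CSS fragmentation properties (`break-after`, `break-inside`, `orphans`, `widows`), but the particular selectors and values are assumptions chosen for the example, not the stylesheet used for the documents mentioned here; `build_document` is a hypothetical helper.

```python
from html import escape

# Assumed example rules; adjust selectors and values to the document at hand.
PRINT_CSS = """\
@page { size: A4; margin: 2cm; }
/* Never break directly after a subtitle... */
h2, h3 { break-after: avoid; }
/* ...nor after the first paragraph that follows it. */
h2 + p, h3 + p { break-after: avoid; }
/* Keep at least three lines of a paragraph together at page edges. */
p { orphans: 3; widows: 3; }
/* Do not split figures or tables across pages. */
figure, table { break-inside: avoid; }
"""


def build_document(title: str, body_html: str, css: str = PRINT_CSS) -> str:
    """Wrap an HTML fragment in a complete page with the print stylesheet."""
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{escape(title)}</title>\n'
        f"<style>\n{css}</style></head>\n"
        f"<body>\n{body_html}\n</body></html>\n"
    )


document = build_document("Sample", "<h2>Subtitle</h2><p>First paragraph.</p>")
```

The resulting string can be written to a file and passed to a CSS paged-media engine; how faithfully each rule is honoured depends on the engine.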

User intervention may not be required, but I’d say that sed shouldn’t be needed when typesetting from HTML+CSS.

True. However, I could also have implemented this preprocessing step as a Pandoc filter written in Haskell. That would certainly handle edge cases better. I have done so once for inline math. However, so far I have only made baby steps with Haskell. Getting it done in sed is quicker for me.

Anyhow, the sed preprocessing is only required for achieving a couple of niceties like not ending a line with a period followed by a single very short word; for example the article "A" or "The".
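
The sed script itself is not shown in this thread, so the following is only a guess at the technique, written in Python rather than sed: a short sentence-opening word is glued to the word after it with a non-breaking space, so it can never be left dangling at the end of a line. The word list and the function name are assumptions.

```python
import re

NBSP = "\u00a0"  # non-breaking space
SHORT_OPENERS = ("A", "An", "The")  # assumed list of short words to protect

# A short opener that follows sentence-ending punctuation and one space,
# and is itself followed by a space and another word.
_PATTERN = re.compile(
    r"(?<=[.!?] )(" + "|".join(SHORT_OPENERS) + r") (?=\S)"
)


def bind_short_openers(text: str) -> str:
    """Replace the space after a sentence-initial short word with NBSP."""
    return _PATTERN.sub(lambda match: match.group(1) + NBSP, text)


sample = "It ended. The next one began. A third followed."
print(repr(bind_short_openers(sample)))
```

A real pipeline would apply this to text nodes only, which is where a proper Pandoc filter handles markup more safely than a regular expression over raw source.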

@ninest

ninest commented Feb 21, 2019

I'm not sure if this is still an issue. But to implement math, you can use this site and take the images. For example, if I want x/y, I will use the following HTML:

<img src="https://latex.codecogs.com/gif.latex?%5Cfrac{x}{y}">
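
Building such tags by hand gets tedious, so here is a small sketch that percent-encodes a LaTeX string into the same URL shape as the example above. The helper name `codecogs_img` is hypothetical, and the sketch assumes the service keeps accepting URLs of this form. Note that the image is a remote GIF, so rendering needs network access.

```python
from html import escape
from urllib.parse import quote

CODECOGS_GIF = "https://latex.codecogs.com/gif.latex?"


def codecogs_img(latex: str) -> str:
    """Return an <img> tag pointing at the rendered formula."""
    # Braces are left readable, as in the example above; the backslash,
    # spaces and other unsafe characters are percent-encoded.
    url = CODECOGS_GIF + quote(latex, safe="{}")
    return f'<img src="{url}" alt="{escape(latex, quote=True)}">'


print(codecogs_img(r"\frac{x}{y}"))
```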

@mbarkhau
Contributor

mbarkhau commented May 15, 2019

I'm currently working on a Markdown extension that uses the offline rendering of KaTeX. I was hoping this would be a good candidate to use with WeasyPrint as it doesn't require JavaScript.

This is a test page I generated: https://gist.github.com/mbarkhau/ff263164cd162ff1fd734c2b0ce23241

The stylesheet uses some properties that WeasyPrint does not support:

WARNING: Ignored `text-rendering: auto` at 126:3, unknown property.
WARNING: Ignored `width: min-content` at 153:3, invalid value.
WARNING: Ignored `fill: currentColor` at 902:3, unknown property.
WARNING: Ignored `stroke: currentColor` at 903:3, unknown property.
WARNING: Ignored `fill-rule: nonzero` at 904:3, unknown property.
WARNING: Ignored `fill-opacity: 1` at 905:3, unknown property.
WARNING: Ignored `stroke-width: 1` at 906:3, unknown property.
WARNING: Ignored `stroke-linecap: butt` at 907:3, unknown property.
WARNING: Ignored `stroke-linejoin: miter` at 908:3, unknown property.
WARNING: Ignored `stroke-miterlimit: 4` at 909:3, unknown property.
WARNING: Ignored `stroke-dasharray: none` at 910:3, unknown property.
WARNING: Ignored `stroke-dashoffset: 0` at 911:3, unknown property.
WARNING: Ignored `stroke-opacity: 1` at 912:3, unknown property.
WARNING: Ignored `stroke: none` at 915:3, unknown property.
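
To see at a glance which properties are involved, a log like the one above can be tallied with a short script. This is a sketch that assumes the warning lines keep exactly the shape shown here; the regular expression and the function name are not part of WeasyPrint.

```python
import re
from collections import Counter

# Matches lines such as:
# WARNING: Ignored `stroke: none` at 915:3, unknown property.
WARNING_RE = re.compile(
    r"WARNING: Ignored `(?P<prop>[\w-]+): (?P<value>[^`]*)` "
    r"at (?P<line>\d+):(?P<col>\d+), (?P<reason>.+?)\.?$"
)


def ignored_properties(log_text: str) -> Counter:
    """Count ignored declarations per (property, reason) pair."""
    counts = Counter()
    for raw_line in log_text.splitlines():
        match = WARNING_RE.match(raw_line.strip())
        if match:
            counts[(match["prop"], match["reason"])] += 1
    return counts


sample_log = """\
WARNING: Ignored `width: min-content` at 153:3, invalid value.
WARNING: Ignored `stroke: currentColor` at 903:3, unknown property.
WARNING: Ignored `stroke: none` at 915:3, unknown property.
"""
for (prop, reason), count in sorted(ignored_properties(sample_log).items()):
    print(f"{prop}: {reason} ({count})")
```

Separating "unknown property" from "invalid value" is useful because the two call for different fixes: the first means the property is not implemented at all, the second that only a particular value is rejected.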

The file being referred to is https://cdn.jsdelivr.net/npm/[email protected]/dist/katex.css

Despite these warnings, the rendering is still quite good.

This is how it is rendered by WeasyPrint

katex_test_weasyprint

Here in Chrome 74 and Firefox 66

katex_test_chrome

katex_test_firefox

Should I open a separate issue for supporting these properties (assuming that's the reason the rendering is not on par with the browsers)?

@mbarkhau
Contributor

I was able to comment out every line that generates a warning and the rendering remains fine in the browser. In other words, there is something else about the rendering that differs from what the browsers do.

@liZe
Member

liZe commented May 16, 2019

I'm currently working on a Markdown extension that uses the offline rendering of KaTeX.

That's a good idea! Thanks for sharing.

Despite these warnings, the rendering is still quite good.

It is, and that's good news as the HTML and CSS structures are quite complicated.

Should I open a separate issue for supporting these properties (assuming that's the reason the rendering is not on par with the browsers)?

You should. It's pretty hard to debug as the HTML structure is crazy, but we could at least find the reasons why it doesn't work.

The main problem in the whole document (apart from the ones in your screenshots) is the missing square root symbols. It's caused by #75.

adebali pushed a commit to CompGenomeLab/lemur-manuscript-archive that referenced this issue Mar 4, 2020
Pandoc upgraded to v2.0.0.1

Swap anaconda packages to conda-forge

Update build.sh for pandoc 2.0

WeasyPrint was not renderring MathJax, so use --webtex to convert
equations to SVGs for PDF output. See
Kozea/WeasyPrint#59
@stroobandt

This is to inform the community that math2svg is now available as a Lua filter officially adopted by Pandoc: https://github.com/pandoc/lua-filters/tree/master/math2svg

This Lua filter for Pandoc converts LaTeX math to MathJax generated scalable vector graphics (SVG) for insertion into the output document in a standalone manner. SVG output is in any of the available MathJax fonts.

This is useful when a CSS paged media engine (such as WeasyPrint) cannot process complex JavaScript as required by MathJax.

No Internet connection is required when generating or viewing SVG formulas, resulting in both absolute privacy and offline, standalone robustness.

Personally, I have been using it for quite some time to generate PDFs with MathJax generated formulas in an unattended typesetting workflow using Prince XML.

Here is a brief sample document:
https://hamwaves.com/zc.measuring/en/zc.measuring.letter.pdf

More intricate documents with Markdown source, makefile and CSS are available from the same web site.

@liZe
Member

liZe commented Jan 17, 2021

This is to inform the community that math2svg is now available as an officially Pandoc adopted Lua filter:

Good to know, thanks a lot for sharing this information!

@grewn0uille
Member

grewn0uille commented Sep 12, 2022

Hello!

As CourtBouillon’s 2-year anniversary is coming up soon, we have opened a short survey to learn more about your expectations.
Don’t hesitate to support this feature and give it a boost 🚀!

The survey will be open until October 10th.

Update: the survey is now closed. You can find the results here.

@grewn0uille
Member

Hello!

As you may know, CourtBouillon’s 3-year anniversary was two weeks ago 🎉.

For this occasion, we prepared a short survey to have your opinion on this year’s features and to know what you’d like to see in the future!
Don’t hesitate to give a boost to this feature ✨️

The survey is open until November 19.

ploegieku added a commit to ploegieku/2023-functional-homology-paper that referenced this issue Aug 6, 2024
Pandoc upgraded to v2.0.0.1

Swap anaconda packages to conda-forge

Update build.sh for pandoc 2.0

WeasyPrint was not renderring MathJax, so use --webtex to convert
equations to SVGs for PDF output. See
Kozea/WeasyPrint#59