Compiling LaTeX to html with pandoc

Pandoc is a hugely versatile document converter written in Haskell. It is in active development with many contributors. There are many pandoc questions on tex.stackexchange.

This is the output for the tex file I’ve used in earlier tests.

Usage

pandoc --toc inputfile.tex -s --mathjax -o outputfile.html

The --toc option produces a table of contents (which is otherwise omitted, even if your LaTeX file has \tableofcontents), -s makes pandoc produce a complete, standalone html file, and --mathjax renders the math with MathJax. Images work well, and double primes are legible even if they aren’t aligned exactly right. Pandoc doesn’t know about \nolinkurl from the LaTeX hyperref package and silently fails to render any \href whose link text uses \nolinkurl. \qedhere doesn’t render either.
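To convert several files with the same options, the command can be wrapped in a small script. This is a minimal Python sketch, assuming pandoc is on your PATH; the helper names pandoc_command and convert are made up for illustration and are not part of pandoc.

```python
import subprocess
from pathlib import Path


def pandoc_command(tex_file):
    """Build the pandoc invocation from the Usage section for one .tex file.

    The output name is assumed to be the input name with an .html suffix.
    """
    output = str(Path(tex_file).with_suffix(".html"))
    return ["pandoc", "--toc", tex_file, "-s", "--mathjax", "-o", output]


def convert(tex_file):
    """Run pandoc on one file; raises CalledProcessError if pandoc fails."""
    subprocess.run(pandoc_command(tex_file), check=True)
```

Calling convert("inputfile.tex") would then write inputfile.html next to the source file.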

Issues

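The big problem is that section numbering and environments like theorem, definition, lemma, proof, and so on don’t work: with the command above, section numbers are omitted from the html output and the text content of each environment appears with no decoration. Pandoc’s --number-sections (-N) option can add section numbers to the html output, but it does nothing for theorem-like environments.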

I can’t find an easy way to fix this. There are several numbering filters for pandoc, e.g. pandoc-numbering or pandoc-eqnos, but they are not designed for converting LaTeX to html. There is an issue filed on the pandoc GitHub repository asking for amsthm support, and it’s still open. Someone on the issue thread created pandoc-amsthm but, despite the promising name, so far as I can see it is not for LaTeX-html conversion.
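One possible stopgap is to number the environments in the LaTeX source before pandoc sees it, replacing each \begin{...} with a bold heading that pandoc already knows how to render. The Python sketch below is a minimal illustration, not a tested tool: the function name number_environments, the LABELS table and the single shared counter are all assumptions made for the example, and it ignores \label/\ref cross-references, per-section numbering and nested or custom environments.

```python
import re

# Environment name -> printed label. These names are assumptions; match them
# to the \newtheorem declarations in your own preamble.
LABELS = {
    "theorem": "Theorem",
    "lemma": "Lemma",
    "definition": "Definition",
    "proof": "Proof",
}
# Environments that get a heading but no number.
UNNUMBERED = {"proof"}


def number_environments(tex, labels=LABELS, unnumbered=UNNUMBERED):
    """Replace theorem-like environments with bold, numbered headings.

    All numbered environments share one counter, which is only one of the
    numbering schemes amsthm supports.
    """
    names = "|".join(re.escape(name) for name in labels)
    # \begin{env} with an optional [note], e.g. \begin{theorem}[Euclid]
    begin = re.compile(r"\\begin\{(" + names + r")\}(?:\[([^\]]*)\])?")
    end = re.compile(r"\\end\{(?:" + names + r")\}")
    counter = 0

    def heading(match):
        nonlocal counter
        env, note = match.group(1), match.group(2)
        text = labels[env]
        if env not in unnumbered:
            counter += 1
            text += " " + str(counter)
        if note:
            text += " (" + note + ")"
        # Blank lines start a new paragraph in the LaTeX source.
        return "\n\n\\textbf{" + text + ".} "

    numbered = begin.sub(heading, tex)
    # Drop the closing tags and end the paragraph.
    return end.sub("\n\n", numbered)
```

The rewritten source can be saved to a temporary .tex file and passed to the pandoc command from the Usage section. The decoration is plain bold text, so the html output still has no per-environment styling.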

The University of Nevada, Reno has a page about math accessibility in which they describe their LaTeX-html conversion process. They create an html file that renders the math with MathJax, and then “update” it to restore theorem and definition environments and numbering. It’s not clear whether they have an automatic tool for this, and the email address they provide for queries rejects mail from people not subscribed to their list.

Conclusion

It is probably possible to adapt pandoc, or to write filters, to add environment support and numbering to its LaTeX-html output, but their absence makes pandoc unsuitable for now.