In this blog post, I will explain a simple way of converting a LaTeX document to HTML. Why do this? There are many reasons. For example, you may have formatted some text in LaTeX and would like to quickly integrate it into a webpage.
The wrong way
First, there is a wrong way of doing this: creating a PDF from your LaTeX document and then using a tool to convert the PDF to HTML. If you try this and the document is even slightly complex, the result may be very bad, and the HTML code may be horrible, with many unnecessary tags.
The good way
Thus, the best way to convert LaTeX to HTML is to use a dedicated tool. There are several free tools, but many are designed to run on Linux. If you are using Windows, it may thus take you some time to find the right tool.
Luckily, popular LaTeX distributions such as MiKTeX and TeX Live include an executable of a program that converts LaTeX to HTML and works on Windows. Thus, if you have the full TeX Live distribution, you do not need to download or install anything else. Below, I will describe how to do this with TeX Live on Windows.
Using TeX Live on Windows
First, you need to open the command line and go to the directory containing your LaTeX document. Let's say that your LaTeX document is called article.tex. Then, you can run this command:
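The converter in question is htlatex, which is part of TeX4ht. Below is a minimal sketch of the whole step, written for a POSIX-style shell; at the Windows command prompt only the htlatex line is needed, and it is typed the same way. The small stand-in document is hypothetical and is only created if no article.tex exists yet, so a real paper is never overwritten.

```shell
# Create a tiny stand-in document only if article.tex does not already exist.
# (Hypothetical sample content, used purely to have something to convert.)
if [ ! -f article.tex ]; then
  cat > article.tex <<'EOF'
\documentclass{article}
\begin{document}
Hello, HTML. Inline math: $a^2 + b^2 = c^2$.
\end{document}
EOF
fi

# Run the converter; htlatex ships with the full TeX Live distribution.
if command -v htlatex >/dev/null 2>&1; then
  htlatex article.tex
else
  echo "htlatex not found; install the full TeX Live distribution first"
fi
```

The guard around htlatex is only there so the sketch fails gently on a machine without TeX Live; with the distribution installed, the single line `htlatex article.tex` is all that matters.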
The result will be a new file, article.html.
The result is usually quite good. For example, I have converted a research paper that I wrote about high utility episode mining, and the result looks like this:
I would say that 90% of the paper was converted correctly. Some parts that I have not shown, such as the pseudocode of some algorithms, were not formatted properly. But I would say that the conversion is, overall, really good.
In this blog post, I have shown a simple way of converting LaTeX to HTML on Windows using the TeX Live distribution. If you are using MiKTeX or Linux, similar commands can be used.
Philippe Fournier-Viger is a computer science professor and founder of the SPMF open-source data mining library, which offers more than 170 algorithms for analyzing data, implemented in Java.
A list of methods to convert LaTeX to HTML:
Each of the following is being developed currently, and each supports hundreds of LaTeX packages:
htlatex, mentioned above, is part of TeX4ht. There is a newer interface, make4ht, and it can generate HTML, ebooks, DocBook, and ODT, along with MathML or MathJax.
Lwarp generates HTML and either SVG or MathJax math. Some assistance is provided for ebook conversion and word processor copy/paste. Lwarp supports the most packages, and adds enhanced MathJax support for dozens more.
Both Lwarp and TeX4ht use TeX itself for much of the processing, and thus should provide good compatibility with complicated TeX expressions.
LaTeXML generates HTML and MathML via a Perl program.
Also of note is Pandoc, a markup-based system which can process many types of input and output, but is limited in its LaTeX conversion.
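For comparison, here is a hedged sketch of how three of these tools are typically invoked on the same file. The helper run_if_present, the stand-in document, and the output file names are illustrative assumptions, not part of any tool. Lwarp is left out because it is loaded as a package inside the document rather than run on an unmodified file.

```shell
# Hypothetical helper: run a tool only if it is installed, otherwise say so.
run_if_present() {
  tool="$1"
  shift
  if command -v "$tool" >/dev/null 2>&1; then
    "$tool" "$@"
  else
    echo "$tool not found, skipping"
  fi
}

# Make sure there is something to convert (stand-in content, never overwrites).
[ -f article.tex ] || printf '%s\n' \
  '\documentclass{article}' '\begin{document}' 'Hello.' '\end{document}' \
  > article.tex

# TeX4ht through its newer interface; writes article.html.
run_if_present make4ht article.tex

# LaTeXML's combined conversion and post-processing command.
run_if_present latexmlc --dest=article-latexml.html article.tex

# Pandoc: standalone HTML page, with math left for MathJax to render.
run_if_present pandoc article.tex -s --mathjax -o article-pandoc.html
```

Writing each result to a differently named file makes it easy to open the outputs side by side and judge which tool copes best with a given document.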
Thanks for the great comment!