[texhax] lwarp vs. tex4ht

Brian Dunn BD at bdtechconcepts.com
Mon Mar 21 03:22:00 CET 2016

On Sunday, March 20, 2016, Uwe Lueck wrote:
> > This is a LaTeX package which causes LaTeX to directly generate HTML
> > tags, using pdftotext and a few other utilities to convert the resulting
> > PDF file into HTML files.
> The approach is interesting, yet if you convert LaTeX to PDF
> and the result to HTML, the meaning of "direct" forbids calling
> this a "direct" generation of HTML. It is just as "direct" as
> the late Eitan Gurari's tex4ht.

The difference is that with lwarp, LaTeX itself decides on and generates the 
HTML tags.  The only post-processing is the extraction of text from the PDF, 
which is then renamed with an .html suffix.  (There is also additional 
processing of graphics images by other programs.)
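
As a rough illustration of that two-step flow, here is a minimal Python sketch.  
It is not lwarp's own helper script: the function names are hypothetical, the 
pdflatex and pdftotext options shown are assumptions about a typical setup, and 
it assumes the .tex file sits in the current directory.  It also writes the 
.html name directly rather than renaming afterwards, and leaves out the 
graphics processing.

```python
import subprocess
from pathlib import Path
from typing import List


def build_commands(tex_file: str) -> List[List[str]]:
    """Return the two commands of this sketch: compile, then extract text.

    Hypothetical helper; the option choices are assumptions, not lwarp's.
    """
    stem = Path(tex_file).stem
    return [
        # Step 1: LaTeX typesets the HTML tags as ordinary text in the PDF.
        ["pdflatex", "-interaction=nonstopmode", tex_file],
        # Step 2: pdftotext pulls that text back out of the PDF.
        # The output file is simply given an .html name.
        ["pdftotext", "-enc", "UTF-8", "-nopgbrk", "-layout",
         f"{stem}.pdf", f"{stem}.html"],
    ]


def run_pipeline(tex_file: str) -> Path:
    """Run both steps in order and return the path of the HTML file."""
    for cmd in build_commands(tex_file):
        subprocess.run(cmd, check=True)  # stop at the first failing step
    return Path(f"{Path(tex_file).stem}.html")
```

Calling run_pipeline("doc.tex") would need pdflatex and pdftotext on the PATH; 
build_commands("doc.tex") alone just shows what would be run.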

> I have never used tex4ht, but my impression is that this is the
> most promising way to get HTML from LaTeX. So you should tell
> what lwarp offers that tex4ht doesn't.

tex4ht does a reasonable conversion when fed the lwarp test suite.  There are 
some differences in ability and ease of configuration.

Based on a quick review of the tex4ht interpretation of the lwarp test suite, 
differences include:

- xcolor color box and frame box
- HTML entities for various kinds of fixed-width spaces
- epigraphs
- lwarp doesn't do tabular <{} and >{} columns yet, or | vertical rules
- lwarp does prettier booktabs, but cannot do (lr) trimming yet
- they each have different ideas about vertical alignment of tabular rows, but 
LaTeX and HTML have different abilities here, and they do not totally overlap.
- math can be MathML in tex4ht, and is SVG with LaTeX copy/paste in lwarp
- sfrac is better in lwarp
- \nameref to a figure gives the section name in tex4ht but the figure caption 
in lwarp
- \pageref provides a useful link in lwarp
- lwarp can do rotatebox, scalebox, and reflectbox (thanks to CSS3), but HTML 
does not adapt the whitespace appropriately, so this is of limited use.
- tex4ht didn't handle a picture environment in an fbox.  Vertical space was 
not provided.  This seems to be true of all the boxes in the test suite.
- tex4ht didn't handle an fbox with tikz inside, but I haven't tried very hard 
to get it to work yet.
- tex4ht places footnotes on a separate HTML page (this may be a 
configuration option).  lwarp places them at the bottom of each section or 
HTML page.
- By default, tex4ht is placing newlist items inline, but I haven't tried to 
change it yet.
- Due to CSS3, lwarp is able to place minipages side-by-side with user-
selectable vertical alignment.
- Also due to CSS3, lwarp is able to use multiple columns, which adapt to page 
width.
- lwarp can float-right the comments in algorithmicx.
- lwarp generates less clutter in the HTML output.  (When it comes to math, 
they're both pretty bad. Lots of images vs. lots of MathML.)
- tex4ht can generate several kinds of output beyond HTML.

The package and test suite are both provided in the .zip file found on the 
website below.


Brian Dunn
BD Tech Concepts LLC
bd at BDTechConcepts.com

