[pdftex] DVI to HTML mechanism demo

Graham Toal gtoal at gtoal.com
Sat Oct 27 20:55:27 CEST 2007


Hello folks, it's a long time since I've been involved in the TeX
world, and I'd like to say I'm back, but I'm just dropping in for a
short visit.

Every now and then I wish I could do my web sites in TeX (and more
frequently, set a little maths in a web page) and I check to see if
there is an implementation of TeX in existence which generates plain
HTML web pages - i.e. no plugin required, and not simply embedding a
large graphic image of a page in HTML.

Unless I've missed it, there is no such version of TeX and TeX still
hasn't taken over the web world - which it should have done years ago!

Well, last night I asked myself why not...  HTML now has absolute
positioning, which is what was missing from HTML years back when we
first said that TeX was not a good fit for the web.  So over about 3
hours last night I hacked up a proof-of-concept dvi driver to generate
HTML using absolutely positioned <DIV> tags.  Apart from the obvious
problems (i.e. I haven't done any font metric work at all, so
positioning is just approximate in the P.o.C.), it clearly shows that
this is possible!
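
To make the idea concrete, here is a minimal sketch in C of the core
step: turning a DVI position into an absolutely positioned <DIV>.  It
is not the code from dvi2html.c; the function names are hypothetical.
It assumes the usual TeX DVI units (scaled points, 65536 sp = 1 pt,
72.27 pt = 1 in), CSS pixels at 96 per inch, text that is already
HTML-escaped, and it ignores magnification, the conventional one-inch
page origin, and the difference between a DVI baseline and a CSS box
top.

```c
#include <assert.h>
#include <stdio.h>

/* Convert TeX scaled points to CSS pixels.
 * Assumes 65536 sp = 1 pt, 72.27 pt = 1 in, 96 CSS px = 1 in. */
double sp_to_css_px(long sp)
{
    return (double)sp / 65536.0 / 72.27 * 96.0;
}

/* Write one absolutely positioned <div> for a run of text at DVI
 * position (h_sp, v_sp) into buf.  The text is assumed to be
 * HTML-escaped already.  Returns what snprintf returns. */
int emit_positioned_div(char *buf, size_t size,
                        long h_sp, long v_sp, const char *text)
{
    assert(buf != NULL && text != NULL);
    return snprintf(buf, size,
        "<div style=\"position:absolute;left:%.2fpx;top:%.2fpx\">%s</div>",
        sp_to_css_px(h_sp), sp_to_css_px(v_sp), text);
}
```

A driver would call something like emit_positioned_div() each time it
has collected a run of characters set with the same font at a known
(h, v) position.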

Have a look at the Proof of Concept output and source code:

 http://www.gtoal.com/src/dvi2html/testdoc.html
    (teTeX is not involved in this technology; it's just a test
document I picked at random)

 http://www.gtoal.com/src/dvi2html/dvi2html.c.html

- short, easy to hack up as a demo... I'm very confident that someone
who is current in TeX (and especially is knowledgeable about using fonts
other than the standard TeX-supplied ones) could build a real driver
in a similar style in about a week.  I suspect it could have a lot in
common with pdfTeX?

The most effective way would be to generate the DVI using the fonts
and font metrics of the standard (Win & Mac) HTML fonts (
http://www.ampsoft.net/webdesign-l/WindowsMacFonts.html ).  A less
well targeted way would be to map arbitrary fonts to that set, which
would let you use existing dvi files.  (This would be similar to the
Type & Set system Graham Asher wrote some years ago).  A third option
would be to have a central server with graphic images of the TeX
characters (to avoid the need to download fonts, allowing the pages to
be viewed in the correct fonts by anyone - this is not a good choice
though, I prefer the first option).  A fourth option may be to
download the fonts in the HTML page itself (which I think is possible
but I've never done it myself and don't know how it's done).
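
As a rough illustration of the second option (mapping arbitrary fonts
onto the standard browser set), here is a small C sketch.  The table
entries and the function name are my own assumptions for the sake of
the example, not a tested or complete mapping; a real driver would
also have to deal with the metric differences between the fonts.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mapping from TeX font name prefixes to CSS font
 * families.  Longer, more specific prefixes should come first. */
struct font_map {
    const char *tex_prefix;
    const char *css_family;
};

static const struct font_map FONT_MAP[] = {
    { "cmtt", "'Courier New', monospace" },
    { "cmss", "Arial, sans-serif" },
    { "cmr",  "'Times New Roman', serif" },
};

/* Return a CSS font-family string for a TeX font name such as
 * "cmr10".  Unknown fonts fall back to the generic serif family. */
const char *css_family_for(const char *tex_font)
{
    assert(tex_font != NULL);
    for (size_t i = 0; i < sizeof FONT_MAP / sizeof FONT_MAP[0]; i++) {
        const char *prefix = FONT_MAP[i].tex_prefix;
        if (strncmp(tex_font, prefix, strlen(prefix)) == 0)
            return FONT_MAP[i].css_family;
    }
    return "serif";
}
```

The result would go into the style attribute of each generated <DIV>,
next to the position.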

Anyway, here's my challenge to the TeX developer community.  Create a
version of TeX that generates HTML as per the demo above.  As well as
a stand-alone version, implement it as a filter which can be inserted
into an apache web server to render .dvi files directly, and also, if
it is possible, to render .tex files directly (via a pipeline
involving dvi).  The advantage of the latter is that you can reformat
pages on the fly using the page size from the browser, so that you get
things like balanced columns done properly on screen.  (There are some
issues with page size and printing from HTML pages; it can be done but
needs some thought.  It may be easier to just link to a pdf if you want
to print a tex-generated HTML page.)
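
For the on-the-fly reformatting idea, the filter would need to turn
the browser's window width into a TeX line width.  A minimal sketch in
C, assuming the width arrives in CSS pixels (96 per inch) and that the
filter simply passes an \hsize assignment to TeX ahead of the
document; the function names and the margin parameter are
hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Convert a browser viewport width in CSS pixels to a TeX \hsize in
 * points, leaving margin_px free on each side.
 * Assumes 96 CSS px = 1 in and 72.27 pt = 1 in. */
double hsize_pt_from_viewport(int viewport_px, int margin_px)
{
    assert(viewport_px > 2 * margin_px);
    return (double)(viewport_px - 2 * margin_px) * 72.27 / 96.0;
}

/* Format the \hsize assignment that would be prepended to the TeX
 * input.  Returns what snprintf returns. */
int format_hsize(char *buf, size_t size, double hsize_pt)
{
    assert(buf != NULL);
    return snprintf(buf, size, "\\hsize=%.2fpt", hsize_pt);
}
```

How the width actually gets from the browser to the server (a query
parameter, a cookie, a script) is left open here.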

I hope someone finds the demo inspiring enough to give this a try.  I
don't have the time to take on a project like this myself right now,
and I'm 15 years out of touch with TeX internals.  I'm not the guy to
do this, but I hope someone here is!  Take it away, guys...

Graham

