[latex2html] latex2html and web publishing

Franco Bagnoli bagnoli@dma.unifi.it
Sat, 30 Dec 2000 19:19:57 +0100 (CET)

On Fri, 29 Dec 2000, Les Richardson wrote:

> >This latex is translated to html (and xml/mathml in the future) and cached
> >to improve the response of the system.
> Why don't you just dynamically generate the text but serve it statically
> so that there is no performance degradation.

I was unclear in my exposition: this is exactly what I do. There is no
delay in reading the pages, but when you write you need to preview your
work quite frequently (if you are not fluent in LaTeX), and in this case
there can be a long waiting time. It is not really a problem; I was just
wondering if the conversion could be accelerated.

> I generate pdf using pdftex. It is very quick. (visible at: 
> www.sasked.gov.sk.ca/testbank)

It looks very interesting. I'm developing something similar (I am a
physicist too). Is your project proprietary, or could I adapt and
integrate it with mine?

> >Another possibility I am
> >exploring is to facilitate image reusing by storing all images in a common
> >directory. Clearly, the finer the level of image splitting, the higher the
> >possibility of sharing them.
> But a particular image is generated for use with a particular html file. 
> These normally all go into a common folder which is formatted with your css 
> of choice. How are you storing your data now? I'm still wrestling with this 
> now, since I store the images for test items in a particular web visible 
> folder while the data items reside as records in the database. I'm still 
> not particularly happy with this at this point, but storing the images in 
> the database will have performance implications that I don't like either.

I'm using a modified form of twiki (twiki.sourceforge.org), so I adapted
its directory scheme. There is a data directory tree where the source
files (say, pieces of latex) are stored in subdirectories. For instance,
the page titled "atoms" of folder "physics" is stored in
data/physics/atoms.txt. RCS is used to store and control the various
versions, so there is also a data/physics/atoms.txt,v file.

If there are files (say image1.gif and mytext.tex) attached to this
page, they are stored in pub/physics/atoms/. When the page is saved, the
html, the gifs generated by latex2html, the pdf etc. are stored in
cache/physics/atoms/, which is served dynamically (since each user can
choose the framework -- say plain pages or frames or language -- in which
the text is embedded).
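
To make the layout concrete, here is a minimal Python sketch of the
mapping from a (folder, page) pair to the locations described above. The
function name page_paths, the root argument and the dictionary keys are
only illustrative; they are not part of twiki or of my modified version.

```python
from pathlib import PurePosixPath


def page_paths(root: str, folder: str, page: str) -> dict:
    """Return the locations used for one page (hypothetical helper).

    root is an assumed installation directory; the data/, pub/ and cache/
    subtrees follow the scheme described in the message.
    """
    base = PurePosixPath(root)
    source = base / "data" / folder / f"{page}.txt"
    return {
        # LaTeX/wiki source of the page
        "source": source,
        # RCS history file: same name with ",v" appended
        "rcs": source.with_name(source.name + ",v"),
        # files attached by the author (image1.gif, mytext.tex, ...)
        "attachments": base / "pub" / folder / page,
        # generated html, gifs from latex2html, pdf, ...
        "cache": base / "cache" / folder / page,
    }


paths = page_paths("/srv/wiki", "physics", "atoms")
for name, location in paths.items():
    print(f"{name:12s} {location}")
```

Keeping the three trees parallel means that the cache directory of a page
can be deleted and rebuilt from data/ and pub/ at any time.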

Latex2html uses a plain text database to choose which image can be reused
in a given text. Since several pages are similar (generally the ones done
by the same author), I think that one could share this database (stored in
a real database like mysql) to globally cache mathematical expressions and
the corresponding gifs.
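
A rough sketch of such a global cache, in Python, could look like the
following. Everything here is an assumption made for illustration: the
table layout, the function names, the use of a SHA-1 of the
whitespace-normalized LaTeX source as key, and sqlite3 standing in for
mysql so that the example is self-contained. It does not reproduce the
format of the database that latex2html keeps, and a real key would
probably also have to include the preamble and font size, since the same
expression can render differently in different documents.

```python
import hashlib
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) the shared expression cache (hypothetical schema)."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS images ("
        " key TEXT PRIMARY KEY, latex TEXT NOT NULL, gif_path TEXT NOT NULL)"
    )
    return db


def normalize(latex: str) -> str:
    # Assumption: runs of whitespace do not change the rendered image.
    return " ".join(latex.split())


def expression_key(latex: str) -> str:
    return hashlib.sha1(normalize(latex).encode("utf-8")).hexdigest()


def lookup_or_render(db, latex, render):
    """Return the gif for an expression, calling render only on a miss.

    render(latex, key) is a caller-supplied function that produces the
    image (for example by running latex2html) and returns its path.
    """
    key = expression_key(latex)
    row = db.execute(
        "SELECT gif_path FROM images WHERE key = ?", (key,)
    ).fetchone()
    if row is not None:
        return row[0]  # cache hit: reuse the image made for another page
    gif_path = render(latex, key)
    db.execute(
        "INSERT INTO images (key, latex, gif_path) VALUES (?, ?, ?)",
        (key, normalize(latex), gif_path),
    )
    db.commit()
    return gif_path
```

With this arrangement two pages containing the same formula would trigger
only one conversion; the second page gets the stored path back at once.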

I'm not using Zope for the same reason you encountered: it tends to store
images and text in the database, which is not very efficient (moreover,
one could run a daemon to clean items not accessed for a given amount of

Maybe we should continue this discussion privately.

Franco Bagnoli
Dipartimento di Matematica Applicata "G. Sansone"
Universita' di Firenze, Via S. Marta, 3 I-50139 Firenze, Italy
tel. +39 0554796422, fax: +39 055471787
e-mail: bagnoli@dma.unifi.it