[tex4ht] what is the fastest way to convert large document to HTML?
martin.gieseking at uos.de
Mon Aug 20 11:18:40 CEST 2018
On 19.08.2018 at 01:10, Michal Hoftich wrote:
>> However, technically it shouldn't be necessary to convert all math fragments
>> every time the document is processed. In the (proprietary) infrastructure at
>> our university we only convert the portions that have actually changed. This
>> is simply done by using md5-based file names computed from the corresponding
>> LaTeX code. Before starting the actual conversion, the system checks if
>> there's already an SVG file present which matches the hash value of the
>> LaTeX code. If so, running LaTeX and dvisvgm can be skipped. This also has
>> the advantage that every fragment is created only once even if it's
>> referenced multiple times in the document.
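To make the caching scheme quoted above concrete, here is a minimal sketch in Python. It is not the code of our infrastructure; the names `svg_name`, `convert_fragments` and the `render` callback (standing in for the LaTeX + dvisvgm run) are hypothetical and chosen only for illustration.

```python
import hashlib
from pathlib import Path


def svg_name(latex_code):
    """Derive the SVG file name from the MD5 hash of the LaTeX code."""
    return hashlib.md5(latex_code.encode("utf-8")).hexdigest() + ".svg"


def convert_fragments(fragments, out_dir, render):
    """Convert only fragments that have no matching SVG file yet.

    `render` is a hypothetical callback that takes LaTeX code and
    returns SVG text; in a real setup it would run LaTeX and dvisvgm.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    names = []
    for code in fragments:
        target = out_dir / svg_name(code)
        if not target.exists():  # hash already present: skip the conversion
            target.write_text(render(code), encoding="utf-8")
        names.append(target.name)
    return names
```

Because the file name depends only on the LaTeX code, a fragment that occurs several times maps to the same file and is rendered once, and unchanged fragments are skipped on later runs.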
> This is actually a great idea. I've created a simple Lua package which can
> process DVI pages and calculate MD5 hashes for their contents. make4ht
> can then rename the files generated by dvisvgm according to the hashes and
> replace the image names in the HTML files. A zip file with all necessary
> files is attached. It can be executed with
That looks awesome. I'll have a closer look at your package later today.
Just a first observation: if I understand the dvireader script
correctly, it reads all bytes following a "bop" command until it hits a
byte with the "eop" value 140. Since many DVI commands take additional
parameter bytes, it's likely that one of these bytes is 140 as well, so
the MD5 sum would be computed for only a part of the page, i.e. changes
in the remaining part wouldn't be recognized.
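To illustrate the concern, here is a small Python sketch. It is not the dvireader code; `naive_page` only models the byte scan as I understand it, and `parsed_page` is one possible alternative that steps over the operand bytes of each DVI command. All function names are hypothetical; the opcode numbers and operand lengths follow the DVI format.

```python
import hashlib

BOP, EOP = 139, 140


def operand_length(data, pos):
    """Number of operand bytes that follow the opcode at data[pos]."""
    op = data[pos]
    # set_char_0..127, nop, eop, push, pop, w0, x0, y0, z0, fnt_num_0..63
    if op <= 127 or op in (138, 140, 141, 142, 147, 152, 161, 166) or 171 <= op <= 234:
        return 0
    if op in (132, 137):  # set_rule, put_rule: two 4-byte values
        return 8
    if op == BOP:  # ten 4-byte counters plus a 4-byte pointer
        return 44
    # set, put, right, w, x, down, y, z, fnt with 1..4 operand bytes
    for first in (128, 133, 143, 148, 153, 157, 162, 167, 235):
        if first <= op <= first + 3:
            return op - first + 1
    if 239 <= op <= 242:  # xxx1..xxx4: length field, then the special itself
        k = op - 239 + 1
        return k + int.from_bytes(data[pos + 1:pos + 1 + k], "big")
    if 243 <= op <= 246:  # fnt_def1..4: number, c, s, d, a, l, font name
        k = op - 243 + 1
        a, l = data[pos + 1 + k + 12], data[pos + 1 + k + 13]
        return k + 14 + a + l
    raise ValueError(f"unexpected opcode {op} inside a page")


def naive_page(data, start=0):
    """Scan for the first byte with value 140 after bop."""
    return data[start:data.index(EOP, start + 1) + 1]


def parsed_page(data, start=0):
    """Walk command by command so that operand bytes are never taken for eop."""
    pos = start
    while data[pos] != EOP:
        pos += 1 + operand_length(data, pos)
    return data[start:pos + 1]


def make_page(char):
    # bop, zeroed counters, pointer -1, right1 with operand 140, one character, eop
    return bytes([BOP]) + bytes(40) + b"\xff\xff\xff\xff" + bytes([143, 140]) + char + bytes([EOP])


page_a, page_b = make_page(b"A"), make_page(b"B")
same_naive = hashlib.md5(naive_page(page_a)).digest() == hashlib.md5(naive_page(page_b)).digest()
same_parsed = hashlib.md5(parsed_page(page_a)).digest() == hashlib.md5(parsed_page(page_b)).digest()
```

In this constructed example the two pages differ only in the character after `right1 140`. The naive scan stops at the operand byte, so `same_naive` is true and the change goes unnoticed, while the command-wise walk reaches the real eop and `same_parsed` is false.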
Perhaps it's also possible to add the computation and comparison of the
hashes to dvisvgm because it processes the DVI file anyway. I have to
think about this a bit more.