[tex4ht] problem with slow compilation of large latex file with large math content

Nasser M. Abbasi nma at 12000.org
Sat Mar 26 08:21:28 CET 2016


On 3/26/2016 1:11 AM, Radhakrishnan CV wrote:
> On Sat, Mar 26, 2016 at 3:23 AM, Nasser M. Abbasi <nma at 12000.org> wrote:
>
> [...]
>
>> For example, for one file, using Vbox, it took 14 hrs
>> for make4ht to compile the file to html. On cygwin, it took
>> a little less than that, about 10 hrs. This is on windows 7, 64 bit,
>> 16 GB ram, fast intel i7-3930k CPU.
>>
>
> That is terrible! But it contradicts my own experience. At work, we
> do large documents (on an average 300 pages long, 800-1000 bibliographic
> items, 500 to 800 equations, very complex math, large number of figures,
> double column output) on a daily basis, but it takes a few seconds to
> generate Elsevier XML output. Recently, another article with 350 pages, ~70
> figures, four or five very long tables each spanning several pages, 350 bib
> items, several hundred cross references, but very little math, took only 12
> secs for three runs of TeX4ht to generate NLM XML output on a server where
> at least 50 users are working simultaneously using same resources. The only
> documents that take, say, 60 secs or a bit more time are documents with
> atomic and nuclear data tables, each table running to 200 pages typically!
> Otherwise, a tex4ht run is a breeze in my experience, and that too on a
> server shared by at least forty to fifty users at a time.
>
> [...]
>>
>
>> But the issue is, pdflatex and lualatex take about 5 minutes
>> on the same file to compile it to pdf !
>>
>> I can understand converting to HTML will take more time,
>> since each equation is converted to svg image,
>
>
> on the fly? Why don't you write out the math in a file and process
> separately to generate the svg images in one go?
>

Sorry, I do not understand what this means. I have a latex
file, which contains math, and I call tex4ht to
generate the HTML. I use make4ht to compile it and tell
it to use svg for math.

> [...]
>
>
>> It also seems tex4ht has more than one pass, as I see it
>> generating this sequence of numbers more than once.
>>
>
> tex4ht needs three passes for fixing cross links and multicolumns in
> tables.
>
Ok. But each pass is slow, as it seems to go through
all the pages over and over again.
>

>
>> I can make a zip file with typical large latex file
>> with all the images it uses and my .cfg and main.mk4
>> and the command I used to compile the latex file if
>> any one wants to confirm this problem. Would this be ok?
>>
>
> I would love to debug your problem. Please do send it to me. If the archive
> is too large, kindly put it at some location and provide me the URL.
>
Thanks for the offer to look into it. I put the latex file,
all the needed include files I use, the .cfg, main.mk4,
and the command in one zip file. Here it is, in this folder:

http://12000.org/tmp/032616/

There is a file called compile.sh which has this line:

make4ht --lua -u -c ./nma.cfg -e ./main.mk4 report.tex "htm,3,pic-align,notoc*"

The report.tex there is 17 MB.  You'll see the slowdown
once the page count gets over [1000] or so. It will take a few
hours to compile.
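
To put numbers on the pdflatex vs. make4ht comparison, a small
wrapper like the one below could be used. This is only a sketch:
time_run is a hypothetical helper name I made up for illustration,
it is not part of make4ht or tex4ht. It measures whole seconds of
wall-clock time.

```shell
#!/bin/bash
# time_run LABEL COMMAND [ARGS...]
# Hypothetical helper (not part of make4ht): runs COMMAND, then prints
# the elapsed wall-clock time in whole seconds and the exit status.
time_run() {
    local label=$1
    shift
    local start end status
    start=$(date +%s)        # seconds since the epoch, before the run
    "$@"                     # run the command exactly as given
    status=$?
    end=$(date +%s)          # seconds since the epoch, after the run
    echo "${label}: $((end - start)) s (exit ${status})"
    return "$status"
}
```

It could then be called like this (same files as in compile.sh):

  time_run pdf  pdflatex report.tex
  time_run html make4ht --lua -u -c ./nma.cfg -e ./main.mk4 report.tex "htm,3,pic-align,notoc*"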

Please let me know if you need anything else or if there is anything
I can try on my end. I made sure all the needed files are there.
If I missed something, I will update the folder.


>
> [...]
>>
>> Finally, is there a document that describes the passes/process
>> that tex4ht uses to compile to HTML at some high level? Like block
>> diagram, or such. I am not able to find such design document.
>
>
> A schematic diagram of a tex4ht run, namely tex4ht.pdf, is attached to this
> mail. Hope this might help.
>

Thanks for the diagram. But there should really be a more
detailed design document for something as important
as tex4ht.

> Best regards
>

thank you,
--Nasser


