[tex4ht] dvilualatex and tex4ht

Johannes Wilm mail at johanneswilm.org
Sat Jul 23 04:06:44 CEST 2011

On Fri, Jul 22, 2011 at 3:00 PM, Karl Berry <karl at freefriends.org> wrote:

> You are reading far too much into my blog linking to rms's note.  It was
> not a reply to you.  I don't engage in that kind of indirection.
> Let me put it this way: if you or anyone can send a patch to the tex4ht
> sources to support luatex better, while not increasing the overall
> maintenance burden, that would be welcome.

I see. Yeah, I don't doubt that with sufficient time investment I would be
able to figure out various small patches or work-arounds. I just wondered
in which part it would make more sense to invest time to achieve the goal.

I found this answer by some dude on the internet as to how he extracts text
from a luatex file. I realize that extracting all the layout is much more
complicated, but if there are hooks for everything then it should be doable,
and it would seem to make more sense for luatex, especially given that
ConTeXt apparently already does this:



here is something useful you can easily do with luatex:

\directlua {
local words = io.open('hyphens-' .. tex.jobname .. '.txt', 'w');
local outchar = unicode.utf8.char
% Walk a node list and return its text, with '-' at each discretionary.
local function dumphyphens (head)
   local data = {}
   for v in node.traverse(head) do
      if v.id == node.id('glyph') then
         data[#data+1] = outchar(v.char);
      elseif v.id == node.id('disc') then
         data[#data+1] = '-'
      elseif v.id == node.id('glue') then
         data[#data+1] = outchar(32)
      elseif v.id == node.id('hlist') then
         data[#data+1] = dumphyphens(v.list)
      end
   end
   return table.concat(data)
end
% Hyphenate as usual, then write one line of output per callback call.
callback.register ('hyphenate', function (head, tail)
   lang.hyphenate(head, tail)
   words:write (dumphyphens(head) .. outchar(10))
end)
}

\input knuth

This will write to hyphens-\jobname.txt a dump of all the characters that
luatex has been asked to add hyphenation points to from this point on in the
source, with the output of each callback call on a single line.

Although you cannot use this to generate Epub (mostly because macros like
\TeX create two lines of output, once 'E' only for the lowered hbox and once
'TEX' for the actual macro use) the output is still useful because you can
check to make sure there are no potential bad breaks allowed by the
hyphenation patterns you are using.
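
To actually review the allowed breaks, the dump can be reduced to a sorted
list of the distinct hyphenated words. The following is only a sketch in
plain Lua (runnable with texlua or a standalone lua), not part of the
original answer. The function name hyphenated_words and the command-line
handling are made up for illustration, and it assumes words in the dump are
separated by spaces and newlines:

```lua
-- Collect every distinct word containing at least one '-' marker
-- from the text of a hyphens-<jobname>.txt dump.
local function hyphenated_words(text)
  local seen, words = {}, {}
  -- %S+ matches each run of non-space characters (one candidate word).
  for token in text:gmatch("%S+") do
    -- Trim leading and trailing punctuation such as commas and periods.
    local word = token:gsub("^%p+", ""):gsub("%p+$", "")
    -- Plain (non-pattern) search for the '-' that marks a break point.
    if word:find("-", 1, true) and not seen[word] then
      seen[word] = true
      words[#words + 1] = word
    end
  end
  table.sort(words)
  return words
end

-- Hypothetical usage: texlua listbreaks.lua hyphens-myjob.txt
local path = arg and arg[1]
if path then
  local f = assert(io.open(path, "r"))
  local text = f:read("*a")
  f:close()
  for _, w in ipairs(hyphenated_words(text)) do
    print(w)
  end
end
```

Each word is printed once with all of its permitted break points, which
makes it quicker to scan for breaks you would not want to allow.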

I will now try to figure out at least the two last remaining issues this
time around: font-encoding and pgfplots, even if it's nothing more than a

> P.S. CVR has actually done a lot more than me in terms of work on tex4ht
> since Eitan's death.  I hope that one day we will actually be able to
> make another release, but the complications are well-nigh unbelievable.

Ok, good luck with that!

Johannes Wilm
tel: +1 (520) 399 8880