[tex-hyphen] [lltx] [luatex] towards non-standard hyphenation support in LuaTeX

Hans Hagen pragma at wxs.nl
Tue Jan 28 00:04:23 CET 2014


On 1/27/2014 9:23 PM, Stephan Hennig wrote:
> Am 27.01.2014 19:53, schrieb Hans Hagen:
>
>> in most cases writing node list manipulation code in lua is efficient
>> enough (the average page has only a couple of thousands nodes) so then
>> you have control over that; the more is hardcoded, the more you have to
>> fight (complex macros at the tex end can be more of a bottleneck)
>
> Fair enough!  Still, TeX already has a notion of what a word subject to
> hyphenation is.  If that could be exposed somehow, that would not be a
> bad thing.  Anyway, I don't want to push you.

as i assume that you want to loop over the resulting word (range) 
anyway, there is not much gain (in fact, you then have an iterator pass 
as well as a follow-up pass over the word);

btw, i occasionally need to work on words but never felt the need for an 
iterator, but here is a quick and dirty one:

1000 * tufte (761 nodes, 544 chars, 93 words)

1874384 nodes/second
1339901 chars/second
  229064 words/second
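
As a sanity check on these figures, here is a minimal sketch in plain Lua that recomputes the three rates from the per-run counts. The elapsed time of 0.406 seconds is an assumption, inferred from the node rate rather than a measured value given here, and `rate` is a hypothetical helper name.

```lua
-- counts per run, taken from the timing line above
local runs  = 1000
local nodes = 761
local chars = 544
local words = 93

-- ASSUMPTION: total elapsed time, inferred from 761000 / 1874384
local elapsed = 0.406

-- hypothetical helper: items processed per second over all runs
local function rate(count)
    return math.floor(runs * count / elapsed)
end

print(rate(nodes)) -- 1874384
print(rate(chars)) -- 1339901
print(rate(words)) -- 229064
```

All three rates follow from the same elapsed time, so the last figure corresponds to the 93 words per run.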

for first, last, n in node.words(head,true) do
     ..... -- true (strict): a glyph only counts when its lccode is > 0
end

so, the time spent in the iterator is much less than the time spent 
typesetting the words (esp if the iterator is meant for something high-end)

local glyph    = node.id("glyph")
local disc     = node.id("disc")
local traverse = node.traverse
local lccode   = tex.lccode

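-- iterate over the words in a node list: each step returns the first and
-- last node of a word plus a size; with strict set, a glyph only counts
-- when its lccode is > 0; a discretionary can start or continue a word
-- but only adds to the size when it is the first node of the word
function node.words(head,strict)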
     if not head then
         return function() end
     end
     local next = head
     return function()
         if not next then
             return nil, nil, 0
         end
         local first
         for n in traverse(next) do
             local id = n.id
             if id == glyph then
                 if not strict then
                     first = n
                     break
                 elseif lccode[n.char] > 0 then
                     first = n
                     break
                 end
             elseif id == disc then
                 first = n
                 break
             end
         end
         if not first then
             return nil, nil, 0
         end
         next = first.next
         if not next then
             return first, first, 1
         end
         local last = first
         local size = 1
         for n in traverse(next) do
             local id = n.id
             if id == glyph then
                 if not strict then
                     last = n
                     size = size + 1
                 elseif lccode[n.char] > 0 then
                     last = n
                     size = size + 1
                 else
                     break
                 end
             elseif id == disc then
                 last = n
             else
                 break
             end
         end
         next = last.next
         return first, last, size
     end
end
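
To try the idea outside LuaTeX, here is a minimal sketch in plain Lua that uses mock nodes (tables with `id`, `char` and `next` fields) instead of real ones. The names `GLYPH`, `DISC`, `GLUE`, `make_list`, `words` and `collect` are all hypothetical, and the iterator is a simplified non-strict variant that counts glyphs only, so it is not identical to the function above.

```lua
-- mock node ids; in LuaTeX these would come from node.id("glyph") etc.
local GLYPH, DISC, GLUE = 1, 2, 3

-- build a linked list from a string: letters become glyphs,
-- "-" becomes a discretionary, anything else becomes glue
local function make_list(str)
    local head, tail
    for c in str:gmatch(".") do
        local n
        if c:match("%a") then
            n = { id = GLYPH, char = c:byte() }
        elseif c == "-" then
            n = { id = DISC }
        else
            n = { id = GLUE }
        end
        if tail then tail.next = n else head = n end
        tail = n
    end
    return head
end

-- simplified word iterator (non-strict): returns first, last, glyph count
local function words(head)
    local current = head
    return function()
        -- skip everything that cannot start a word
        while current and current.id ~= GLYPH and current.id ~= DISC do
            current = current.next
        end
        if not current then
            return nil
        end
        local first, last, size = current, current, 0
        -- extend the word over glyphs and discretionaries
        while current and (current.id == GLYPH or current.id == DISC) do
            if current.id == GLYPH then
                size = size + 1
            end
            last = current
            current = current.next
        end
        return first, last, size
    end
end

-- follow-up pass: walk each first..last range and rebuild the word
local function collect(str)
    local result = {}
    for first, last, n in words(make_list(str)) do
        local chars = {}
        local node = first
        while true do
            if node.id == GLYPH then
                chars[#chars + 1] = string.char(node.char)
            end
            if node == last then break end
            node = node.next
        end
        result[#result + 1] = table.concat(chars) .. ":" .. n
    end
    return result
end

print(table.concat(collect("we thrive in in-formation"), " "))
-- we:2 thrive:6 in:2 information:11
```

The inner loop in `collect` is the follow-up pass over each word range mentioned earlier: the iterator only delimits the range, and the caller still walks it to do the actual work.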

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

