[luatex] What are user-defined whatsit nodes?
Hans Hagen
pragma at wxs.nl
Sun Nov 23 19:43:09 CET 2014
On 11/22/2014 11:25 PM, Stephan Hennig wrote:
> On 21.11.2014 at 19:10, luigi scarso wrote:
>> On Fri, Nov 21, 2014 at 6:31 PM, Stephan Hennig <sh-list at posteo.net> wrote:
>>
>>>
>>> To put it differently, is user-defined whatsits inhibiting ligatures a
>>> bug or intentional?
>>>
>> intentional
>> anything non-glyph or non-disc will inhibit ligature building
>
> OK, thanks for the clarification!
>
> These (side-)effects of user-defined whatsits make me wonder what
> use-cases have been in mind when introducing this type of node?
> Attributes seem so much more attractive for squeezing additional
> information into a node list, because they should be handled
> transparently (I think).
>
> But never mind, I'm currently looking into ways to make TeX smarter with
> regard to ligatures. I've recently found the selnolig package,
> which uses user-defined whatsit nodes for inhibiting selected ligatures.
> Before asking further questions regarding ligatures I just wanted to
> get some clarification about the whatsit approach.
>
> selnolig's whatsit approach works, but again it is not free of
> side-effects. It interferes with the hyphenation algorithm in that
> the whatsit marks a word boundary, which breaks the minimum
> hyphenation length calculation. Note how setting \righthyphenmin=5 in
> the attached example prevents t-t hyphenation in the word 'butterflies.'
> (Let's ignore the fact that the fl ligature is indeed valid in this
> example.) The problem is more serious in German with its many compound
> words. Babel shortcuts like "|, which insert real glue if I recall
> correctly, suffer from the same problem.
>
> Any ideas how to prevent selected ligatures without causing side-effects?
>
> Best regards,
> Stephan Hennig
>
> % -*- coding: utf-8 -*-
> \directlua{
>   % Declare constants.
>   local GLYPH = node.id('glyph')
>   local WHATSIT = node.id('whatsit')
>   local USER_DEFINED = node.subtype('user_defined')
>   local CHAR_f = string.byte('f')
>   local CHAR_l = string.byte('l')
>   local Ncopy = node.copy
>   local Nnew = node.new
>   local Ninsert_before = node.insert_before
>   local Ntraverse = node.traverse
>   % Create user-defined whatsit.
>   local what = Nnew(WHATSIT, USER_DEFINED)
>   what.user_id = 20141117
>   what.type = 100
>   what.value = 0
>   % Register callback.
>   callback.register('hyphenate',
>     function (head, tail)
>       % Iterate over node list.
>       for n in Ntraverse(head) do
>         if n.id == GLYPH and n.char == CHAR_l then
>           local p = n.prev
>           % Guard against a missing previous node.
>           if p and p.id == GLYPH and p.char == CHAR_f then
>             Ninsert_before(head, n, Ncopy(what))
>           end
>         end
>       end
>       lang.hyphenate(head, tail)
>     end
>   )
> }
> \righthyphenmin=3
> \showhyphens{butterflies}
> \righthyphenmin=5
> \showhyphens{butterflies}
> \bye
Assuming some explicit control (you mention attributes) you can just
reconstruct the original from the (nested) ligatures, like:
local glyph_id = node.id("glyph")

function nolig(head)
    local current = head
    while current do
        local n = current.next
        -- attribute 999 marks glyphs whose ligatures should be undone
        if current.id == glyph_id and current[999] == 1 then
            local c = current.components
            if c then
                local t = node.slide(c)
                local x = current
                local p = current.prev
                if p then
                    p.next = c
                    c.prev = p
                else
                    head = c
                end
                if n then
                    t.next = n
                    n.prev = t
                end
                x.components = nil
                node.free(x)
                n = c -- revisit the components, they can be ligatures themselves
            end
        end
        current = n
    end
    return head
end

-- hook this function into the hlist handler after font
-- handling; this is of course macro package dependent as
-- is the 999 attribute

\def\nolig#1{\begingroup\attribute999=1\relax#1\endgroup}
e\nolig{ff}e
e\nolig{ffi}cient
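
The splice done by nolig can be tried outside LuaTeX with plain tables standing in for nodes. What follows is only a sketch under that assumption: glyph, link, tail_of, expand and chars are hypothetical helpers, the nolig field plays the role of attribute 999, and nothing in it calls the real node library.

```lua
-- Mock nodes: plain tables with next/prev/components fields.
-- They only stand in for LuaTeX glyph nodes (assumption for illustration).
local function glyph(char, nolig)
  return { char = char, nolig = nolig }
end

-- Link mock nodes into a doubly linked list and return its head.
local function link(...)
  local nodes = { ... }
  for i = 2, #nodes do
    nodes[i - 1].next = nodes[i]
    nodes[i].prev = nodes[i - 1]
  end
  return nodes[1]
end

-- Return the last node of a list (the role node.slide plays for real nodes).
local function tail_of(n)
  while n.next do n = n.next end
  return n
end

-- Replace each flagged ligature by its components, including nested ones.
local function expand(head)
  local current = head
  while current do
    local n = current.next
    local c = current.components
    if current.nolig and c then
      local t = tail_of(c)
      local p = current.prev
      if p then
        p.next = c
        c.prev = p
      else
        head = c
        c.prev = nil
      end
      if n then
        t.next = n
        n.prev = t
      end
      current.components = nil
      n = c -- revisit the first component: it may itself be a ligature
    end
    current = n
  end
  return head
end

-- Collect the characters of a list into a space-separated string.
local function chars(head)
  local out = {}
  while head do
    out[#out + 1] = head.char
    head = head.next
  end
  return table.concat(out, " ")
end

-- "ffi" as a nested ligature: (ff) + i, where ff is itself f + f.
local ff = glyph("ff", true)
ff.components = link(glyph("f", true), glyph("f", true))
local ffi = glyph("ffi", true)
ffi.components = link(ff, glyph("i", true))

local head = link(glyph("e", false), ffi, glyph("c", false))
print(chars(expand(head)))  --> e f f i c
```

With real nodes the same steps are carried out by node.slide and node.free as in the function above, and the place to hook it in still depends on the macro package.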
Hans
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
| www.pragma-pod.nl
-----------------------------------------------------------------