[luatex] information about ligatures
Stephan Hennig
mailing_list at arcor.de
Fri Jan 3 18:07:25 CET 2014
On 31.12.2013 09:37, Paul Isambert wrote:
> Stephan Hennig <mailing_list at arcor.de> wrote:
>
> Ligatures are char nodes (id 37) with special subtype 2, and they have
> a “components” field which is a nodelist containing the ligature’s
> components.
I have already read about subtype 2 and the components field, but I have
never seen a glyph node of that subtype in pre_linebreak_filter.
Instead, I see glyph nodes of subtype 256 corresponding to standard
Unicode ligatures, e.g., 0xfb02 (fl). That is, bit 8 is set in the
subtype, for which I can't find any documentation. For that reason, I
had never checked the 'components' field, but it is indeed there. Thanks!
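To make the bit arithmetic explicit, here is a minimal sketch in plain Lua (no LuaTeX needed) that lists which bits are set in a subtype value. The helper names set_bits and has_bit are made up for this illustration, and bits are counted from 0. The sketch says nothing about what the bits mean; it only confirms that 256 has bit 8 set and that the quoted subtype 2 has bit 1 set.

```lua
-- Return the 0-based positions of all bits set in a subtype value.
-- Plain arithmetic is used so this runs on any Lua version.
local function set_bits(subtype)
  local bits = {}
  local pos = 0
  while subtype > 0 do
    if subtype % 2 == 1 then bits[#bits + 1] = pos end
    subtype = math.floor(subtype / 2)
    pos = pos + 1
  end
  return bits
end

-- True if bit `pos` (0-based) is set in `subtype`.
local function has_bit(subtype, pos)
  return math.floor(subtype / 2^pos) % 2 == 1
end

print(table.concat(set_bits(256), ','))  -- subtype seen on top-level glyphs: 8
print(table.concat(set_bits(2), ','))    -- subtype quoted for ligatures: 1
```

Inside a filter, has_bit(n.subtype, 1) would then be one way to test for the subtype described in the quoted reply, assuming that description holds for the LuaTeX version in use.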
Attached is a tiny node list printer. It hooks into
pre_linebreak_filter and prints the type and subtype of each node in a
list and some more information for glyph and disc nodes on the next
line. Here's the beginning of the node list corresponding to the word
'flavour':
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: fl 0XFB02 components: t left: 2 right: 3 lang: 0 font: 16
> [node]   glyph subtype: 0 next: t prev: n
> [node]   +char: f 0X66 components: n left: 2 right: 3 lang: 0 font: 16
> [node]   glyph subtype: 0 next: n prev: t
> [node]   +char: l 0X6C components: n left: 2 right: 3 lang: 0 font: 16
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: a 0X61 components: n left: 2 right: 3 lang: 0 font: 16
> [node] kern subtype: 1 next: t prev: t
In fact, all top-level glyph nodes seem to be of subtype 256 in
pre_linebreak_filter. What does that mean? (You can find the full node
list corresponding to TeX input 'flavour specific office trick' at the
end of this mail. With a proper font, the ck ligature is also present
there.)
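Since the 'components' field is present regardless of the subtype value, a ligature can be resolved by following that field instead of testing the subtype. The following is only a sketch in plain Lua: the "nodes" are ordinary tables that model just the fields char, next and components, and the helpers glyph, link and base_chars are hypothetical names. It rebuilds the 'fl' + 'a' beginning of 'flavour' and collects the underlying characters.

```lua
-- Mock nodes: plain tables standing in for LuaTeX glyph nodes.
-- Only the fields used here (char, next, components) are modelled.
local function glyph(char, components)
  return { char = char, components = components }
end

-- Link mock nodes through their `next` fields and return the head.
local function link(...)
  local nodes = { ... }
  for i = 1, #nodes - 1 do nodes[i].next = nodes[i + 1] end
  return nodes[1]
end

-- Collect the base characters of a list, descending into `components`
-- so that nested ligatures (e.g. ffi built from ff) are resolved too.
local function base_chars(head, out)
  out = out or {}
  local n = head
  while n do
    if n.components then
      base_chars(n.components, out)
    else
      out[#out + 1] = n.char
    end
    n = n.next
  end
  return out
end

-- 'fl' ligature (U+FB02) with components f and l, followed by 'a'.
local fl = glyph(0xFB02, link(glyph(0x66), glyph(0x6C)))
local head = link(fl, glyph(0x61))
local chars = base_chars(head)  -- code points of f, l, a
```

In a real filter the loop would run over node.traverse(head) and should check n.id against node.id('glyph') before touching 'components', since other node types do not carry that field.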
Can somebody please provide TeX input that results in a glyph node with
bit 1 of subtype set?
> Note that you should also consider discretionary nodes; and
> “pre_linebreak_filter” will not catch ligatures in boxes (use
> “hpack_filter” for that).
Yeah, I am aware of that.
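For reference, covering boxes as well would only need a second registration. This is a sketch, not part of the attached module: show_list and register_filters are placeholder names, the description strings are arbitrary, and the body of show_list is left as a comment where the attached print_node_list would be called. It assumes luatexbase is loaded, as in the test file below.

```lua
-- Callback shared by both filters. Returning true tells LuaTeX that
-- the node list was left unchanged.
local function show_list(head)
  -- print_node_list(head, 0) from the attached module would go here.
  return true
end

-- Register the same function for paragraphs and for explicit boxes.
-- `luatexbase` is looked up only when this function is called.
local function register_filters()
  luatexbase.add_to_callback('pre_linebreak_filter', show_list,
                             'print_node.pre_linebreak')
  luatexbase.add_to_callback('hpack_filter', show_list,
                             'print_node.hpack')
end
```

hpack_filter passes additional arguments after head (group code, size, pack type); they are simply ignored here.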
Happy new year!
Stephan Hennig
> This is LuaTeX, Version beta-0.76.0-2013120414 (rev 4627) (format=lualatex 2013.12.11) 3 JAN 2014 18:05
> [...]
> [node] whatsit subtype: 6 next: t prev: t
> [node] hlist subtype: 3 next: t prev: t
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: fl 0XFB02 components: t lang: 0 font: 16
> [node] glyph subtype: 0 next: t prev: n
> [node] +char: f 0X66 components: n lang: 0 font: 16
> [node] glyph subtype: 0 next: n prev: t
> [node] +char: l 0X6C components: n lang: 0 font: 16
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: a 0X61 components: n lang: 0 font: 16
> [node] kern subtype: 1 next: t prev: t
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: v 0X76 components: n lang: 0 font: 16
> [node] kern subtype: 1 next: t prev: t
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: o 0X6F components: n lang: 0 font: 16
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: u 0X75 components: n lang: 0 font: 16
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: r 0X72 components: n lang: 0 font: 16
> [node] glue subtype: 0 next: t prev: t
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: s 0X73 components: n lang: 0 font: 16
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: p 0X70 components: n lang: 0 font: 16
> [node] kern subtype: 1 next: t prev: t
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: e 0X65 components: n lang: 0 font: 16
> [node] disc subtype: 3 next: t prev: t
> [node] +pre
> [node] glyph subtype: 0 next: n prev: t
> [node] +char: - 0X2D components: n lang: 0 font: 16
> [node] +post
> [node] +replace
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: c 0X63 components: n lang: 0 font: 16
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: i 0X69 components: n lang: 0 font: 16
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: fi 0XFB01 components: t lang: 0 font: 16
> [node] glyph subtype: 0 next: t prev: n
> [node] +char: f 0X66 components: n lang: 0 font: 16
> [node] glyph subtype: 0 next: n prev: t
> [node] +char: i 0X69 components: n lang: 0 font: 16
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: c 0X63 components: n lang: 0 font: 16
> [node] glue subtype: 0 next: t prev: t
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: o 0X6F components: n lang: 0 font: 16
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: ffi 0XFB03 components: t lang: 0 font: 16
> [node] glyph subtype: 0 next: t prev: n
> [node] +char: ff 0XFB00 components: t lang: 0 font: 16
> [node] glyph subtype: 0 next: t prev: n
> [node] +char: f 0X66 components: n lang: 0 font: 16
> [node] disc subtype: 3 next: t prev: t
> [node] +pre
> [node] glyph subtype: 0 next: n prev: t
> [node] +char: - 0X2D components: n lang: 0 font: 16
> [node] +post
> [node] +replace
> [node] glyph subtype: 0 next: n prev: t
> [node] +char: f 0X66 components: n lang: 0 font: 16
> [node] glyph subtype: 0 next: n prev: t
> [node] +char: i 0X69 components: n lang: 0 font: 16
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: c 0X63 components: n lang: 0 font: 16
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: e 0X65 components: n lang: 0 font: 16
> [node] glue subtype: 0 next: t prev: t
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: t 0X74 components: n lang: 0 font: 16
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: r 0X72 components: n lang: 0 font: 16
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: i 0X69 components: n lang: 0 font: 16
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: c 0X63 components: n lang: 0 font: 16
> [node] kern subtype: 1 next: t prev: t
> [node] glyph subtype: 256 next: t prev: t
> [node] +char: k 0X6B components: n lang: 0 font: 16
> [node] penalty subtype: 0 next: t prev: t
> [node] glue subtype: 15 next: n prev: t
-------------- next part --------------
local unicode = require('unicode')

local Nid = node.id
local Ntraverse = node.traverse
local Ntype = node.type
local Sformat = string.format
local Srep = string.rep
local Uchar = unicode.utf8.char

local M = {}

local err, warn, info, log = luatexbase.errwarinf('print_node')

-- Table of functions printing detailed node information.
local print_node_details

-- A string one can grep for in the log file.
local grep_prefix = '[node] '

local function print_node_list(head, indent)
  local grep_indent = grep_prefix .. Srep(' ', indent)
  -- Traverse node list.
  for n in Ntraverse(head) do
    -- Print general node information.
    texio.write(Sformat('%s%-12s subtype: %3d next: %1s prev: %1s\n',
                        grep_indent, Ntype(n.id), n.subtype,
                        n.next and 't' or 'n', n.prev and 't' or 'n'))
    -- Print detailed node information.
    if print_node_details[n.id] then print_node_details[n.id](n, indent) end
  end
end

print_node_details = {
  [Nid('glyph')] = function(n, indent)
    local grep_indent = grep_prefix .. Srep(' ', indent)
    texio.write(Sformat('%s+char: %s %#-8X components: %1s lang: %3d font: %3d\n',
                        grep_indent, Uchar(n.char), n.char,
                        n.components and 't' or 'n', n.lang, n.font))
    -- Ligature components?
    if n.components then print_node_list(n.components, indent+2) end
  end,
  [Nid('disc')] = function(n, indent)
    local grep_indent = grep_prefix .. Srep(' ', indent)
    texio.write(Sformat('%s+pre\n', grep_indent))
    print_node_list(n.pre, indent+2)
    texio.write(Sformat('%s+post\n', grep_indent))
    print_node_list(n.post, indent+2)
    texio.write(Sformat('%s+replace\n', grep_indent))
    print_node_list(n.replace, indent+2)
  end,
}

local function __cb_pre_linebreak_filter(head, groupcode)
  print_node_list(head, 0)
  return true
end

local function register_filter()
  luatexbase.add_to_callback('pre_linebreak_filter', __cb_pre_linebreak_filter, 'print_node')
end
M.register_filter = register_filter

return M
-------------- next part --------------
\listfiles
\RequirePackage{luatexbase-mcb}
\documentclass{article}
\usepackage{fontspec}
%\setmainfont{Unifraktur Maguntia}
% available at http://www.google.com/fonts
\directlua{
local pn = require('print_node')
pn.register_filter()
}
\begin{document}
flavour
specific
office
trick
\end{document}