[luatex] ActualText attribute for hyphenated words

Patrick Gundlach patrick at gundla.ch
Thu Feb 2 18:43:47 CET 2012


Hello Till,

(Just for the record: this comes from a discussion on tex.sx, http://tex.stackexchange.com/q/43033/243)

> Is it possible/desirable to let the LuaTeX PDF generator automatically tag words which are hyphenated at the end of line with a matching /ActualText attribute (so that the sequence of glyphs "hyphen- ation", for example, is internally represented as the sequence of characters 'hyphenation')? That would make sense from a linguistic viewpoint because the display of a text in a PDF is strictly presentational and may differ from its lexical and grammatical structure. It would also ensure that you can search for and find words in a LuaTeX-generated PDF with almost any viewer.

This might be achievable with LuaTeX's ability to modify a node list after line breaking (e.g. in the post_linebreak_filter callback). But I am not totally sure whether one can get the resulting PDF content stream to look like this:
----------------------------------------
BT
/Span << /ActualText (hyphenation) >> BDC
(hyphen-) Tj
(ation) Tj
EMC
ET
----------------------------------------
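
To make the intended structure a bit more concrete, here is a minimal Python sketch that only builds the text of such a marked-content sequence. It is not LuaTeX code and does not touch the node list. The function names (escape_pdf_literal, actualtext_span) are made up for this illustration, and writing each fragment as a plain literal string followed by Tj is a simplifying assumption: a real content stream would carry font-encoded strings and a positioning operator between the two lines. The /ActualText value is written as a UTF-16BE hex string with a leading byte-order mark, so that non-ASCII words would also work.

```python
def escape_pdf_literal(text):
    """Escape backslashes and parentheses for a PDF literal string."""
    # The backslash must be handled first so later escapes are not doubled.
    return (text.replace("\\", "\\\\")
                .replace("(", "\\(")
                .replace(")", "\\)"))


def actualtext_span(word, fragments):
    """Wrap the shown fragments in one /Span carrying the unhyphenated word.

    Hypothetical helper: `word` is the replacement text, `fragments` are
    the pieces as they appear on the page (assumed to be plain ASCII here).
    """
    # PDF text string: UTF-16BE with BOM, written as a hex string.
    hex_text = (b"\xfe\xff" + word.encode("utf-16-be")).hex().upper()
    lines = ["/Span << /ActualText <%s> >> BDC" % hex_text]
    for fragment in fragments:
        lines.append("(%s) Tj" % escape_pdf_literal(fragment))
    lines.append("EMC")
    return "\n".join(lines)


if __name__ == "__main__":
    print(actualtext_span("hyphenation", ["hyphen-", "ation"]))
```

The important point is that both fragments sit between BDC and EMC, so a viewer that honours /ActualText should see the single word instead of the two pieces.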

Patrick

