Hello Till,

(just for the record: this comes from a discussion on tex.sx: http://tex.stackexchange.com/q/43033/243 )

> Is it possible/desirable to let the LuaTeX PDF generator automatically tag words which are hyphenated at the end of line with a matching /ActualText attribute (so that the sequence of glyphs "hyphen- ation", for example, is internally represented as the sequence of characters 'hyphenation')? That would make sense from a linguistic viewpoint because the display of a text in a PDF is strictly presentational and may differ from its lexical and grammatical structure. It would also ensure that you can search for and find words in a LuaTeX-generated PDF with almost any viewer.

This might be achievable with LuaTeX's ability to modify the node list after line breaking (the post_linebreak_filter callback). But I am not sure whether one can get the page content to look like
/Span << /ActualText (hyphenation) >> BDC
(hyphen-) Tj
(ation) Tj
EMC
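
To make the target output concrete, here is a small stand-alone Python sketch (not LuaTeX code) that builds such a marked-content sequence from the replacement text and the fragments actually shown. The function names pdf_string and actual_text_span are made up for illustration. It assumes plain ASCII text; in a real content stream the Tj strings are in the font's encoding, there would be text positioning operators between the two lines, and a non-ASCII /ActualText would need UTF-16BE with a byte order mark.

```python
def pdf_string(text: str) -> str:
    """Return text as a PDF literal string, escaping backslash and parentheses.

    Assumption: text is plain ASCII (see the note above for other cases).
    """
    escaped = (
        text.replace("\\", "\\\\")
        .replace("(", "\\(")
        .replace(")", "\\)")
    )
    return "(" + escaped + ")"


def actual_text_span(actual: str, fragments: list[str]) -> str:
    """Wrap the shown fragments in a /Span marked-content sequence.

    actual    -- the replacement text a viewer should search and copy
    fragments -- the strings as they are shown, one Tj operator each
    """
    lines = ["/Span << /ActualText " + pdf_string(actual) + " >> BDC"]
    lines += [pdf_string(fragment) + " Tj" for fragment in fragments]
    lines.append("EMC")  # closes the sequence opened by BDC
    return "\n".join(lines)


print(actual_text_span("hyphenation", ["hyphen-", "ation"]))
```

In LuaTeX the two bracketing lines would presumably have to be injected as PDF literal nodes around the affected glyphs, which is the part I am unsure about.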



