[luatex] PDF strings.

Heiko Oberdiek heiko.oberdiek at googlemail.com
Sun Nov 28 18:47:05 CET 2010


On Sun, Nov 28, 2010 at 04:51:55PM +0100, Paul Isambert wrote:

> As you may know, PDF reads strings encoded in either its own scheme,
> or in UTF-16.
> My problem is I want accented characters in bookmarks, e.g.:
> 
> \pdfoutline goto name {there}{Héhé}
> 
> I can't feed PDF-encoded strings directly (LuaTeX would complain),
> but PDF also understands \nnn-encoded characters, where \nnn is a
> number in base-8, so I actually do
> 
> \pdfoutline goto name {there}{\octal{Héhé}}
> 
> where \octal calls Lua to convert each character into \nnn, so that
> LuaTeX actually reads (the backslash has catcode 12):
> 
> \pdfoutline goto name {there}{\110\351\150\351}
> 
> which works fine, except the numbers are Unicode code points, which
> the PDF encoding doesn't follow exactly, so I also need a mapping
> from Unicode to the PDF encoding, and then to octal.

If hyperref is loaded, you can use \pdfstringdef.
Otherwise the package `stringenc' might help; it supports
PDFDocEncoding and UTF-16. But I think I haven't implemented
big char support in it yet (chars with character code > 255
for LuaTeX and XeTeX); that is done in hyperref. If this is
true, then the package might still help you with some of the
conversion steps.
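
To make the conversion steps concrete, here is a minimal sketch of the
chain Paul describes (Unicode -> PDFDocEncoding -> octal). It is written
in Python purely for illustration; inside LuaTeX the same logic would
live in Lua. The names PDFDOC_SPECIAL, pdfdoc_byte and to_pdfdoc_octal
are made up for this example, and the table is only a small excerpt of
the positions where PDFDocEncoding departs from Latin-1. It is not how
hyperref or stringenc do it.

```python
# Partial, illustrative excerpt: code points whose PDFDocEncoding byte
# differs from the code point itself. Real use needs the complete table.
PDFDOC_SPECIAL = {
    0x2022: 0x80,  # bullet
    0x2020: 0x81,  # dagger
    0x2026: 0x83,  # horizontal ellipsis
    0x2014: 0x84,  # em dash
    0x2013: 0x85,  # en dash
    0x0152: 0x96,  # OE ligature
    0x0153: 0x9C,  # oe ligature
    0x20AC: 0xA0,  # euro sign
}


def pdfdoc_byte(char):
    """Return the PDFDocEncoding byte for one character, or raise ValueError."""
    cp = ord(char)
    if cp in PDFDOC_SPECIAL:
        return PDFDOC_SPECIAL[cp]
    # Printable ASCII and most of the Latin-1 upper half keep their code point
    # (0xA0 is the euro sign and 0xAD is undefined in PDFDocEncoding).
    if 0x20 <= cp <= 0x7E or (0xA1 <= cp <= 0xFF and cp != 0xAD):
        return cp
    raise ValueError(f"no PDFDocEncoding byte known for U+{cp:04X}")


def to_pdfdoc_octal(text):
    """Encode text as a run of three-digit octal escapes, one per byte."""
    return "".join(f"\\{pdfdoc_byte(c):03o}" for c in text)


print(to_pdfdoc_octal("Héhé"))  # \110\351\150\351
```

The result still has to reach \pdfoutline with the backslash at
catcode 12, as in Paul's example. Characters missing from the table
raise an error here; they would be candidates for UTF-16 instead.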

> The other solution is UTF-16BE, which can't be fed directly to
> LuaTeX (you can't use \nnn to represent bytes). I.e., you can't have:
> 
> \pdfoutline goto name {there}{\sixteen{Héhé}}
> 
> where \sixteen would return a string encoded in UTF-16BE, because
> LuaTeX will complain about non-UTF-8 characters. You could do that
> directly in Lua, but there's no pdf.outline primitive:
> 
> pdf.outline(to_utf16("Héhé"))

You can still use \nnn, because this notation is restricted
to plain ASCII ('\', '0', ..., '7'). Of course each \nnn
stands for a single byte only, not for a character.
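
As a rough sketch of what that means for UTF-16BE: encode the string,
prepend the byte order mark (bytes 254 and 255, which PDF uses to
recognise a UTF-16BE text string), and write every resulting byte as
\nnn. The example is in Python only for illustration, and the name
to_utf16be_octal is hypothetical; a Lua version for LuaTeX would
follow the same steps.

```python
import codecs


def to_utf16be_octal(text):
    """Return text as UTF-16BE with a BOM, every byte as an octal escape."""
    data = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
    # One three-digit octal escape per byte, so the output is plain ASCII.
    return "".join(f"\\{byte:03o}" for byte in data)


print(to_utf16be_octal("Héhé"))
# \376\377\000\110\000\351\000\150\000\351
```

Each character of "Héhé" becomes two escapes here because each one
fits in a single 16-bit unit; characters outside the BMP would take
four.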

Yours sincerely
  Heiko Oberdiek

