[luatex] PDF strings.
Paul Isambert
zappathustra at free.fr
Sun Nov 28 16:51:55 CET 2010
Hello all,
As you may know, PDF reads strings encoded either in its own scheme
(PDFDocEncoding) or in UTF-16.
My problem is I want accented characters in bookmarks, e.g.:
\pdfoutline goto name {there}{Héhé}
I can't feed PDF-encoded strings directly (LuaTeX would complain), but
PDF also understands \nnn-encoded characters, where \nnn is a number in
base-8, so I actually do
\pdfoutline goto name {there}{\octal{Héhé}}
where \octal calls Lua to convert each character into \nnn, so that
LuaTeX actually reads (the backslash has catcode 12):
\pdfoutline goto name {there}{\110\351\150\351}
which works fine, except that the numbers are Unicode code points, which
the PDF encoding doesn't follow exactly, so I also need a mapping from
Unicode to the PDF encoding before converting to octal.
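A minimal sketch of what the Lua side of such an \octal macro could look
like. The names utf8_codepoints and to_octal are hypothetical (the
message does not show the actual code), the UTF-8 decoding is done by
hand so that it runs in plain Lua, and the Unicode-to-PDF-encoding
mapping mentioned above is deliberately left out: code points above 255
simply raise an error.

```lua
-- Decode a UTF-8 string into a list of Unicode code points (plain Lua).
local function utf8_codepoints(s)
  local cps, i = {}, 1
  while i <= #s do
    local b = s:byte(i)
    local cp, extra
    if b < 0x80 then cp, extra = b, 0
    elseif b >= 0xC0 and b < 0xE0 then cp, extra = b - 0xC0, 1
    elseif b >= 0xE0 and b < 0xF0 then cp, extra = b - 0xE0, 2
    elseif b >= 0xF0 and b < 0xF8 then cp, extra = b - 0xF0, 3
    else error("invalid UTF-8 lead byte at position " .. i) end
    for j = 1, extra do
      local c = s:byte(i + j)
      if not c or c < 0x80 or c >= 0xC0 then
        error("invalid UTF-8 continuation byte at position " .. (i + j))
      end
      cp = cp * 64 + (c - 0x80)
    end
    cps[#cps + 1] = cp
    i = i + extra + 1
  end
  return cps
end

-- Hypothetical helper: turn every character into a \nnn octal escape.
-- Assumption: no remapping to the PDF encoding is done here, so only
-- code points that fit in one byte are accepted.
function to_octal(str)
  local out = {}
  for _, cp in ipairs(utf8_codepoints(str)) do
    if cp > 255 then
      error(string.format("U+%04X does not fit in one byte", cp))
    end
    out[#out + 1] = string.format("\\%03o", cp)
  end
  return table.concat(out)
end

-- "H\195\169h\195\169" is "Héhé" spelled out as UTF-8 bytes.
print(to_octal("H\195\169h\195\169"))
```

For "Héhé" this yields the \110\351\150\351 sequence shown above. Inside
LuaTeX the result would still have to be handed back to TeX with the
backslash at catcode 12, as described in the message.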
Even though \nnn can represent numbers up to 511 (octal 777), only the
first 256 values are understood (beyond that, the encoding seems to
repeat itself, i.e. \nnn is understood modulo 256), because the PDF
encoding is a single-byte encoding covering roughly Latin-1. For my
purpose (French) that's OK, but I guess some people must be somewhat
annoyed...
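The arithmetic behind that observation can be checked in a few lines of
Lua. The function name effective_byte is hypothetical, and the "modulo
256" behaviour is taken from the observation above rather than verified
against any particular PDF reader.

```lua
-- Value a reader ends up with for a three-digit octal escape,
-- assuming the overflow is simply dropped (i.e. taken modulo 256).
function effective_byte(octal_digits)
  return tonumber(octal_digits, 8) % 256
end

print(tonumber("777", 8))     -- 511, the largest three-digit octal value
print(effective_byte("451"))  -- 41: octal 451 is 297, and 297 - 256 = 41
print(effective_byte("051"))  -- 41: the same byte without the overflow
```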
The other solution is UTF-16BE, which can't be fed directly to LuaTeX
(you can't use \nnn to represent bytes). I.e., you can't have:
\pdfoutline goto name {there}{\sixteen{Héhé}}
where \sixteen would return a string encoded in UTF-16BE, because LuaTeX
will complain about non-UTF-8 characters. You can do that directly in
Lua, but there's no pdf.outline primitive:
pdf.outline(to_utf16("Héhé"))
(See http://www.ntg.nl/pipermail/dev-luatex/2007-December/001175.html
for a three-year-old discussion.)
Anyway that would solve a symptom rather than the problem.
A solution is
function outline_title(str)
  tex.sprint(pdf.immediateobj("(" .. to_utf16(str) .. ")"))
end
and then
\pdfoutline attr {/Title \directlua{outline_title("Héhé")} 0 R } goto name {there}{}
It works, but it produces a bookmark with two /Title entries; the second
one (the one we want) is selected, but that's not good PDF. (In any case,
\pdfoutline can't be used as is.)
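The to_utf16 helper used above is not shown in this message. A minimal
sketch of such a conversion follows, under a few assumptions: the names
utf8_codepoints and to_utf16be_hex are hypothetical, the UTF-8 decoding
is done by hand so that the code runs in plain Lua, and the result is
returned as hexadecimal digits (meant for a PDF hex string <...>) rather
than as raw bytes, which avoids having to escape parentheses,
backslashes and line-end bytes inside a literal (...) string.

```lua
-- Decode a UTF-8 string into a list of Unicode code points (plain Lua).
local function utf8_codepoints(s)
  local cps, i = {}, 1
  while i <= #s do
    local b = s:byte(i)
    local cp, extra
    if b < 0x80 then cp, extra = b, 0
    elseif b >= 0xC0 and b < 0xE0 then cp, extra = b - 0xC0, 1
    elseif b >= 0xE0 and b < 0xF0 then cp, extra = b - 0xE0, 2
    elseif b >= 0xF0 and b < 0xF8 then cp, extra = b - 0xF0, 3
    else error("invalid UTF-8 lead byte at position " .. i) end
    for j = 1, extra do
      local c = s:byte(i + j)
      if not c or c < 0x80 or c >= 0xC0 then
        error("invalid UTF-8 continuation byte at position " .. (i + j))
      end
      cp = cp * 64 + (c - 0x80)
    end
    cps[#cps + 1] = cp
    i = i + extra + 1
  end
  return cps
end

-- Hypothetical helper: UTF-16BE with a leading byte order mark,
-- written as uppercase hexadecimal digits.
function to_utf16be_hex(str)
  local out = { "FEFF" }  -- byte order mark
  for _, cp in ipairs(utf8_codepoints(str)) do
    if cp < 0x10000 then
      out[#out + 1] = string.format("%04X", cp)
    else
      -- Characters outside the BMP become a surrogate pair.
      local v = cp - 0x10000
      local hi = 0xD800 + math.floor(v / 0x400)
      local lo = 0xDC00 + v % 0x400
      out[#out + 1] = string.format("%04X%04X", hi, lo)
    end
  end
  return table.concat(out)
end

-- "H\195\169h\195\169" is "Héhé" spelled out as UTF-8 bytes.
print(to_utf16be_hex("H\195\169h\195\169"))  -- FEFF004800E9006800E9
```

With this variant, the call in outline_title would presumably become
pdf.immediateobj("<" .. to_utf16be_hex(str) .. ">"); that substitution
is an assumption and has not been tested in LuaTeX here.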
Ok, if you've read up to here and aren't too bored, congratulations, and
then: any suggestions?
And to our dear developers: any plan to make LuaTeX write everything in
UTF-16 when necessary?
Best,
Paul