[XeTeX] hyperref bookmark problem

Dohyun Kim nomosnomos at gmail.com
Mon Oct 27 13:02:07 CET 2008

2008/10/27 Peter Dyballa <Peter_Dyballa at web.de>:
> The Mapping=tex-text option does its job only in the "text body."
> Hyperref puts a copy of your text aside and handles it itself. Use
> real – or — characters! Discuss this behaviour also with Heiko
> Oberdiek (in case Jonathan Kew does not give an explanation or
> solution)!

Hmm... my explanation was not clear enough. Sorry.

The problem is not that `--' is displayed as `-' and `-', which of
course I can accept.
The real problem is that the characters in the bookmark are not
encoded correctly.
And this problem is not directly related to `mapping=tex-text'.

Let me explain in detail. If I run pdftk to dump the bookmark data,
the result is neither an en-dash nor minus-minus but a strange character:

shell prompt$ pdftk test.pdf dump_data
InfoKey: Creator
InfoValue: LaTeX with hyperref package
InfoKey: Producer
InfoValue: xdvipdfmx (0.7.3)
InfoKey: CreationDate
InfoValue: D:20081027201833+09'00'
NumberOfPages: 1
BookmarkTitle: 1997&#45380;&#133;2001&#45380;
BookmarkLevel: 1
BookmarkPageNumber: 1

As you can see, character 133 (U+0085) is NEXT LINE, not a minus or
an en-dash.

The hyperref package uses the PD1 encoding (see pd1enc.def),
which is actually PDFDocEncoding as defined in the PDF Reference from Adobe.
However, because XeTeX does not take this encoding into account,
some characters beyond the Latin-1 range are displayed incorrectly
in the bookmark.
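
A minimal Python sketch of the mismatch, for illustration only. In
PDFDocEncoding, code 133 (0x85) is the en-dash, while the same number
read as a Unicode code point is the control character NEXT LINE. The
two-entry table below is only a partial excerpt of PDFDocEncoding, and
the helper name decode_pdftk_entities is hypothetical. The sketch
assumes that pdftk prints non-ASCII characters as decimal &#NNN;
references, as in the dump above.

```python
import re
import unicodedata

# Partial excerpt of PDFDocEncoding (PDF Reference, Appendix D);
# only the two dash slots are listed here.
PDFDOC_TO_UNICODE = {
    0x84: "\u2014",  # em dash
    0x85: "\u2013",  # en dash
}


def decode_pdftk_entities(text):
    """Replace decimal &#NNN; references with the raw code points.

    html.unescape is deliberately avoided: it follows HTML5 rules and
    remaps references in the 0x80-0x9F range, which would hide the
    problem being examined.
    """
    return re.sub(r"&#(\d+);", lambda m: chr(int(m.group(1))), text)


# Bookmark title exactly as printed by pdftk in the dump above.
title = decode_pdftk_entities("1997&#45380;&#133;2001&#45380;")

# The character between the two year strings.
suspect = title[5]
print(hex(ord(suspect)), unicodedata.category(suspect))

# What the same number means when read as a PDFDocEncoding byte.
intended = PDFDOC_TO_UNICODE[ord(suspect)]
print(unicodedata.name(intended))
```

The first line of output shows that the character stored in the
bookmark is 0x85 with Unicode category Cc (a control character); the
second shows that the same value, read as PDFDocEncoding, names the
en-dash.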

Above all, because of this wrongly encoded character,
Skim (a popular Mac application for viewing PDF files) does not display
the broken character or any of the string that follows it.

Certainly I can use a real en-dash (U+2013), but then there is no reason
for XeTeX to provide the tex-text mapping feature.

Dohyun Kim
