[XeTeX] Anchor names

Heiko Oberdiek heiko.oberdiek at googlemail.com
Sat Nov 5 17:55:53 CET 2011

On Sun, Nov 06, 2011 at 12:57:12AM +0900, Akira Kakuto wrote:

> > > > I have disabled the re-encoding of PDF strings to UTF-16 in xdvipdfmx: TL trunk r24508.
> > > > Now
> > > > /D<c3a46e6368c3b872>
> > > > and
> > > > /Names[<c3a46e6368c3b872>7 0 R]
> We can choose to make both of the above UTF16BE with BOM
> by re-encoding both of them. Which do you think is better?

The main problem is that arbitrary byte strings are needed.
Example with a reference to a destination in another file:

    \vrule width0pt height200bp depth0pt\relax
    % Link annotation at (150bp,50bp)
    \raise130bp\hbox to 0pt{%
       \kern70bp %
         pdf:ann width 4bp height 2bp depth 2bp<<%
           /Border[0 0 1]%
           /C[0 0 1]% blue border
             % Result: <66f6f8>, but ** WARNING ** Failed to convert input string to UTF16...
             % /D<c3a46e6368c3b872>%
             % Result: <feff00e4006e0063006800f80072>
       \vrule width4bp height2bp depth2bp\relax

It seems that *all* literal strings are affected by the
unfortunate re-encoding. But the PDF specification leaves no choice:
byte strings are required in various places.
In the example, if the file name is the byte string XY and the destination
is Z, then the file name is XY and the destination is Z, and nothing else.
Otherwise neither the file nor the destination will be found.
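
To make the mismatch concrete, here is a minimal Python sketch (not
xdvipdfmx code; the helper name and the dictionary standing in for a
name tree are assumptions for illustration). It re-encodes the UTF-8
bytes from the example to UTF-16BE with BOM, shows that a byte-exact
lookup then misses, and shows that a byte string that is not valid
UTF-8 cannot be converted at all:

```python
# Hypothetical illustration only; names are not taken from xdvipdfmx.
BOM_UTF16BE = b"\xfe\xff"


def reencode_to_utf16be(raw: bytes) -> bytes:
    """Interpret raw as UTF-8 and return UTF-16BE with a leading BOM.

    Raises UnicodeDecodeError if raw is not valid UTF-8.
    """
    return BOM_UTF16BE + raw.decode("utf-8").encode("utf-16-be")


# The destination name from the example, as the UTF-8 bytes written by the user.
utf8_name = bytes.fromhex("c3a46e6368c3b872")
converted = reencode_to_utf16be(utf8_name)

# A plain dict stands in for the name tree of the *other* file,
# which stores the destination under the original bytes.
names_in_target_file = {utf8_name: "7 0 R"}
found_original = names_in_target_file.get(utf8_name)
found_converted = names_in_target_file.get(converted)

# Arbitrary 8-bit bytes need not be valid UTF-8, so conversion can fail.
latin1_name = bytes.fromhex("66f6f8")
try:
    reencode_to_utf16be(latin1_name)
    conversion_failed = False
except UnicodeDecodeError:
    conversion_failed = True

print(converted.hex(), found_original, found_converted, conversion_failed)
```

The lookup is byte for byte, so the re-encoded key no longer matches
the name stored in the other file.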

Thus either (XeTeX/)xdvipdfmx finds a way to specify arbitrary
byte strings (at least for PDF strings(/streams)), which is a
requirement of the PDF specification, or we have to conclude
that 8-bit is not supported, and that means US-ASCII only.
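
As a small sketch of why the notation itself is not the obstacle: the
PDF hex string syntax can express every byte value using only
US-ASCII characters, provided the bytes are passed through unchanged.
The two Python helpers below are hypothetical and only handle the
simple `<...>` form:

```python
def to_pdf_hex_string(raw: bytes) -> str:
    """Write arbitrary bytes as a PDF hexadecimal string such as <66f6f8>."""
    return "<" + raw.hex() + ">"


def from_pdf_hex_string(token: str) -> bytes:
    """Read a PDF hexadecimal string back into bytes.

    White space between digits is ignored, and an odd number of digits
    is padded with a final 0, as the PDF specification describes.
    """
    if not (token.startswith("<") and token.endswith(">")):
        raise ValueError("not a PDF hex string")
    digits = "".join(token[1:-1].split())
    if len(digits) % 2:
        digits += "0"
    return bytes.fromhex(digits)


# Every one of the 256 byte values survives the round trip unchanged.
all_bytes = bytes(range(256))
round_trip = from_pdf_hex_string(to_pdf_hex_string(all_bytes))
print(round_trip == all_bytes)
```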

Yours sincerely
  Heiko Oberdiek
