[XeTeX] Anchor names

Heiko Oberdiek heiko.oberdiek at googlemail.com
Sat Nov 5 17:55:53 CET 2011


On Sun, Nov 06, 2011 at 12:57:12AM +0900, Akira Kakuto wrote:

> > > > I have disabled the reencoding of PDF strings to UTF-16 in xdvipdfmx: TL trunk r24508.
> > > > Now
> > > > /D<c3a46e6368c3b872>
> > > > and
> > > > /Names[<c3a46e6368c3b872>7 0 R]
> 
> We could instead make both of the above UTF-16BE with BOM,
> by reencoding both of them. Which do you think is better?

The main problem is that arbitrary byte strings are needed.
Example with a reference to a destination in another file:

\catcode`\{=1
\catcode`\}=2
\pdfpagewidth=100bp
\pdfpageheight=200bp
\shipout\vbox{%
  \kern-1in\relax
  \hbox{%
    \kern-1in\relax
    \vrule width0pt height200bp depth0pt\relax
    % Link annotation at (150bp,50bp)
    \raise130bp\hbox to 0pt{%
       \kern70bp %
       \kern-2bp
       \special{%
         pdf:ann width 4bp height 2bp depth 2bp<<%
           /Type/Annot%
           /foo/ab#abc
           /Subtype/Link%
           /Border[0 0 1]%
           /C[0 0 1]% blue border
           /A<<%
             /S/GoToR%%
             /F(t.tex)%
             /D<66f6f8>% 
             % Result: <66f6f8>, but ** WARNING ** Failed to convert input string toUTF16...
             % /D<c3a46e6368c3b872>%
             % Result: <feff00e4006e0063006800f80072>
           >>%
         >>%
       }%
       \vrule width4bp height2bp depth2bp\relax
       \hss
    }%
  }%
}
\end
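
For readers who want to check the hex strings in the comments above,
here is a minimal Python sketch. It is not part of xdvipdfmx; the
variable names are made up for illustration. It only shows that the
second string is valid UTF-8 and re-encodes to the UTF-16BE result
quoted above, while the first string is not valid UTF-8 at all, which
is presumably why the conversion fails with a warning.

```python
# Hex strings copied from the comments in the TeX example above.
utf8_hex = "c3a46e6368c3b872"
utf16_hex = "feff00e4006e0063006800f80072"
raw_hex = "66f6f8"

# The second string is valid UTF-8.
text = bytes.fromhex(utf8_hex).decode("utf-8")

# Re-encoding it as UTF-16BE with a byte order mark (FE FF)
# gives exactly the result reported in the example.
reencoded = b"\xfe\xff" + text.encode("utf-16-be")
matches_reported_result = reencoded.hex() == utf16_hex

# The first string is an arbitrary byte string, not UTF-8:
# the byte F6 can never occur in well-formed UTF-8.
try:
    bytes.fromhex(raw_hex).decode("utf-8")
    raw_is_utf8 = True
except UnicodeDecodeError:
    raw_is_utf8 = False

print(matches_reported_result, raw_is_utf8)
```

In both cases the bytes that end up in the PDF are decided by the
converter, not by the author of the \special.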

It seems that *all* literal strings are affected by the
unwanted reconversions. But the PDF specification leaves no choice:
there are various places where byte strings are required.
In the example, if the file name is the byte string XY and the
destination is Z, then the file name is XY and the destination is Z
and nothing else. Otherwise neither the file nor the destination
will be found.
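
The point about exact matching can be sketched in a few lines of
Python. This is only an illustration under assumptions: the dictionary
`dests` stands in for the name table of the target file, and the
Latin-1 interpretation in the re-encoding step is a hypothetical
choice, not what xdvipdfmx actually does.

```python
# Hypothetical name table of the target PDF: keys are raw byte strings.
dests = {bytes.fromhex("66f6f8"): "page 1"}

def lookup(name: bytes):
    """Return the destination for a name, comparing byte by byte."""
    return dests.get(name)

original = bytes.fromhex("66f6f8")

# Hypothetical re-encoding: read the bytes as Latin-1 and
# write them back as UTF-16BE with a byte order mark.
reencoded = b"\xfe\xff" + original.decode("latin-1").encode("utf-16-be")

print(lookup(original))   # found: the bytes are identical
print(lookup(reencoded))  # not found: the bytes differ
```

Any conversion that changes even one byte turns a working link into a
dangling one.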

Thus either (XeTeX/)xdvipdfmx finds a way to specify arbitrary
byte strings (at least for PDF strings and streams), which is a
requirement of the PDF specification, or we have to conclude
that 8-bit is not supported, and that means US-ASCII only.

Yours sincerely
  Heiko Oberdiek

