[pdftex] Revisiting (About CJKbookmarks)

Ross Moore ross at ics.mq.edu.au
Thu Mar 2 04:28:40 CET 2006


Hi Heiko,

On 23/02/2006, at 8:42 PM, Heiko Oberdiek wrote:

> On Thu, Feb 23, 2006 at 12:24:10PM +1100, Ross Moore wrote:
>
>> I want to revisit this thread, but in connection with placing
>> mathematical symbols into bookmarks.
>
> \texorpdfstring, \pdfstringdefDisableCommands

Thanks for these.
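
For the record, a minimal sketch of how I understand those two hooks
to be used. The section title is made up for illustration; only
\texorpdfstring and \pdfstringdefDisableCommands come from hyperref,
and I am assuming that hyperref just drops the math shifts (with a
warning) when it builds the bookmark string.

   \documentclass{article}
   \usepackage{hyperref}
   % Applied to every bookmark string: give \lambda a plain-text meaning.
   \pdfstringdefDisableCommands{\def\lambda{lambda}}
   \begin{document}
   % Per-title override: 1st argument is typeset, 2nd goes to the bookmark.
   \section{The \texorpdfstring{$\alpha$}{alpha}-rule and $\lambda$-terms}
   Text.
   \end{document}
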
My problem was not so much *where* to make alternate definitions,
but *what* these should be for Unicode strings.

Indeed, for my application it might be useful to have a macro:

   \texorpdforXMLorHTMLorliteral

in which appropriate \catcode changes were made with each
variant of the macro-expansion.    :-)


>> So if I want to replace the strings 'lambda', 'alpha', 'omega', etc.
>> by appropriate unicode representations,
>>
>>  a.  what needs to go into the .out file ?
>>
>>  b.  what else needs to be done ?
>>       e.g.  options to hyperref, or \hypersetup
>
>
> Many Greek letters are already supported, given as \text... macros.
>
> \usepackage[unicode]{hyperref}
>
> \pdfstringdefDisableCommands{%
>   \let\lambda\textlambda
>   \let\alpha\textalpha
>   \let\omega\textomega
>   % etc.
> }

OK. It's the double-octal notation used for Unicode strings
that I'd not encountered before. Thanks for the heads-up.
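
To make the notation concrete: as far as I can tell, each character
becomes a big-endian UTF-16 code unit written as two bytes, with
non-printable bytes given as octal escapes, after a \376\377
byte-order mark. A worked sketch for the three letters above; the
octal arithmetic is straightforward, but the layout of the surrounding
\BOOKMARK line varies between hyperref versions and is only indicative.

   % U+03BB (lambda) = bytes 03 BB = \003\273
   % U+03B1 (alpha)  = bytes 03 B1 = \003\261
   % U+03C9 (omega)  = bytes 03 C9 = \003\311
   % a section whose title is just the single letter lambda:
   \BOOKMARK [1][-]{section.1}{\376\377\003\273}{}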


This works (so far) in my setting, with the following provisos:

  a.  the .out  file more than doubles in size, and the same
      increase shows up in the PDF.
      But that is only a ~5 kB increase, so no big deal really.

Presumably this could be reduced by using Unicode only for
those bookmarks that really need it.


  b.  the loading of  puenc.def  causes a macro-name clash,
      with those math-authors who like to define \C
      as a shorthand for \mathbb{C} or  \mathcal{C}
      --- easily fixed, but most annoying.

      Presumably these guys never use cyrillics for Russian
      or Eastern European names in bibliographies.
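
      A sketch of the easy fix, assuming the author's \C is meant
      to be \mathbb{C} and that a plain letter C is acceptable in
      bookmarks (both are choices made for illustration):

         \usepackage{amssymb}
         \usepackage[unicode]{hyperref}
         % \newcommand would complain if puenc.def has already
         % defined \C, so make sure the name exists, then overwrite it.
         \providecommand{\C}{}
         \renewcommand{\C}{\mathbb{C}}
         % Inside bookmark strings, fall back to a plain letter.
         \pdfstringdefDisableCommands{\def\C{C}}

      This gives up the \C text accent, which is exactly what
      those authors were not using anyway.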


> But this is not the problem with math.
> Bookmarks are not typesetted areas, they are just text strings.
>
>>  c.  what version of pdfTeX is needed ?
>
> I don't see a dependency from the pdfTeX version.

OK. That's nice to know.

>
>>  d.  what actual font will be used in the PDF browser ?
>>      Do I need to supply font subsets inside the .pdf file ?
>
> No, the fonts are not taken from the .pdf file but from the
> system, where the pdf browser is installed.

Yep; I got that impression from another reply.
It'd be nice to be able to de-reference a stream for this.
But if that's not in the PDF spec, then too bad.


>> Also,
>>   Is it possible to use different typefaces ?
>
> AFAIK you can use color or bold/italic for the whole string.

And you intend to work on providing support for this, right ?
That's something that I could make some use of.

The Adobe document for the PDF 1.6 specs  shows what is needed
for colours and faces (italic and/or bold) in bookmarks.
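
My reading of the spec is that an outline item takes an optional /C
entry (three RGB components between 0 and 1) and an optional /F flags
entry (1 = italic, 2 = bold, 3 = both). A minimal sketch using
pdfTeX's own primitives and bypassing hyperref altogether; the
destination name and the title are invented for the example, and I
have not checked how each browser renders the result.

   % plain pdfTeX
   \pdfoutput=1
   % named destination at the top of the page
   \pdfdest name {intro} xyz
   % red, bold bookmark pointing at that destination
   \pdfoutline attr {/C [1 0 0] /F 2} goto name {intro} {Introduction}
   Some text.
   \bye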

However, the same document actually has a logo-image in each
of its own bookmarks!  How did they do that ?


>
>>   Can super/sub-scripts be supported in bookmarks ?
>
> Except for a few letters (twosuperior, ...) no.

Understandable, if a single string is all that's allowed.
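
For those few letters a per-title replacement already does the job.
A sketch, assuming \texttwosuperior (U+00B2) is provided by the
bookmark encoding in use; the title is invented:

   \usepackage[unicode]{hyperref}
   % in the document body:
   \section{The space \texorpdfstring{$L^2$}{L\texttwosuperior}}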

However, there are raised and lowered letters in the
"Phonetic Extensions" area, and elsewhere.

I've now made use of these, to produce raised superscripts in
mathematics used for titles, etc. when the superscript contains only:

     a.  letters,  excluding  fqzCFQSVXYZ
 or  b.  digits 0-9
 or  c.  symbols  + - = ( )
 or  d.  punctuation , .  (i.e. comma or full stop).

Similarly for subscripts, using just the characters in  b. and/or c.
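
For the digits and symbols the obvious targets are the Latin-1 and
"Superscripts and Subscripts" code points. The table below is a
sketch of that mapping, not a listing of the actual macros, with two
sample strings in the octal notation of the .out file:

   %              superscript                subscript
   % 0            U+2070                     U+2080
   % 1 2 3        U+00B9 U+00B2 U+00B3       U+2081 .. U+2083
   % 4 .. 9       U+2074 .. U+2079           U+2084 .. U+2089
   % + - = ( )    U+207A .. U+207E           U+208A .. U+208E
   %
   % x^2  -->  \376\377\000x\000\262     (U+00B2 = bytes 00 B2)
   % x_0  -->  \376\377\000x\040\200     (U+2080 = bytes 20 80)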

The TeX coding to achieve this makes slight patches to some
hyperref methods:

   \HyPsd@@RemoveBraces      to retain markers of bracings
   \HyPsd@CatcodeWarning     to retain ^ and _
   \HyPsd@ConvertToUnicode   to allow some extra post-processing
                             before converting to Unicode

as well as adding new post-processing methods prior to
using \HyPsd@ConvertToUnicode :

   \raise@BracedSupscripts         handles ^{...}
   \remove@falseBracePairs         removes any left-over brace markers

and methods added via the \pdfstringdefPostHook :

   \replaceSupAst         ^* becomes just *
   \replaceSupscript      handles non-braced ^
   \replaceSubscript      handles non-braced _

as well as many macro re-definitions, via   
\pdfstringdefDisableCommands .


Some examples can be seen in the attached PNG snapshot images
--- if they make it through the list-server.
These show superscripts, subscripts and some exotic math-symbols.
(The PDF browser is Apple's 'Preview'.)


>
> It is possible that some pdf browsers support some own methods.
> xpdf seems to use "pango" for the bookmarks, whatever this means.

  No idea.

>
> Yours sincerely
>   Heiko <oberdiek at uni-freiburg.de>

Thanks, as always, for your help

	Ross

------------------------------------------------------------------------
Ross Moore                                         ross at maths.mq.edu.au
Mathematics Department                             office: E7A-419
Macquarie University                               tel: +61 +2 9850 8955
Sydney, Australia  2109                            fax: +61 +2 9850 8114
------------------------------------------------------------------------
