old TeX accents producing Unicode glyphs (was [XeTeX] Re: XeTeX segmentation faults)

Ross Moore ross at maths.mq.edu.au
Sat Jun 19 13:00:12 CEST 2004

On 15/06/2004, at 9:41 PM, Jonathan Kew wrote:

> Rendering `` and '' as "smart" double quotes is a feature of the .tfm 
> files associated with CM and other standard TeX fonts; it isn't normal 
> behavior for AAT or OpenType fonts. (In principle, they could include 
> ligature rules to do this, but they're not used that way in practice.) 
> When using Unicode fonts, there are separate characters for the 
> various "real" typographic quotes; that's the appropriate thing to 
> use.
> (For existing documents, it would be possible to write clever TeX 
> macros that replace `` with \char"201C, etc., but I really wouldn't 
> recommend it. If you're working with legacy TeX fonts, use the old TeX 
> conventions for quotes, dashes, accents, etc.; but if you're working 
> with Unicode text and fonts, use the proper Unicode characters.)

LaTeX already has a mechanism that allows the old TeX way of requesting
an accent to result in the correct Latin-1 or Unicode glyph.

For the U encoding, which we are using with AAT fonts in XeTeX, the
method is to make definitions as follows:

% handles \={#1} for macron accents with U encoding
  \expandafter\def\csname U\string\=\endcsname#1{%
    \expandafter\@text@composite \csname U\string\=\endcsname#1{\@empty}%
     \@text@composite {\add@accent {9}{#1}}}

% specific cases  \={a}   \={A}
%  replace each ???? below with the appropriate hex-code

% needs lines like the above, for every accented letter that
% may be used, and resides in the AAT font ...

% ... otherwise the following may work for some letters

%% alternative expansions, using composite of 2 glyphs:
% \expandafter\def\csname\string\U\string\=-a\endcsname{a^^^^0304}
% \expandafter\def\csname\string\U\string\=-A\endcsname{A^^^^0304}
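As a rough illustration of the "specific cases" the comments above call
for, here is a minimal sketch for \={a} and \={A}. It assumes XeTeX (for
the ^^^^ notation), a U-encoded AAT font that really contains the
precomposed glyphs U+0101 and U+0100, and that the code sits between
\makeatletter and \makeatother or in a .sty file. The code points are my
choice for the illustration, not taken from the message.

```latex
\makeatletter
% Generic \={...} in the U encoding: use a composite if one is defined,
% otherwise fall back to \add@accent (same definition as above).
\expandafter\def\csname U\string\=\endcsname#1{%
  \expandafter\@text@composite \csname U\string\=\endcsname#1{\@empty}%
  \@text@composite {\add@accent {9}{#1}}}

% Specific cases, mapped to precomposed glyphs (assumed present in the font):
%   U+0101 = a with macron,  U+0100 = A with macron
\expandafter\def\csname\string\U\string\=-a\endcsname{^^^^0101}
\expandafter\def\csname\string\U\string\=-A\endcsname{^^^^0100}
\makeatother
```

With these in place, \={a} in U-encoded text should pick up the macro
named \U\=-a, while a letter without such a macro, say \={x}, still goes
through \add@accent.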

The way this works is that \@text@composite causes LaTeX to check
whether there is a non-trivial macro with a name like  \U\=-a
(the TeXpert method for constructing such a name is shown above,
using \expandafter, \csname and \string!).

If so, it is used; otherwise  \add@accent {9}{#1}  is used, which
constructs the combination of a glyph for the accent positioned
above (or below) the letter passed as #1.
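To see which branch would be taken for a given letter, something like the
following diagnostic may help. It is only a sketch: \CheckMacronComposite
is a hypothetical helper name, not part of LaTeX or XeTeX, and it simply
tests whether the composite macro for the U encoding exists.

```latex
\makeatletter
% \CheckMacronComposite{a} writes a line to the terminal/log saying
% whether a macro named \U\=-a has been defined. (Hypothetical helper.)
\newcommand*\CheckMacronComposite[1]{%
  \@ifundefined{\string\U\string\=-#1}%
    {\typeout{No composite for #1: the \string\add@accent\space fallback is used}}%
    {\typeout{Composite \string\U\string\=-#1\space is defined}}}
\makeatother
```

For example, \CheckMacronComposite{a} reports a composite only if a
definition for \={a} has been made beforehand.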

Clearly if you want the resulting PDF to be searchable for words
containing accents, then you'll need the correct Unicode glyphs,
rather than using the alternative expansion above.

There is also a possible difficulty here when switching between
different AAT fonts, each declared as U-encoded, but actually
supporting different glyph sets.
This difficulty arises because the U (or T1 or OT1) encoding
declaration actually refers to the 'input' encoding for fonts
in the source code of a document, but we are using it in connection
with the output stream.

To get around this robustly, each glyph set for an AAT font really
ought to have a separate encoding designation.
In practice this is effectively an encoding for each AAT font;
  e.g. ULG = Lucida Grande,  UHT = Hoefler Text,  etc.

When a font-switch is used, then the encoding should be changed also.
We also need to check all the places where the "current" encoding
is used, and make appropriate adjustments in .sty or .fd files
written for use of a particular AAT font.

For example, there need to be definitions such as

  \expandafter\def\csname ULG\string\=\endcsname#1{%
    \expandafter\@text@composite \csname ULG\string\=\endcsname#1{\@empty}%
     \@text@composite {\add@accent {9}{#1}}}

and similarly for other accents, when using Lucida Grande,
as well as

  \expandafter\def\csname UHT\string\=\endcsname#1{%
    \expandafter\@text@composite \csname UHT\string\=\endcsname#1{\@empty}%
     \@text@composite {\add@accent {9}{#1}}}

when using Hoefler Text.
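The per-font encodings could also be set up with LaTeX's documented
declaration commands instead of hand-written \csname definitions. The
sketch below assumes XeTeX and uses the ULG and UHT names suggested
above; which composites to declare depends on the glyphs each font
actually has, so the single U+0101 entries are purely illustrative. The
font family and shape declarations needed before \fontencoding{ULG}
can be selected are deliberately left out.

```latex
% Preamble (or a .sty file). Encoding names are the hypothetical ones above.
\DeclareFontEncoding{ULG}{}{}   % glyph set of Lucida Grande
\DeclareFontEncoding{UHT}{}{}   % glyph set of Hoefler Text

% Fallback for \={...}: build the accent from slot 9, as in the
% hand-written definitions.
\DeclareTextAccent{\=}{ULG}{9}
\DeclareTextAccent{\=}{UHT}{9}

% Composites, declared only where the font is assumed to have the glyph.
\DeclareTextCompositeCommand{\=}{ULG}{a}{^^^^0101}
\DeclareTextCompositeCommand{\=}{UHT}{a}{^^^^0101}
```

Keeping one list of composites per encoding means that a letter missing
from one font falls back to the constructed accent in that font only.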

Also, the \add@accent macro should probably be rewritten to produce
the alternative form, e.g.  a^^^^0304 , when this is applicable.
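Rather than patching \add@accent itself, one simple way to try the
combining-character idea is to redefine the accent command for the U
encoding. This is a sketch only, assuming XeTeX and a font whose layout
tables position U+0304 (combining macron) sensibly over the base letter.

```latex
% In the U encoding, typeset \={x} as the letter followed by
% U+0304 COMBINING MACRON, instead of using TeX's \accent primitive.
\DeclareTextCommand{\=}{U}[1]{#1^^^^0304}
```

This definition takes no account of composite macros, so every letter is
typeset as base letter plus combining mark.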

Sorry I don't have the free time, or the compelling need, to write
full packages for particular fonts.
Anyone needing this, please use these ideas and post your packages
back to this list.

> Regards,
> Jonathan

Hopefully someone will find this useful.



Ross Moore                                         ross at maths.mq.edu.au
Mathematics Department                             office: E7A-419
Macquarie University                               tel: +61 +2 9850 8955
Sydney, Australia                                  fax: +61 +2 9850 8114
