[XeTeX] Res: small caps not searcheable

Adam Twardoch list.adam at twardoch.com
Tue Aug 4 18:54:01 CEST 2009

Jonathan Kew wrote:
> My position is that what xetex and xdvipdfmx is doing here is correct.
> XeTeX is determining which glyphs to use, by means of the requested
> OpenType feature. That's its complete responsibility. In order to
> enhance the usability of the PDF it creates (which would print fine
> regardless), xdvipdfmx is creating a CMAP, and it is using the font's
> encoding as its primary source to do this. (If there are unencoded
> glyphs -- as the small caps *ought* to be -- it is supposed to fall back
> on glyph names to try and determine the mapping.) 

I believe this is a rather simplistic approach. At least as an option,
or even as the default behavior, xdvipdfmx could back-track both the
unencoded glyphs _as well as_ the glyphs encoded in the PUA to their
"parent" codepoints by reversing the OpenType Layout lookups. This is
something Adobe have been doing in InDesign for a long time. It's
actually not that hard, either.
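
To illustrate the back-tracking idea, here is a minimal Python sketch. It
assumes the single-substitution lookups have already been read out of the
font as plain {input glyph: output glyph} dictionaries; the function names,
the glyph names and the sample data are hypothetical, and this is not how
InDesign or xdvipdfmx actually implement it.

```python
def reverse_single_substitutions(substitutions):
    """Invert {input_glyph: output_glyph} into {output_glyph: input_glyph}."""
    parents = {}
    for source, target in substitutions.items():
        # If several inputs map to the same output, keep the first one seen.
        parents.setdefault(target, source)
    return parents


def parent_codepoint(glyph, parents, cmap_by_glyph):
    """Walk substitutions backwards until a glyph with a codepoint is found.

    cmap_by_glyph is assumed to hold only non-PUA codepoints.
    Returns None when no encoded ancestor exists.
    """
    seen = set()
    while glyph not in cmap_by_glyph:
        if glyph in seen or glyph not in parents:
            return None  # cycle, or no lookup produces this glyph
        seen.add(glyph)
        glyph = parents[glyph]
    return cmap_by_glyph[glyph]


# Hypothetical data: a small-caps lookup followed by a stylistic alternate.
smcp = {"a": "a.sc", "b": "b.sc"}
salt = {"a.sc": "a.sc.alt"}
parents = reverse_single_substitutions({**smcp, **salt})
cmap_by_glyph = {"a": 0x61, "b": 0x62}

print(hex(parent_codepoint("a.sc.alt", parents, cmap_by_glyph)))
```

Ligature and other many-to-one lookups would need a list of parent
codepoints per glyph rather than a single one; the sketch leaves that out.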

I don't consider PUA mapping of glyphs that are otherwise accessible
only through user-selectable OpenType Layout features a "bug". But I
maintain that a PDF authoring application should treat the PUA as the
last-resort source for inferring the "text value" of a glyph stream.
Non-PUA codepoints should come first, then "parent codepoints" obtained
by reversing OTL lookups, then perhaps glyph names, and PUA codepoints
only as the very last resort.
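
The priority order can be sketched in Python as follows. This is only an
illustration under stated assumptions: the font data is passed in as plain
dictionaries, `parents` is a hypothetical {output glyph: input glyph} map
obtained by reversing the substitution lookups, and `names` is a
hypothetical table standing in for a real glyph-name parser.

```python
def is_pua(codepoint):
    """True for the BMP Private Use Area and the two private-use planes."""
    return (0xE000 <= codepoint <= 0xF8FF
            or 0xF0000 <= codepoint <= 0xFFFFD
            or 0x100000 <= codepoint <= 0x10FFFD)


def text_value(glyph, cmap_by_glyph, parents, names):
    """Pick a codepoint for a glyph, trying the most trustworthy source first."""
    own = cmap_by_glyph.get(glyph)

    # 1. A non-PUA codepoint from the font's own encoding wins outright.
    if own is not None and not is_pua(own):
        return own

    # 2. Otherwise walk the reversed lookups to a non-PUA "parent" codepoint.
    current, seen = glyph, set()
    while current in parents and current not in seen:
        seen.add(current)
        current = parents[current]
        candidate = cmap_by_glyph.get(current)
        if candidate is not None and not is_pua(candidate):
            return candidate

    # 3. Then fall back on what the glyph name implies.
    if glyph in names:
        return names[glyph]

    # 4. Only as the last resort use the PUA codepoint (None if there is none).
    return own


# Hypothetical font data.
cmap_by_glyph = {"a": 0x61, "a.sc": 0xF761, "ornament": 0xE000}
parents = {"a.sc": "a"}
names = {"b.sc": 0x62}

for g in ("a", "a.sc", "b.sc", "ornament"):
    print(g, hex(text_value(g, cmap_by_glyph, parents, names)))
```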



Adam Twardoch
| Language Typography Unicode Fonts OpenType
| twardoch.com | silesian.com | fontlab.net

The illegal we do immediately.
The unconstitutional takes a little longer.
(Henry Kissinger)
