[XeTeX] XeLaTeX PDF Glitch with Serbian Glyphs

Steve White stevan.white at gmail.com
Tue Sep 17 10:53:08 CEST 2013

Hi Alessandro,

Let me clarify a bit.

The current standard for text encoding, Unicode, does not distinguish
the Serbian Cyrillic glyphs -- they are regarded as a style.  A
stylistic substitution of glyphs can be specified, as you pointed out,
by an OpenType font.

PDF typically contains glyphs extracted from fonts, and supports a
table that maps glyphs to Unicode (the ToUnicode mapping).  With this
mapping, the display application may reconstruct the Unicode text, for
the purpose of copying.  What goes wrong here is that many
PDF-generating libraries fail to fill this table out.  This means the
replacement Serbian glyph is not mapped back to its original Unicode
text.
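
To make the table concrete, here is a rough sketch in Python of the
entries a PDF generator would need to write.  The glyph codes 0x0123
and 0x0456 are made up for illustration, and the function name is my
own; a real ToUnicode CMap also needs header and trailer boilerplate
that is left out here.

```python
def build_tounicode_bfchar(glyph_to_text):
    """Return one bfchar block of a ToUnicode CMap as a string.

    glyph_to_text maps a 16-bit glyph code (int) to the Unicode text
    that a viewer should produce when that glyph is copied.
    """
    lines = [f"{len(glyph_to_text)} beginbfchar"]
    for code, text in sorted(glyph_to_text.items()):
        # The destination is written as UTF-16BE hex digits.
        dst = text.encode("utf-16-be").hex().upper()
        lines.append(f"<{code:04X}> <{dst}>")
    lines.append("endbfchar")
    return "\n".join(lines)


# Hypothetical glyph codes: 0x0123 is the default Cyrillic "be",
# 0x0456 is its Serbian alternate.  Both must map back to U+0431.
print(build_tounicode_bfchar({0x0123: "\u0431", 0x0456: "\u0431"}))
```

If the entry for the alternate glyph is missing, the viewer has nothing
to copy for it, which matches the disappearing characters you saw.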

LibreOffice is putting *styled* text into the clipboard.  That means,
it records which font was used to display the text.  When pasted into
another program that handles styled text (on a system with the same
font installed!!!), the result is as you describe -- substitutions are
carried out as instructed by the font.

However, in neither case are you copying Serbian glyphs.  You're copying
Unicode characters.  With LibreOffice, the clipboard also records that
the text was styled with a certain font.  And the PDF copying failed
because the PDF-generating software failed to record which glyphs
correspond to which Unicode characters.

Some months ago, I made a comparison of PDF generating software in
this regard.  The functionality varies greatly among them.  This
situation could stand improvement.  I advocate that the libraries draw
mapping information from the tables in the font.
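
A minimal sketch of what I mean, in Python.  Plain dictionaries stand
in for the font's character map and for its one-to-one glyph
substitutions; the glyph names are invented, and a real library would
read both tables from the font file.

```python
def glyph_to_unicode(cmap, substitutions):
    """Invert the character map and extend it through substitutions.

    cmap:          dict of code point (int) -> default glyph name
    substitutions: dict of default glyph name -> replacement glyph name
    """
    reverse = {glyph: chr(cp) for cp, glyph in cmap.items()}
    for source, replacement in substitutions.items():
        # A replacement glyph inherits the text of the glyph it replaces,
        # unless the character map already assigns it a character.
        if source in reverse and replacement not in reverse:
            reverse[replacement] = reverse[source]
    return reverse


# Hypothetical font data: "glyph201" is the default "be" (U+0431),
# "glyph877" is the localised form substituted for Serbian.
cmap = {0x0431: "glyph201"}
serbian_substitutions = {"glyph201": "glyph877"}
table = glyph_to_unicode(cmap, serbian_substitutions)
```

Note that nothing here depends on how the glyphs are named; the
mapping follows from the tables alone.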

There is a standard involving the naming of glyphs in a font, which is
used by some PDF-generating libraries to guess the mapping, in
ignorance of the OpenType tables that actually *effect* the mapping.
This is a very crummy band-aid, but it can work in situations such as
yours, in some programs.  You just need to find a font where the
glyphs are named according to that standard, and a program that makes
use of that.
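
For illustration, here is roughly how such a guess works, assuming the
standard in question is the Adobe Glyph List specification.  This is a
simplified sketch: it handles only the "uniXXXX" and "uXXXX" name forms
and skips the lookup table of traditional glyph names.  The function
name and the example glyph names are my own.

```python
import re

_UNI = re.compile(r"uni((?:[0-9A-F]{4})+)")
_U = re.compile(r"u([0-9A-F]{4,6})")


def guess_unicode(glyph_name):
    """Guess the Unicode text for a glyph name, or return None."""
    # Anything after the first period is a variant suffix and is ignored.
    base = glyph_name.split(".", 1)[0]
    chars = []
    # Underscores join the components of a ligature.
    for part in base.split("_"):
        m = _UNI.fullmatch(part)
        if m:
            digits = m.group(1)
            codes = [int(digits[i:i + 4], 16)
                     for i in range(0, len(digits), 4)]
        else:
            m = _U.fullmatch(part)
            if not m:
                # Other names need the glyph list table, not included here.
                return None
            codes = [int(m.group(1), 16)]
        if any(c > 0x10FFFF for c in codes):
            return None
        chars.extend(chr(c) for c in codes)
    return "".join(chars)


print(guess_unicode("uni0431.loclSRB"))  # suffix dropped, U+0431 recovered
print(guess_unicode("glyph877"))         # None: the name carries no hint
```

So a localised glyph named along the lines of "uni0431.loclSRB" can be
mapped back to its character, while an arbitrarily named one cannot.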

One other thought: I have never seen an application for display of
graphical documents such as PDF produce styled text, but...in
principle -- the names of the fonts are stored in the document, as is
information such as scaling of the glyphs... Sounds messy, but
possible.  Not convinced it's *desirable*.  But the first thing is to
get the mapping right.

On Mon, Sep 16, 2013 at 10:16 PM, Alessandro Ceschini
<alescesc1986 at yahoo.it> wrote:
> Serbian Cyrillic requires a peculiar localisation because some glyphs are
> different from the standard. The PDF produced by XeLaTeX however must have
> some glitch because if I try copy/paste from it to another document the
> characters affected by Serbian localisation simply disappear :-\ ! This
> doesn't happen with PDF produced by LibreOffice 4.1, which now supports
> OpenType and therefore localised glyphs: characters are correctly copied,
> and even if the recipient program doesn't support OpenType, then standard
> glyphs are displayed.
> --
> Alessandro Ceschini
> --------------------------------------------------
> Subscriptions, Archive, and List information, etc.:
>   http://tug.org/mailman/listinfo/xetex
