[XeTeX] misplaced combining diacritical marks 2

Alexander Schultheiß aschulth at googlemail.com
Wed Sep 1 20:03:03 CEST 2010


Hey David,

> This is just not true.  The example I sent you works correctly on my
> machine; I deliberately included y-macron-acute which does not exist in
> precomposed form in Unicode.

I don't think they work correctly, in the sense that they are
positioned according to the anchor points. I've attached a pdf with
the results of the little investigation I did. For example, I've tried
various methods to get an m+macron+acute with dotbelow (starting from
line 2.1). It confirms my suspicion that xelatex doesn't care about
anchor points in complex glyphs. It "fakes" them. I assembled the
correct glyphs by hand, put them in a different font, and printed them
in lines 3.1 and 3.3 respectively.

I think I roughly figured out how xelatex deals with diacritics in
otf/ttf fonts (Latin scripts), and it's rather "crude": it dismisses
anchors almost completely. The procedure seems to be the following:

1.) It checks whether a sequence of commands matches a Unicode code
point (by comparing Unicode names, I guess). If so, it prints that
glyph (if the glyph is not present in the font, it prints a generic
blank); if the sequence doesn't match any Unicode code point, xelatex
tries to assemble the glyph via anchor points.

2.) Further diacritics are added depending on whether the new
"base glyph" (i.e. x+macron, where x is a variable) is a pre-composed
glyph or whether xelatex had to assemble it.

2.1) If the glyph was pre-composed, it looks for anchors in the
pre-composed glyph. If it finds none, it simply adds the combining
diacritical mark with its _fixed_ negative width, sometimes with
horrible results (see line 2.2 first glyph and line 2.3 all glyphs)
and sometimes with acceptable but incorrect results (see line 2 first
glyph).

2.2) If the glyph was not pre-composed but assembled by xelatex, it
tries to center the diacritic somehow, or place it relative to
something else. The logical way would be to center it based on the
width of the glyph (left and right bearings included). However, I
centered some glyphs by hand that way (3.2 & 3.4) and the results
differ from what xelatex produces.

3.) The situation is similar if we add yet another diacritic. If the
glyph to which we add the diacritic is pre-composed and has an anchor
point, the anchor determines where the stacked diacritic goes. If it is
pre-composed but has no anchor, the stacked diacritic gets centered at
a certain height (or at a default distance from the topmost
extremum?). However, if the "base glyph" was itself assembled, the
stacked diacritic seems to get centered not according to glyph width
but according to the diacritic below it.
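
To illustrate the pre-composed check in step 1, here is a minimal
Python sketch. It is not what xelatex does internally (that is exactly
what I don't know); it only uses Unicode NFC normalization as a
stand-in to show which part of a sequence has a pre-composed code
point and which combining marks are left over to be positioned. The
function name split_precomposed is made up for this example.

```python
import unicodedata
from typing import Tuple


def split_precomposed(sequence: str) -> Tuple[str, str]:
    """Compose a base+marks sequence with NFC and split the result.

    Returns (first character after composition, leftover combining marks).
    Assumes the input is a single base letter followed by combining marks.
    """
    composed = unicodedata.normalize("NFC", sequence)
    return composed[0], composed[1:]


# y + combining macron (U+0304) + combining acute (U+0301):
# y-macron exists as U+0233, y-macron-acute does not.
base, marks = split_precomposed("y\u0304\u0301")
print(f"U+{ord(base):04X}", [f"U+{ord(m):04X}" for m in marks])

# m + macron + acute: there is no pre-composed m-macron, so both marks remain.
base, marks = split_precomposed("m\u0304\u0301")
print(f"U+{ord(base):04X}", [f"U+{ord(m):04X}" for m in marks])
```

The leftover marks are the ones whose placement is in question in
steps 2 and 3.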

As for the \char" command, it seems to merely center all diacritics
according to overall glyph width. So in essence, the only time xelatex
cares about anchors is when it has to add another diacritic to a
pre-composed glyph. In all other cases it dismisses anchors and tries
to figure out the position by itself.

My question is: does anybody know how xelatex deals with diacritics
(i.e. knows the code, has a manual), and is my analysis correct? If so,
I can work around it by pre-composing all the glyphs I need. However,
if xelatex was designed to handle otf/ttf fonts, I wonder why it
doesn't honor anchors _if_ they are supplied by the fontmaker and
revert to the other mechanisms only when no anchor is encountered.
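
To make that last question concrete, here is a minimal Python sketch of
the behavior I would expect: use the anchors when both glyphs have
them, otherwise fall back to centering. Everything in it is
hypothetical (the Glyph class, its fields, the numbers in font units);
it is not xelatex code and not the API of any font library.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Glyph:
    advance: float                  # advance width of the glyph
    x_min: float                    # left edge of the ink
    x_max: float                    # right edge of the ink
    anchor: Optional[Point] = None  # attachment point, if the font has one


def mark_offset(base: Glyph, mark: Glyph, fallback_dy: float = 0.0) -> Point:
    """Offset of the mark's origin relative to the base glyph's origin."""
    if base.anchor is not None and mark.anchor is not None:
        # Preferred: make the two anchor points coincide.
        return (base.anchor[0] - mark.anchor[0],
                base.anchor[1] - mark.anchor[1])
    # Fallback: center the mark's ink over the base's advance width.
    mark_center = (mark.x_min + mark.x_max) / 2
    return (base.advance / 2 - mark_center, fallback_dy)


macron = Glyph(advance=0, x_min=-250, x_max=-50, anchor=(-150, 480))
m_with_anchor = Glyph(advance=600, x_min=40, x_max=560, anchor=(320, 500))
m_without_anchor = Glyph(advance=600, x_min=40, x_max=560)

print(mark_offset(m_with_anchor, macron))     # anchors are honored
print(mark_offset(m_without_anchor, macron))  # centered as a fallback
```

With the anchor present the macron follows the fontmaker's point; only
without it does the centering guess apply.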

Alex
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test4.pdf
Type: application/pdf
Size: 18901 bytes
Desc: not available
URL: <http://tug.org/pipermail/xetex/attachments/20100901/8957103e/attachment-0001.pdf>