[XeTeX] difficulty with ancient greek diacriticals in XeTeX with Gentium Plus

Richard Cobbe cobbe at ccs.neu.edu
Sat Jul 20 19:01:55 CEST 2013

On Thu, Jul 18, 2013 at 09:56:37AM -0400, Richard Cobbe wrote:
> This may be slightly OT, since I think it's a font problem rather than a
> XeTeX problem, but I'm hoping someone here may be able to give me a few
> pointers.  If not, please forgive the noise.
> I recently switched from using Gentium
> (http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=gentium) to
> Gentium Plus to typeset some classical Greek text, and I'm now getting
> different results when I use combining diacriticals in the XeTeX input.
> I've attached a very small XeTeX example that demonstrates the problem,
> along with the output I get.  (The .tex file is in UTF-8.)  The first line
> in the document body, as indicated, uses a precomposed Unicode character;
> the second line uses equivalent combining diacriticals.  At least, I
> thought they were supposed to be equivalent; as you can see from the PDF,
> the output is different.  In the PDF, the "precomposed" line is the desired
> output -- the diacriticals are supposed to be stacked, not superimposed.
> This same input file works fine (stacked rather than superimposed
> diacriticals) if I switch back to Gentium, which suggests that the
> difference is in the fonts, rather than in XeTeX.
> It's much more convenient for me to use the combining diacriticals, for
> various reasons that aren't all that interesting here.  Is there something
> in XeTeX/fontspec I can do to make that input work again, or is this a font
> problem?

Thanks very much to all who responded!  Folks have been very helpful, and
I'm always interested to learn more about modern font technology.

To summarize: it looks like there are some problems with Gentium Plus, and
I'll send a list of all of the issues I've noticed to Lorna Evans off-list.

I can either wait for a release of the font that fixes these problems, or I
can switch to using precomposed characters with XeTeX.  And it looks like I
can do this either with the TECkit map that Nathan Sidoli provided, or by
changing the program that I'm using to generate the XeTeX input.

By the way, Georg, that's why I prefer to use the combining characters as
input.  I'm not writing the XeTeX by hand, I'm generating it with a program
that does some processing on the Greek before producing the XeTeX source as
output.  Because of the particular computation I'm doing, the program needs
an internal representation of Greek text that's a lot closer to "letter
plus combining diacriticals" than it is to the precomposed Unicode
representation.  I was hoping to avoid writing the code to convert the
decomposed text to composed text, because it's just a big lookup table
that's really annoying and error-prone to type in.  Happily, the TECkit map
that Nathan provided is another alternative.
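
For what it's worth, the lookup table need not be typed in by hand; it can
be derived from the Unicode character database.  The following is only a
rough sketch in Python (not the program described above, and not Nathan's
map).  The function name build_composition_table and the choice of code
point ranges (the Greek and Greek Extended blocks) are assumptions made for
illustration.

```python
import unicodedata

def build_composition_table(ranges=((0x0370, 0x03FF), (0x1F00, 0x1FFF))):
    """Map fully decomposed sequences to precomposed Greek characters.

    The default ranges (Greek and Greek Extended blocks) are an assumption;
    adjust them to whatever the input actually contains.
    """
    table = {}
    for start, end in ranges:
        for cp in range(start, end + 1):
            ch = chr(cp)
            decomposed = unicodedata.normalize("NFD", ch)
            # Skip characters that have no decomposition, and characters
            # that are not stable under NFC (for example the oxia letters
            # in Greek Extended, which normalize to the tonos letters).
            if decomposed == ch or unicodedata.normalize("NFC", ch) != ch:
                continue
            table[decomposed] = ch
    return table

TABLE = build_composition_table()

# alpha + combining comma above (psili) + combining acute accent
key = "\u03b1\u0313\u0301"
print(TABLE[key] == "\u1f04")
```

The keys are in canonical (NFD) order, so the table only matches input
whose diacriticals are emitted in that same order.  The resulting pairs
could be dumped to a file and read by a program in any language.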

(I could use the normalization routines in ICU, but in the past I've had
problems using the ICU in MacPorts with Haskell on a Mac due to unfortunate
linker problems.  There have been a couple of new releases of the Haskell
Platform since, though, so it may be worth another try.)
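
To make the normalization idea concrete, here is a minimal sketch using
Python's standard unicodedata module rather than ICU (so it says nothing
about the Haskell/ICU linker problem mentioned above).  The helper name
compose is hypothetical.  It shows that NFC turns "letter plus combining
diacriticals" into the precomposed character, and that the order of the
combining marks matters.

```python
import unicodedata

def compose(text):
    """Return text in Unicode Normalization Form C (canonical composition)."""
    return unicodedata.normalize("NFC", text)

# alpha + combining comma above (psili) + combining acute accent
decomposed = "\u03b1\u0313\u0301"
# U+1F04 GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA
precomposed = "\u1f04"
print(compose(decomposed) == precomposed)

# Both marks have the same combining class, so normalization does not
# reorder them.  With the accent first, the alpha composes with the acute
# only, and the breathing mark is left as a separate combining character.
swapped = "\u03b1\u0301\u0313"
print(compose(swapped) == "\u03ac\u0313")
```

So a generator that emits the breathing mark before the accent, with any
iota subscript last, should get fully precomposed output from NFC.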

Thanks again to all who responded!

