[XeTeX] On combining diacritics again

Jonathan Kew jonathan_kew at sil.org
Fri Jan 20 13:02:46 CET 2006

On 20 Jan 2006, at 11:33 am, Nicola Vitacolonna wrote:

>> It's true, to some extent, that it depends on the text rendering
>> engine, in the sense that the Cocoa TextView (as used in TextEdit,
>> TeXShop's editor window, etc) will stack diacritics automatically,
>> without any specific help from the font. The positioning of the
>> diacritics may not be ideal, however; it's often more widely spaced
>> than a type designer would prefer.
>> XeTeX doesn't do this; it *only* performs diacritic positioning (etc)
>> in accordance with font tables. Gentium doesn't currently include
>> OpenType tables, and so diacritics won't stack; Charis SIL and Doulos
>> SIL do have OpenType positioning tables, and so diacritics will stack
>> properly.
>> JK
> Thanks, your answers are very clear. But my question was a bit
> different, and I'll try to be more precise: is there any difference
> between U+0117 (Latin Small Letter e with Dot Above) and the pair
> U+0065 (Latin Small Letter e) plus U+0307 (Combining Dot Above)?
> Is stacking a diacritic over the former handled in the same way as
> stacking a (second) diacritic over the latter? Does it depend on
> the font?

In principle, they should behave exactly the same: U+0117 is  
"canonically equivalent" to <U+0065, U+0307>. And you should be able  
to add a diacritic to either of these, and get the same result.
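
A minimal sketch of what "canonically equivalent" means at the character
level, assuming Python's standard unicodedata module. The helper name
canonically_equivalent is made up for illustration, and the check says
nothing about how a particular font will render either sequence.

```python
import unicodedata

def canonically_equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: two strings are canonically equivalent
    if they have the same canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

precomposed = "\u0117"   # LATIN SMALL LETTER E WITH DOT ABOVE
decomposed = "e\u0307"   # e + COMBINING DOT ABOVE
acute = "\u0301"         # COMBINING ACUTE ACCENT, used as the second diacritic

# The code point sequences differ, but they are canonically equivalent.
print(precomposed == decomposed)                        # False
print(canonically_equivalent(precomposed, decomposed))  # True

# Adding the same diacritic to either form keeps them equivalent.
print(canonically_equivalent(precomposed + acute, decomposed + acute))  # True
```

Whether the two forms also *look* the same is a separate question, and
that part is up to the font and the layout engine.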

In practice, many fonts may not support the two versions equally
well, or in the same way. It depends on how thoroughly the font
developer has built the OpenType tables. At this point, there are not
many Latin-script fonts available that have full support for
arbitrary sequences of diacritics.

And of course, whether your ė is precomposed or not, a second  
diacritic will be properly positioned above the dot *only* if the  
font provides proper support for this. (Or if the text layout engine,  
like Cocoa, attempts some default positioning for you.)

Does this help? I think the basic answer is "yes, it does depend very  
much on the font" -- and few fonts yet do the right thing.

Users should pressure font developers to improve their Unicode  
support in this area. Many vendors think that simply providing  
precomposed forms for a common subset of accented letters is  
sufficient, but this makes their fonts inadequate for those who  
happen to need a combination they didn't anticipate, whether it's for  
a language that wasn't on their list, or some technical usage that  
adds diacritics to arbitrary base letters.
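
To make the point about unanticipated combinations concrete, here is a
rough sketch, again assuming Python's unicodedata module. The helper
has_precomposed_form is a hypothetical name, and the test is only a
heuristic: it asks whether NFC normalization collapses a base letter
plus combining marks into a single code point. When it does not, a font
can only get the combination right through mark positioning.

```python
import unicodedata

def has_precomposed_form(base: str, marks: str) -> bool:
    """Hypothetical heuristic: True if NFC composes base + marks
    into a single precomposed code point."""
    return len(unicodedata.normalize("NFC", base + marks)) == 1

DOT_ABOVE = "\u0307"  # COMBINING DOT ABOVE
ACUTE = "\u0301"      # COMBINING ACUTE ACCENT

# e + dot above composes to U+0117, so a precomposed glyph can serve it.
print(has_precomposed_form("e", DOT_ABOVE))          # True

# e + dot above + acute has no single precomposed code point, so it
# depends entirely on the font's positioning support.
print(has_precomposed_form("e", DOT_ABOVE + ACUTE))  # False
```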

