[XeTeX] combining characters in isolation

David Perry hospes.primus at verizon.net
Sun Jul 11 14:57:06 CEST 2010


Mike,

> character.  The characters happen to be in the Bengali block of Unicode, 
> but I suppose it's the same with any combining character.
Not necessarily.  I don't know whether you are on Windows, Linux, or 
Mac, but I can tell you that on Windows, Uniscribe allows one to insert 
combining characters used (mainly) with western languages, i.e. those in 
the U+0300-U+036F block (Combining Diacritical Marks), without forcing 
a dotted circle to appear.  (But some fonts are programmed to add a 
dotted circle.)  With Arabic and Indic languages, however, Uniscribe is 
quite strict about putting in the dotted circle.

> I've tried putting a ZWNJ, ZWJ, ZWSP, and CGJx; none of them work.  I 
> either get a rectangular box or a dashed circle to the left (or in one 
> case, to the right) of the combining character.
The Unicode Standard says (2.11) that one should use U+00A0 NO-BREAK 
SPACE as a base to display combining marks.  So you might try that.  I 
suspect, however, that unless the font maker has specifically included 
this option, you'll end up with the dotted circle, at least on Windows.

But: since you posted this to the XeTeX list, I'll assume that's what 
you're using (so my comments about Uniscribe won't apply since XeTeX 
uses the ICU renderer).  I just did a quick test with a file processed 
with XeLaTeX on Windows, and the following lines worked (with no dotted 
circles):

{\fontspec{Arial Unicode MS} \char"09C2  xx \char"09C4}

{\fontspec{Arial Unicode MS} \char"00A0\char"09C2  xx  \char"00A0\char"09C4}

(I don't know Bengali and have no idea which characters interact with 
which, so I threw in the x's just to get some separation between the 
Bengali combining marks.  U+09C2 and U+09C4 are Bengali combining marks 
that I chose at random since I don't know exactly what you need.  I used 
Arial Unicode MS only because I was sure it had the Bengali characters.)
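
For reference, the two test lines can be wrapped in a minimal XeLaTeX 
file along the following lines.  This is only a sketch: the article 
class and the bare fontspec preamble are assumptions, not necessarily 
the file used for the test above, and it requires Arial Unicode MS (or 
another font with Bengali coverage substituted for it) to be installed.

```latex
% Minimal test file; compile with xelatex.
% Assumption: the fontspec package and Arial Unicode MS are installed.
\documentclass{article}
\usepackage{fontspec}
\begin{document}

% Combining marks with no explicit base character
{\fontspec{Arial Unicode MS} \char"09C2 xx \char"09C4}

% Combining marks placed on U+00A0 NO-BREAK SPACE as a base
{\fontspec{Arial Unicode MS} \char"00A0\char"09C2 xx \char"00A0\char"09C4}

\end{document}
```

Whether a dotted circle appears may still depend on the font and on 
the XeTeX version, so the output is worth checking with the font you 
actually intend to use.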

HTH - David


