[XeTeX] xunicode questions

David J. Perry hospes.primus at verizon.net
Mon Jun 15 04:05:30 CEST 2009


Hi all,

As someone with an extensive background in Unicode but relatively new to 
(Xe)TeX, I have been working to understand exactly what xunicode does.  I 
cannot find any documentation except the very short readme file on CTAN, and 
for a non-programmer the source code of xunicode.sty is rough going.  (I'd 
be happy to help write some docs once I get it figured out!)

I think the following is true, based on tests I have done, but please let me 
know if it's not:

xunicode takes characters entered through the traditional TeX keystrokes 
(\'e for e-acute, etc.) and places the precomposed Unicode character (U+00E9 
for e-acute) in the output file.  (In other words, it doesn't really matter 
whether you enter e-acute directly, if you have a keyboard that supports it, 
or using the TeX keystrokes; you still get the Unicode precomposed 
character.)   If the combination that you type using traditional TeX methods 
does not exist in Unicode in precomposed form (e.g., \v y, since y-caron is 
not a precomposed combination), xunicode inserts the combining mark after 
the base letter.  I also notice that if I enter 'e' followed by \char"0301, 
the combining mark remains (i.e., xunicode does not replace this sequence 
with the precomposed version).
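
A minimal test file for the cases above might look like the following.  It 
is only a sketch: it assumes the file is saved as UTF-8, that xunicode is 
loaded explicitly after fontspec, and that the font in use contains a 
combining caron.  The comments state what I believe happens, not documented 
behavior.

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage{fontspec}
\usepackage{xunicode}
\begin{document}
\'e         % TeX keystrokes; should come out as precomposed U+00E9
é           % U+00E9 typed directly; should look the same as the line above
\v{y}       % no precomposed y-caron exists, so y followed by U+030C
e\char"0301 % explicit combining acute; stays a two-character sequence
\end{document}
```

Whether the output really contains U+00E9 can be checked by copying the text 
out of the PDF, although that also depends on the viewer.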

I have not been able to access IPA characters successfully using the tipa 
keystrokes.  (I am familiar with tipa.sty, which I assume one would not load 
in the preamble because one does not want to use the older, non-Unicode IPA 
fonts, but perhaps that's wrong.)  What am I missing?  I think I need to put 
the keystrokes inside \tipatext{ } but that doesn't work.
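
For comparison, the tipa manual spells its text command \textipa.  A minimal 
test along those lines is sketched below.  It rests on two assumptions: that 
xunicode defines its own \textipa when tipa.sty is not loaded, and that a 
font with IPA coverage is installed (Charis SIL is only an example name).

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage{fontspec}
\usepackage{xunicode}     % tipa.sty is deliberately not loaded
\setmainfont{Charis SIL}  % assumption: any installed font with IPA glyphs
\begin{document}
% tipa shortcuts: @ = schwa, " = primary stress, E = open e, I = small capital i
\textipa{f@"nEtIks}
\end{document}
```

If \textipa turns out to be undefined in this setup, the first assumption is 
wrong and the shortcuts must be reached some other way.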

I can see how to enter circled letters and numbers (not useful to me, but at 
least I got that part).  A large number of other Unicode characters (not 
combining accents) seem to be accessed by typing commands based on their 
names (\textwynn, \NG); is that right?
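
The kind of test I have in mind is sketched here.  The font name is an 
assumption; any installed font that contains wynn (U+01BF) and capital eng 
(U+014A) would do.

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage{fontspec}
\usepackage{xunicode}
\setmainfont{Junicode}  % assumption: a font containing U+01BF and U+014A
\begin{document}
% The empty braces end the command name so the following space is kept.
\textwynn{} and \NG{}
\end{document}
```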

Some Unicode characters are in xunicode.sty but are commented out; why? 
They are as valid as any other Unicode char, AFAIK.

Finally, I noticed that xunicode provides access to some characters in the 
Private Use Area, specifically the old style numbers.  This surprised me, 
since XeTeX is one of the few applications that allow users to access 
alternate number shapes via OpenType or AAT features.  Users therefore have 
no need to use these PUA values, and using them is not good practice. 
(Adobe, for instance, is removing the PUA assignments from old style numerals 
and small caps as it releases new versions of its fonts.)  Perhaps such 
PUA items should be removed from future versions of xunicode.
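
As an illustration of the feature-based route, old style numbers can be 
requested through fontspec without touching the PUA.  This sketch assumes 
the font in use has an old style numerals feature (onum in OpenType); a font 
without it will simply show lining figures.

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage{fontspec}
\begin{document}
Lining: 0123456789

% The feature is switched on only inside this group.
{\addfontfeature{Numbers=OldStyle}Old style: 0123456789}
\end{document}
```

The underlying characters remain the ordinary digits U+0030 to U+0039, so 
searching and copying text still work.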

Thanks - David 
