[XeTeX] An (almost) complete cyrunicode.tex

Alexej Kryukov anagnost at yandex.ru
Fri Jun 29 19:37:12 CEST 2007


On Friday 29 June 2007 08:45, Evgenie Medvedev wrote:

> Besides, X2 encoding (which LaTeX manual says is
> intended to support all Cyrillic languages supported at all, and
> which I based this on) includes some things I simply could not locate
> in Unicode despite my best efforts

Indeed, X2 does have some characters not (yet) available in Unicode. I
don't know which is wrong here (Unicode or X2), but in any case I think all
your identifications are correct. I would just note that the so-called
"Cyrillic epsilon" looks very similar to REVERSED ZE (U+0510/U+0511) and
should probably be identified with it. A few other comments:

-- I think you should add descriptions for \CYRFITA and \cyrfita,
mapping them to U+0472/U+0473. These commands may not be available in
X2 because of their unification with CYROTLD, but they are listed e.g.
in OT2 (actually the only legacy TeX encoding which can correctly
handle the old Russian orthography);

-- Arthur correctly pointed out that Cyrillic Q and W have been
recently proposed. So I think you can already map them to the codepoints
specified in n3194.pdf, as these codepoints are very unlikely to
change before the final adoption of the proposal;

-- the correct mappings for angle brackets should probably be U+27E8 and
U+27E9 rather than U+3008 and U+3009. Using the latter two codepoints in
the context of European scripts is discouraged, because they are
intended for double-width CJK punctuation characters. There is also
a similar pair at U+2329/U+232A, but, again, it has canonical
decompositions to U+3008 and U+3009, which makes this pair hardly usable
in non-CJK contexts.
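
The codepoints mentioned in these comments can be checked mechanically. What
follows is only a rough sketch in Python, assuming nothing beyond the standard
unicodedata module; the names CANDIDATES and describe are made up for this
illustration and have nothing to do with cyrunicode.tex itself. It prints the
character name, the East Asian width class and the decomposition mapping for
each codepoint, which should show why the U+27E8/U+27E9 pair is the safer
choice.

```python
import unicodedata

# Codepoints discussed above (labels are illustrative only).
CANDIDATES = {
    "reversed ze": (0x0510, 0x0511),
    "fita": (0x0472, 0x0473),
    "angle brackets, suggested": (0x27E8, 0x27E9),
    "angle brackets, CJK": (0x3008, 0x3009),
    "angle brackets, U+2329 pair": (0x2329, 0x232A),
}


def describe(cp):
    """Return (U+XXXX, name, East Asian width, decomposition) for a codepoint."""
    ch = chr(cp)
    return (
        f"U+{cp:04X}",
        unicodedata.name(ch, "<no name in this Unicode database>"),
        unicodedata.east_asian_width(ch),   # "W" means wide (CJK-style)
        unicodedata.decomposition(ch),      # empty string if none
    )


for label, pair in CANDIDATES.items():
    for cp in pair:
        print(label, *describe(cp), sep=" | ")

# Normalization silently rewrites the U+2329 pair to the CJK pair,
# so text using it does not survive NFC unchanged.
print(unicodedata.normalize("NFC", "\u2329x\u232A") == "\u3008x\u3009")
```

If the last line prints True, any input that passes through a normalizing
tool will end up with the CJK brackets anyway, whatever was typed.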

> 	There was also \textcompwordmark (no idea what it is) and the extra
> accents X2 defines, which I'm not knowledgeable enough to tackle
> properly.

\textcompwordmark is an invisible character used to break ligatures.
It is normally identified with ZERO WIDTH NON-JOINER (U+200C), but in
any case you need not worry about this character, as there is nothing
specifically Cyrillic about it.
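
To illustrate what such a character looks like on the Unicode side, here is a
small Python sketch (again standard library only; the helper break_ligature
and the sample word are hypothetical). It only shows that U+200C is an
invisible format character sitting between two letters; whether a ligature is
actually suppressed depends on the font and the typesetting engine.

```python
import unicodedata

ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER


def break_ligature(left, right):
    """Join two strings with a ZWNJ between them (hypothetical helper)."""
    return left + ZWNJ + right


# German "Auflage": the f and l belong to different morphemes,
# a typical place where a ligature is unwanted.
word = break_ligature("Auf", "lage")

print(unicodedata.name(ZWNJ))      # official character name
print(unicodedata.category(ZWNJ))  # general category of the character
print(len(word), [f"U+{ord(c):04X}" for c in word])
```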

-- 
Regards,
Alexey Kryukov <anagnost {at} yandex {dot} ru>

Moscow State University
Historical Faculty

