[XeTeX] An (almost) complete cyrunicode.tex

Evgenie Medvedev medvedev at project7.ru
Sat Jun 30 02:54:50 CEST 2007


Nikola Lecic wrote:

> Despite your own explanation of
> the origin of Cyrillic (which is well known to all us Cyrillic users),
> you argue that Russian+Slavic (isn't Russian Slavic?) is the most
> accurate division because (modern) Russian character set happened to be
> the first in Unicode.

	...and iso-8859-5, and cp1251, and cp866 before that, and koi8-r even
before that... So what?
	I happen to have heard a horror story about the current use of 8-bit
codepages for Azerbaijani, from a friend who had to make Oracle talk
to a legacy 8-bit application. Neither the Cyrillic codepages commonly
used before nor the current Latin ones define a \cyrschwa, so people
replace it with something else... and there are several something
elses, so you never know which one was used until you trip over it. :)
	Based on that, here is my take on how other Cyrillic users would check
whether the set is complete and correct for their language. First they
would see if the continuous stretch is present and working, because
every 8-bit codepage they have ever used for their language includes
it. Then they would look for 'problem' letters: those which are present
in their particular language but lie outside the continuous stretch,
and which might have been missing from an 8-bit codepage they commonly
see. Then it's on to letters which aren't in any common 8-bit Cyrillic
codepage.
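
	The three-step check above can be sketched roughly in Python. This is
only an illustration, not part of cyrunicode.tex. It assumes that the
"continuous stretch" means U+0410..U+044F, and that Python's koi8_r,
cp866, cp1251 and iso8859_5 codecs are fair stand-ins for the "common"
8-bit Cyrillic codepages. The names STRETCH, CODEPAGES, encodable and
triage are made up for this example.

```python
# Assumption: the "continuous stretch" is U+0410..U+044F (А..я).
STRETCH = range(0x0410, 0x0450)

# Assumption: these Python codecs stand in for the common 8-bit
# Cyrillic codepages mentioned in the thread.
CODEPAGES = ("koi8_r", "cp866", "cp1251", "iso8859_5")


def encodable(ch, codec):
    """Return True if the single character ch exists in the codec."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def triage(alphabet):
    """Sort letters into the three groups a user would check in order."""
    tiers = {"stretch": [], "problem": [], "unicode_only": []}
    for ch in alphabet:
        if ord(ch) in STRETCH:
            # Step 1: letters inside the continuous stretch.
            tiers["stretch"].append(ch)
        elif any(encodable(ch, cp) for cp in CODEPAGES):
            # Step 2: outside the stretch, but in some 8-bit codepage.
            tiers["problem"].append(ch)
        else:
            # Step 3: in no common 8-bit Cyrillic codepage at all.
            tiers["unicode_only"].append(ch)
    return tiers


# One sample letter each from Russian, Ukrainian and Azerbaijani,
# plus a plain "а" from the stretch.
result = triage("аёїә")
print(result)
```

	A user of a given language would feed in their own alphabet and look
at the last two groups first, since those are the letters most likely
to be missing or wrong.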
	That the continuous stretch happens to include most Russian letters is
irrelevant really -- it doesn't even include them all! Where's "Ё"?
Outside. And it was missing or misplaced in early versions of cp866, by
the way. It's not about linguistics, it's about deficiencies in
software, and a software-centric approach to listing definitions is the
most sensible one, IMHO.
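
	To make the point about "Ё" concrete, here is a small Python check. It
assumes again that the continuous stretch is U+0410..U+044F. The names
in_stretch and outside are hypothetical, used only for this sketch.

```python
# Assumption: the "continuous stretch" is U+0410..U+044F (А..я).
STRETCH_FIRST, STRETCH_LAST = 0x0410, 0x044F


def in_stretch(ch):
    """Return True if ch lies inside the assumed continuous stretch."""
    return STRETCH_FIRST <= ord(ch) <= STRETCH_LAST


# The 33 uppercase letters of the modern Russian alphabet.
upper = "АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"
russian = upper + upper.lower()

# Collect every Russian letter that falls outside the stretch.
outside = [ch for ch in russian if not in_stretch(ch)]

for ch in outside:
    print(ch, "U+%04X" % ord(ch))
```

	Only the two forms of "Ё" come out of this, at U+0401 and U+0451, on
either side of the stretch.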

	Here's the new version, and we've already spent more time arguing
than we should have. :)

-- 
Evgenie Medvedev
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cyrunicode.tex
Url: http://tug.org/pipermail/xetex/attachments/20070630/0f07f853/attachment-0001.pl 

