[XeTeX] Case changing for Greek

BPJ bpj at melroch.se
Mon May 18 12:21:22 CEST 2015


On 2015-05-17 12:10, Bruno Le Floch wrote:
> On 5/7/15, Apostolos Syropoulos <asyropoulos at yahoo.com> wrote:
>> That is correct, Jonathan. In fact the general rule is that a σ at the end
>> of a word always becomes a ς. The only exception is when the final vowel is
>> cut due to a grammatical phenomenon, as in the following:
>>
>>   σώσ' τα (save them)
>
> This case seems really hard to detect.  The Unicode definition (that a
> sigma is final if the last non-Case_Ignorable character is Cased and
> the next is not) wrongly considers the second sigma in your example as
> a final sigma.
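
The effect is easy to reproduce. As a minimal sketch, assuming
CPython 3.3 or later (whose built-in str.lower() applies, as far
as I know, the Final_Sigma context rule quoted above), lowercasing
the all-caps form of the example turns the second sigma into ς.
The variable names are only illustrative.

```python
# All-caps form of the example, ΣΏΣ' ΤΑ, written with escapes so that
# the code points are unambiguous (precomposed Ώ is U+038F).
shouted_ascii = "\u03a3\u038f\u03a3' \u03a4\u0391"       # U+0027 apostrophe
shouted_curly = "\u03a3\u038f\u03a3\u2019 \u03a4\u0391"  # U+2019 apostrophe

# The apostrophe is Case_Ignorable and the space after it is not Cased,
# so the rule treats the second sigma as word-final.
lowered_ascii = shouted_ascii.lower()
lowered_curly = shouted_curly.lower()

print(lowered_ascii)  # σώς' τα  (wanted: σώσ' τα)
print(lowered_curly)  # σώς’ τα  (wanted: σώσ’ τα)
```

Either apostrophe gives the same unwanted ς.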

Moreover, Unicode has decided to keep the apostrophe and the
English-style closing single quote unified (both are U+2019), so
there is no way to define one set of punctuation marks that
should trigger final sigma and another that should not: the
apostrophe and the closing single quote would have to fall in
different sets.  Luckily Greek uses guillemets, but for Greek
embedded in English text all bets are off!  (Swedish fortunately
has a style with »…» as outer quotes and ”…” as inner quotes.
It's seriously old-fashioned, but I use it for text with embedded
Greek whenever I can.  That was not because I was aware of this
issue, although it can in principle arise in Ancient Greek
verse, but because single quotes don't mix well with
breathings!)  I wonder how often, statistically speaking, a
closing quote is not followed by another punctuation mark.

>
> Perhaps we need an explicit way to say that a given sigma is final or not.

As I see it there are three possible solutions:

*   An 'uppercase final sigma' which looks identical to the 
ordinary uppercase sigma.

*   Disunifying apostrophe and single quote.

*   Putting some suitable non-spacing character between a 
non-final sigma and an apostrophe, to preserve the distinction in 
case roundtripping.  This is perhaps the most realistic, as anyone 
can just start using it right away.
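
A rough sketch of the third option, under loud assumptions: the
marker is U+2060 WORD JOINER, which is my arbitrary pick and not
a character anyone in this thread has proposed, and both helper
functions are hypothetical.  Note that a format character like
this is itself Case_Ignorable, so a stock lowercasing routine
will look straight past it; the lowercasing side has to know
about the marker too.

```python
import re

MARKER = "\u2060"        # WORD JOINER: assumed choice for this sketch only
APOSTROPHES = "'\u2019"  # U+0027 and U+2019

def mark_elided_sigmas(text):
    """Insert MARKER between a non-final sigma (σ) and a following apostrophe."""
    return re.sub("\u03c3(?=[" + APOSTROPHES + "])", "\u03c3" + MARKER, text)

def lower_respecting_marker(text):
    """Lowercase, keeping any capital sigma directly followed by MARKER non-final."""
    # Pre-lowering the marked sigmas to σ means str.lower() leaves them alone.
    return text.replace("\u03a3" + MARKER, "\u03c3" + MARKER).lower()

original = "\u03c3\u03ce\u03c3' \u03c4\u03b1"   # σώσ' τα
upper = mark_elided_sigmas(original).upper()     # ΣΏΣ + MARKER + ' ΤΑ
roundtrip = lower_respecting_marker(upper)       # σώσ + MARKER + ' τα
```

The marker survives the round trip, so the result equals the
original only after the marker is stripped again.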

The problem with all three is that most people won't do it, since 
for most people what looks the same is the same!

As for the Unicode definition of a final sigma, it is IMO
deficient: clearly the last preceding character which is not a
combining mark must be *a Greek letter*.  'Finalizing' a sigma
after non-Greek letters just doesn't make sense, and quite
obviously a sigma before a hyphen or any of the Greek number
marks should not be final.  Clearly they did not consult any
classicists or comparative philologists when making that
definition! :-)  As so often, no algorithm is going to make
proofreading unnecessary, which is good for me, professionally
speaking! :-)
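
To make the stricter test concrete, here is a minimal sketch of
it.  Everything in it is an assumption for illustration: the
function names, the use of Unicode character names to decide
what counts as a Greek letter, and the set of blocking
characters (ASCII hyphen, U+2010, and the numeral signs U+0374
and U+0375).

```python
import unicodedata

# Characters after which a sigma should stay non-final (assumed set).
NON_FINAL_FOLLOWERS = {"-", "\u2010", "\u0374", "\u0375"}

def is_greek_letter(ch):
    """True for upper- or lowercase letters whose Unicode name contains GREEK."""
    return (unicodedata.category(ch) in ("Lu", "Ll")
            and "GREEK" in unicodedata.name(ch, ""))

def is_combining(ch):
    return unicodedata.category(ch).startswith("M")

def is_final_sigma_strict(text, i):
    """Decide whether the sigma at index i should be written as final."""
    # Last preceding character that is not a combining mark must be Greek.
    j = i - 1
    while j >= 0 and is_combining(text[j]):
        j -= 1
    if j < 0 or not is_greek_letter(text[j]):
        return False
    # Next character that is not a combining mark must not continue the word.
    k = i + 1
    while k < len(text) and is_combining(text[k]):
        k += 1
    if k == len(text):
        return True
    return not (is_greek_letter(text[k]) or text[k] in NON_FINAL_FOLLOWERS)
```

This still reports the second sigma of ΣΏΣ' ΤΑ as final, so it
does nothing for the apostrophe problem above.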

/bpj


