[XeTeX] Case changing for Greek
BPJ
bpj at melroch.se
Mon May 18 12:21:22 CEST 2015
On 2015-05-17 12:10, Bruno Le Floch wrote:
> On 5/7/15, Apostolos Syropoulos <asyropoulos at yahoo.com> wrote:
>> That is correct, Jonathan. In fact the general rule is that a σ at the end of
>> a word always becomes a ς. The only exception is when the final vowel is
>> cut due to a grammatical phenomenon, as in the following:
>>
>> σώσ' τα (save them)
>
> This case seems really hard to detect. The Unicode definition (that a
> sigma is final if the last non-Case_Ignorable character is Cased and
> the next is not) wrongly considers the second sigma in your example as
> a final sigma.
Moreover Unicode has decided to keep apostrophe and English-style
closing single quote unified, so there is no way to set up a set
of punctuation marks which should or should not trigger final
sigma, since apostrophe and closing single quote would fall in
different sets. Luckily Greek uses guillemets, but for Greek
embedded in English text all bets are off! (Swedish fortunately
knows a style with »…» as outer quotes and ”…” as inner quotes.
It's seriously old-fashioned but I use it for text with embedded
Greek whenever I can, not because I was aware of this issue --
although it can in principle happen in Ancient Greek verse -- but
because single quotes don't mix well with breathings!) I wonder
how often a closing quote is not followed by another punctuation
mark, statistically speaking.
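
For anyone who wants to see the effect, here is a small check in Python. It assumes CPython 3.3 or later, whose str.lower() applies the Unicode Final_Sigma condition; other implementations may differ. The uppercase test strings are my own renderings of the σώσ' τα example (written without the accent), not something taken from the thread.

```python
# Assumption: CPython 3.3+, where str.lower() applies Unicode's Final_Sigma rule.
elided_ascii = "ΣΩΣ' ΤΑ"        # U+0027 APOSTROPHE after the second sigma
elided_curly = "ΣΩΣ\u2019 ΤΑ"   # U+2019, used both as apostrophe and closing quote

lowered_ascii = elided_ascii.lower()
lowered_curly = elided_curly.lower()

# Both marks are skipped as Case_Ignorable, so in both strings the second
# sigma is treated as word-final, although the elided form wants a medial one.
print(lowered_ascii)
print(lowered_curly)
print(lowered_ascii[2] == "\u03c2")  # compare with U+03C2 FINAL SIGMA
```

Since the two code points behave the same way here, a lowercasing routine has nothing to distinguish an elision from a closing quote.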
>
> Perhaps we need an explicit way to say that a given sigma is final or not.
As I see it, there are three possible solutions:
* An 'uppercase final sigma' which looks identical to the
ordinary uppercase sigma.
* Disunifying apostrophe and single quote.
* Putting some suitable non-spacing character between a
non-final sigma and an apostrophe, to preserve the distinction in
case roundtripping. This is perhaps the most realistic, as anyone
can just start using it right away.
The problem with all three is that most people won't do it, since
for most people what looks the same is the same!
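
A minimal sketch of the third option, in Python. The choice of marker is purely an assumption for illustration: I use U+034F COMBINING GRAPHEME JOINER, but no particular character has been agreed on, and the helper name is hypothetical. The idea is only that a sigma followed directly by the marker is never given the final form.

```python
# Hypothetical marker: U+034F COMBINING GRAPHEME JOINER. No specific character
# has been agreed on; any invisible one fixed by convention would do.
MARKER = "\u034f"
FINAL_SIGMA = "\u03c2"   # ς
MEDIAL_SIGMA = "\u03c3"  # σ


def lower_keeping_marked_sigma(text):
    """Lowercase text, forcing a medial sigma wherever the marker follows."""
    lowered = text.lower()  # may turn a marked sigma into a final sigma
    return lowered.replace(FINAL_SIGMA + MARKER, MEDIAL_SIGMA + MARKER)


upper = "ΣΩΣ" + MARKER + "' ΤΑ"
lower = lower_keeping_marked_sigma(upper)

print(lower[2] == MEDIAL_SIGMA)  # the marked sigma stays medial
print(lower.upper() == upper)    # the marker survives the round trip
```

The marker is left in the text on purpose, so that a later uppercase/lowercase round trip still knows which sigma was non-final.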
As for the Unicode definition of a final sigma it is IMO
deficient: clearly the last preceding character which is not a
combining mark must be *a Greek letter*. 'Finalizing' a sigma
after non-Greek letters just doesn't make sense, and quite
obviously a sigma before any of the Greek number marks or a hyphen
should not be final. Clearly they have not consulted any
classicists or comparative philologists when making that
definition! :-) As so often, no algorithm is going to make
proofreading unnecessary (which is good for me, professionally
speaking! :-)
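
To make the stricter rule concrete, here is a rough Python sketch. It is not the Unicode algorithm and not anything XeTeX does; the function names, the test for "Greek letter" (via the character name) and the exact set of blocking marks are my own assumptions. The apostrophe case is deliberately left out, since it cannot be decided from the characters alone.

```python
import unicodedata

# Assumed set of marks after which a sigma should stay medial: hyphens and
# the Greek numeral signs (U+02B9 is what U+0374 normalizes to).
BLOCKERS = {"-", "\u2010", "\u2011", "\u0374", "\u0375", "\u02b9"}


def is_greek_letter(ch):
    """Rough test: a letter whose Unicode name is 'GREEK ... LETTER ...'."""
    name = unicodedata.name(ch, "")
    return (unicodedata.category(ch).startswith("L")
            and name.startswith("GREEK") and " LETTER " in name)


def is_mark(ch):
    return unicodedata.category(ch).startswith("M")


def is_final_sigma(text, i):
    """Stricter rule: a Greek letter before, no Greek letter or blocker after."""
    j = i - 1
    while j >= 0 and is_mark(text[j]):   # skip combining marks backwards
        j -= 1
    if j < 0 or not is_greek_letter(text[j]):
        return False
    k = i + 1
    while k < len(text) and is_mark(text[k]):   # skip combining marks forwards
        k += 1
    if k == len(text):
        return True
    return not is_greek_letter(text[k]) and text[k] not in BLOCKERS


def lower_greek(text):
    """Lowercase text, choosing the sigma form with is_final_sigma."""
    out = []
    for i, ch in enumerate(text):
        if ch == "\u03a3":  # GREEK CAPITAL LETTER SIGMA
            out.append("\u03c2" if is_final_sigma(text, i) else "\u03c3")
        else:
            out.append(ch.lower())
    return "".join(out)


print(lower_greek("ΟΔΟΣ"))              # ordinary word, final sigma
print(lower_greek("XΣ"))                # Latin X before: stays medial
print(lower_greek("ΣΩΣ-ΤΑ"))            # hyphen after: stays medial
print(lower_greek("\u0375ΑΣ\u0374"))    # numeral 1200: stays medial
```

Even with these extra conditions the elided σώσ' τα still comes out wrong, which is why a human proofreader remains necessary.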
/bpj