[XeTeX] Greek Hyphenation (monotoniko)
ycodet at club-internet.fr
Mon Jan 9 16:54:47 CET 2006
On 9 Jan 06, at 15:19, Jonathan Kew wrote:
>>> * use U+2060 WORD JOINER as compound word mark, not the letter "v"
>> I thought "v" was for digamma in Claudio Beccari's file :) What is
>> the use of a compound word mark in Greek?
> Oh! I was going by the comments at the top of the file (as I don't
> really know anything about Greek). Also, would digamma typically be
> found in modern Greek? I thought it was an archaic letter, so wouldn't
> expect it to be included at all in a hyphenation file intended for
> modern monotonic text.
You are right (I had not yet looked at the beginning of the file). I
still do not understand what that character is meant for; maybe it has
a use in modern Greek. Beccari's file works for ancient Greek as
well; I often used it in the past.
> They'd better be declared as "letters" from the point of view of TeX's
> hyphenation routine, which means they need to have catcode 11 or 12,
> and non-zero \lccode. Otherwise they'll break words up and hyphenation
> won't be applied to the proper complete sequences.
> So I think the right thing to do is to ensure \lccode<char> = <char>
> for each of these diacritics, and include hyphenation rules for both
> the precomposed and decomposed representations. (Remember that
> regardless of which form you happen to use when you type, with the
> particular keyboard layout you like to use, you might also get text
> from other sources that uses a different encoding form. Or text that
> you originally typed using combining diacritics might go through some
> other process that applies NFC normalization.)
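For reference, the \lccode setup described in the quoted paragraph could look roughly like this in XeTeX. It is only a sketch: the two code points are an assumption about which combining marks matter for monotonic text (they are not taken from Beccari's file), and the lines would have to be in effect when the patterns are loaded as well as when the text is typeset.

```tex
% Sketch only: make combining diacritics count as letters for the
% hyphenation routine, so that they do not split a word in two.
% The choice of code points below is an assumption for monotonic Greek.
\catcode"0301=11 \lccode"0301="0301 % U+0301 COMBINING ACUTE ACCENT (tonos)
\catcode"0308=11 \lccode"0308="0308 % U+0308 COMBINING DIAERESIS (dialytika)
```

With these settings a decomposed sequence such as alpha followed by U+0301 stays inside one word for hyphenation purposes, which is why the patterns would then need to cover it alongside the precomposed letter.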
At first sight, there is no need for rules covering precomposed characters,
but maybe there are some cases I am not thinking of.
Does anybody have an idea about these rules (in Beccari's file)?
% Initials with spirits
.<a2 .>a2 .<a|2 .>a|2 .<'a2 .>'a2 .<'a|2
.>'a|2 .<~a2 .>~a2 .<~a|2 .>~a|2
.<e2 .>e2 .<'e2 .>'e2
.<h2 .>h2 .<h|2 .>h|2 .<'h2 .>'h2 .<'h|2
.>'h|2 .<~h2 .>~h2 .<~h|2 .>~h|2
.<i2 .>i2 .<'i2 .>'i2
.<o2 .>o2 .<'o2 .>'o2
.<u2 .>u2 .<'u2 .>'u2
.<w2 .>w2 .<w|2 .>w|2 .<'w2 .>'w2 .<'w|2
.>'w|2 .<~w2 .>~w2 .<~w|2 .>~w|2
I do not see which hyphenations these rules are meant to prohibit.