[tex-hyphen] International romani - mid-word exclamation mark

Jonathan Kew jfkthame at googlemail.com
Sat Feb 18 14:52:39 CET 2012


On 17 Feb 2012, at 22:27, David Gardner wrote:

> Greetings hyphenation experts...
> I'm trying to get xetex to accept some very, very alpha hyphenation
> patterns for the international Romani alphabet. It seems to get more
> complex the more I look at it.
> This is basically a Latin script with diacritics to mark stress and
> unusual vowels, θ, ʒ for some cross-dialect morpho-phonemic
> skulduggery and (to mark vocatives) mid-word exclamation points.  This
> latter is my current source of pain.
> 
> The position of the exclamation mark can be before or after the
> vowel (e.g. "rromál!len" or "grást!a"), so sometimes it is a good
> hyphenation point, but not always.
> 
> At the moment initex is moaning about non-letters in the hyphenation
> pattern. I presume I need to change the catcode of ! to solve that -
> would that be in the hyphenation file, or somewhere else?

No, what's important here is to give it a non-zero \lccode, because (xe)tex wants to be able to (in effect) apply \lowercase to text before looking up hyphens.

So try setting \lccode`\!=`\! before loading the patterns.

(You can do this in a "loader" file so that you don't clutter the actual patterns with extra tex commands; look at how things are set up in the hyph-utf8 package.)
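
A minimal sketch of such a loader, loosely modelled on the ones in hyph-utf8. The file names loadhyph-rom.tex and hyph-rom.tex are hypothetical, chosen only for illustration; hyph-rom.tex is assumed to contain nothing but the \patterns{...} (and any \hyphenation{...}) block:

    % loadhyph-rom.tex  (hypothetical loader name)
    \begingroup
      % Give "!" a non-zero \lccode so that \patterns accepts it
      % instead of complaining about a non-letter.
      \lccode`\!=`\!
      % Hypothetical pattern file holding only \patterns{...}.
      \input hyph-rom.tex
    \endgroup

The group keeps the \lccode change local to pattern loading, while the patterns themselves are stored globally. As far as I know, "!" also needs a non-zero \lccode at the time paragraphs are actually hyphenated, so the same assignment would have to be repeated in the document or format.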

> 
> But also, is it possible for XeTeX to count ! as a letter when
> mid-word and as punctuation (so that the spacing rules work) when
> word-final? Would I need to make it active and do something clever, and
> if it is active, does that break hyphenation?

I think you can leave its \sfcode (which is what influences the sentence-final spacing) at 3000 or whatever the default is; that's independent of the (\catcode and) \lccode value that is important for whether it's considered part of the "word" (for hyphenation purposes).
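
To illustrate that the two codes are independent, here is a small sketch, assuming a plain TeX or LaTeX format, where \sfcode`\! is normally 3000 unless \frenchspacing is in effect:

    % In the document or format, outside any group:
    \lccode`\!=`\!                          % "!" can now sit inside a word for hyphenation
    \message{sfcode of !: \the\sfcode`\!}   % writes the current value to the log; left unchanged

The space factor is updated by each character in turn, so the letters that follow a mid-word "!" bring it back to 1000 before the next inter-word space is reached. Only a "!" that is directly followed by a space gets the wider sentence-final spacing.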

JK



