[XeTeX] Sanskrit hyphenation

Jonathan Kew jonathan_kew at sil.org
Fri Apr 1 20:28:26 CEST 2005

On 1 Apr 2005, at 5:43 pm, Yves Codet wrote:

> Sanskrit appears twice. Maybe it's the ghost (language 5 or 6?) who is 
> responsible for the strange results we had. So it might not be a XeTeX 
> bug but simply that the hyphenation file has to be included in another 
> way. A kind LaTeX expert could probably tell us.

This is because it's not the responsibility of the patterns file to do 
\newlanguage; hyphen.cfg does that. But I don't think this is enough to 
explain the bad output.

Aha, I've got it! The chief issue is that when you set \lccode values 
in the sanhyph.tex file, you need to prefix the assignments with 
\global. This is because the whole file is read within a group, so 
without \global, the \lccode changes are lost after loading the 
patterns. Without them, the words get broken into separate runs during 
hyphenation, which in turn produces the dotted circles (where a run 
begins with a combining mark), etc.
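
As a rough sketch of that fix (the code points below are only 
illustrative examples from the Devanagari block, not a copy of what 
sanhyph.tex actually contains), the assignments in the patterns file 
would look something like this:

```tex
% Illustrative only: which characters sanhyph.tex really sets is assumed here.
% The patterns file is read inside a group, so a plain \lccode assignment
% would be undone when that group ends; \global makes it persist.
\global\lccode"0915="0915 % U+0915 DEVANAGARI LETTER KA
\global\lccode"093E="093E % U+093E DEVANAGARI VOWEL SIGN AA (combining)
\global\lccode"094D="094D % U+094D DEVANAGARI SIGN VIRAMA (combining)
```

Note that the combining signs get a nonzero \lccode as well, since TeX 
ends a hyphenatable word at any character whose \lccode is zero.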

(Eventually, more of these \lccodes should be preset by 
unicode-letters.tex anyway.)

So it wasn't really a XeTeX bug after all. <sigh-of-relief> (Although I 
can see ways in which XeTeX could be improved so that the effects would 
have been less dramatically bad... but you still wouldn't have gotten 
the proper hyphenations without the right \lccodes.)

It's also true that you shouldn't assign a language code with 
\newlanguage within the sanhyph.tex file; hyphen.cfg has already 
assigned the code \l@sanskrit before it reads the file. But then you 
also won't be able to say \language=\sanskrit, as in Plain; there must 
be a "correct" LaTeX command to choose the language. But I'm not a 
LaTeX user, so I made do with \language=\csname l@sanskrit\endcsname to 
activate the patterns for testing. :)
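
For reference, here are two ways of writing that assignment. The name 
\l@sanskrit contains an at-sign, which is why it can't simply be typed 
in a document. The second form assumes you are in LaTeX, where 
\makeatletter is available; neither is the "official" LaTeX way of 
switching language, just a sketch for testing.

```tex
% \l@sanskrit has an at-sign in its name, so it cannot be typed directly
% while @ has its normal category code. Build the name with \csname:
\language=\csname l@sanskrit\endcsname

% Alternative (assumes LaTeX): make @ a letter temporarily.
\makeatletter
\language=\l@sanskrit
\makeatother
```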

If you modify the sanhyph.tex file in this way, you'll also need to 
modify your splain.ini or whatever you use to build a plain-based 
format, so as to allocate the language code before loading the 

