# [tex-live] Hyphenation patterns, Unicode, XeTeX, and language.dat

Ralf Stubner ralf.stubner at physik.uni-erlangen.de
Fri Aug 18 10:59:23 CEST 2006

Jonathan Kew <jonathan_kew at sil.org> writes:

> However, it would mean providing a prefixed "wrapper" for *every*
> pattern file, even those like hyphen.tex or ukhyphen.tex which are
> pure ASCII (or Latin-1 letters represented with ^^xx codes, which
> equate directly to Unicode codepoints). Close to half the pattern
> files in TL's texmf/tex/generic/hyphen are currently "safe" files of
> this nature. So I'm not sure this is really easier/better than
> changing the (single-line) language.__.dat files.
>
> (One slightly more elaborate approach, then, would be to test for the
> existence of \patternfileprefix#2, and if this is not available, load
> the original file without a prefix.)
>
> The other factor is that I've primarily been looking at solutions
> that could be implemented entirely at the TeX Live (or other
> distribution) level, without touching any of the canonical LaTeX or
> Babel files. But if there's agreement that it would be preferable to
> add such a hook, I'd be happy to go that way.
>
> Further thoughts, anyone?

Morton's suggestion, combined with the additional test for whether
\patternfileprefix#2 is available, looks like the most flexible system to
me. In particular, I am thinking about the Greek hyphenation patterns
that can be produced via Peter Heslin's script. If I understood it
correctly, one cannot map the patterns for the LGR encoding to Unicode
in the way you do it for, say, T1 to Unicode. So one would have
to use two different pattern files. That would be quite natural
when working with \patternfileprefix, while the wrapper files would need
a slightly different structure:

\ifxetex
\input <unicode-patterns>
\else
\input <original-patterns>
\fi
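
Note that \ifxetex is not a TeX primitive, so a real wrapper would have to
define or replace that test itself. A minimal sketch, assuming that XeTeX can
be recognised by its \XeTeXrevision primitive and that \undefined really is
undefined when the patterns are loaded (the file names are still the
placeholders from above, not real files):

```tex
% Sketch of a wrapper file; the <...> file names are placeholders.
% Assumption: \XeTeXrevision exists only under XeTeX, and \undefined
% is an undefined control sequence at this point.
\ifx\XeTeXrevision\undefined
  % Not XeTeX: load the patterns in the original font encoding.
  \def\next{\input <original-patterns> }
\else
  % XeTeX: load the Unicode version of the patterns.
  \def\next{\input <unicode-patterns> }
\fi
\next
```

Deferring the \input through \next means the pattern file is read only after
the conditional has been closed, so the file is not processed while the \ifx
is still open.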

However, I don't know whether a new babel release will happen before
TL 2006 is released, so using wrapper files is probably best for the
time being.
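
For comparison, the \patternfileprefix test with fallback could look roughly
like the following. This is only a sketch of my reading of the idea:
\loadpatternfile, \patterntest and the prefix value "xu-" are hypothetical
names chosen for illustration, not existing babel macros, and I am assuming
that \patternfileprefix simply holds a string that is put in front of the
file name taken from language.dat.

```tex
% Sketch only: all names below are hypothetical, not part of babel.
\def\patternfileprefix{xu-}% assumed prefix for Unicode pattern files
\newread\patterntest
\def\loadpatternfile#1{%
  % Try to open the prefixed file; the line end after #1 supplies the
  % space that terminates the file name.
  \openin\patterntest=\patternfileprefix#1
  \ifeof\patterntest
    % No prefixed file found: fall back to the original pattern file.
    \def\next{\input #1 }%
  \else
    % A prefixed file exists: load it instead of the original.
    \def\next{\input \patternfileprefix#1 }%
  \fi
  \closein\patterntest
  \next}
```

With this, \loadpatternfile{ukhyphen.tex} would load xu-ukhyphen.tex if such a
file can be found, and plain ukhyphen.tex otherwise, so the pure ASCII files
would not need a wrapper at all.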

cheerio
ralf