[tex-hyphen] [tl2007] languages

Mojca Miklavec mojca.miklavec.lists at gmail.com
Tue Jun 30 19:37:29 CEST 2009


On Tue, Jun 30, 2009 at 18:20, Martin Schröder wrote:
> Hello,
> I've been tasked to add support (hyphenation patterns for pdfTeX) for
> some additional languages to our TL2007 installation:
>  Cestina
>  Estonian
>  Magyar
>  Latvian
>  Lithuanian
>  Polski
>  Slovenian

That one is already included, it's just that it's called "slovene".

>  Slovaque
>  Bulgarian
>  Romanian
>  Russian
> Some of these are trivial (e.g. polski) since they are already
> supported in TL2007; others (e.g. Lithuanian) are not included in
> TL2007 but doable, and two are difficult: Latvian & Slovaque.
> - Latvian is only supported recently with hyph-utf8 ?

So far the patterns have never made it into any TeX package. They only
existed for OpenOffice, and we took them from there. You don't need to
switch to hyph-utf8 in order to use those patterns. Just take
- hyph-lv.tex
- loadhyph-lv.tex
- conv-utf8-l7x.tex
and put loadhyph-lv.tex into language.dat. That's all that needs to be
done ... apart from one minor problem.

TL 2007 most probably doesn't support L7x. On the one hand, it's easy to
add a few additional files to support the L7x encoding per se, but on the
other, supporting different fonts must be a pain.
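
Coming back to language.dat, here is a minimal sketch of the entry. The
name "latvian" is only my assumption, so use whatever name your documents
expect, and the three files have to be somewhere TeX can find them:

   % language.dat: <language name> <file that loads the patterns>
   latvian loadhyph-lv.tex

The formats have to be rebuilt afterwards (with fmtutil or whatever you
normally use) before the new patterns become available.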

> - for Slovaque there is csplain/base/skhyphen.tex, but loading this in
>  language.dat leads to errors. Is there some additional magic needed
>  or is it just broken?

I would say the same ... you can just take
- hyph-sk.tex
- loadhyph-sk.tex
- conv-utf8-ec.tex
and then put loadhyph-sk.tex into language.dat.
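
Again only as a sketch: the name "slovak" and the synonym "slovaque" are
my assumptions, nothing that TL2007 prescribes. A line starting with "="
declares a synonym for the language on the preceding line:

   % language.dat
   slovak loadhyph-sk.tex
   =slovaque

The synonym line is optional; drop it if nobody needs a second name.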

> Is there an easy chance to get these working with pdftex and TL2007? I
> don't want to switch to hyph-utf8 unless necessary.
>
> Another question: Is there a list of fontencodings to use with these
> languages somewhere?

We have a database of languages here:
   http://tug.org/svn/texhyphen/trunk/hyph-utf8/source/generic/hyph-utf8/languages.rb
   http://www.ctan.org/get/language/hyph-utf8/source/generic/hyph-utf8/languages.rb
or you can simply take a look at loadhyph-xx.tex for the specific
language. Many languages support both ec and texnansi, but the
distinction is usually not relevant as all the characters that matter
match in both encodings. Some languages support (part of) OT1, but
that's probably not worth mentioning.

Cyrillic languages are a problem of their own. Russian is supposed to
support some five different encodings and five different patterns.

Mojca

