[tex-hyphen] tug.ctan.org upload: Hyphenation Patterns in UTF-8
Vladimir Volovich
vvv at vsu.ru
Sun Jun 29 22:27:15 CEST 2008
"MM" == Mojca Miklavec writes:
>> --- /src/TeX/texlive-svn/Master/tlpkg/tlpsrc/hyphen-russian.tlpsrc
[...]
>> What about that? The package ruhyphen ships loads of files:
[...]
>> Can we delete the whole package? Are these files still necessary?
>>
>> I.e., are all these files *ONLY* for the patterns, and not for
>> actual processing later on?
MM> I'm not sure. I need to take a look. We do not depend on these
MM> files, but maybe there are some useful macros inside, and I suspect
MM> that a user is able to modify a file to specify which encoding to use
MM> for patterns. There is one default, but we do not provide the same
MM> mechanisms for switching the encoding. It's similar to Bulgarian
MM> (which only ships with support for a single encoding by default),
MM> but it might be that people need other files at runtime as
MM> well. Arthur did the conversion, but ... I would leave the package
MM> there. I'll add the dependency.
first of all, i'm happy to see this effort on cleaning up the
hyphenation patterns.
a few notes regarding russian patterns, comparing what is currently in
the texlive repository in the hyph-utf8 and ruhyphen packages:
* the ruhyphen package provides 7 different variants of patterns, made
  by different people: ruhyph{al,as,ct,dv,mg,vl,zn}.tex
  the default, ruhyphal.tex, probably gives the highest quality of
  hyphenation (but some people may prefer other patterns, that's why
  all of them are included in the ruhyphen package).
  hyph-utf8 contains just ruhyphal; i don't know if/how it is possible
  to include several pattern variants for one language (letting the
  user select which variant to use).
* hyph-ru.tex includes just ruhyphal.tex re-encoded from koi8-r to
  utf-8, with some additional comments. but the patterns in the
  ruhyphen package include more than that - see ruhyphen.tex
  namely, the important "missing bits" are:
  - additional patterns with the "cyrillic letter yo", contained in
    the file cyryoal.tex (they could be appended to hyph-ru.tex, i
    guess)
  - additional patterns present in ruhyphen.tex in two lines with
    \patterns macros (they could also be appended to hyph-ru.tex)
  - patterns generated by hypht2.tex, which is similar to hypht1.tex
    ("\input hypht2" is present in ruhyphen.tex). i don't know what
    the best way to include them is. probably, if you want, i can
    provide a "flat" file (not macro-generated) with the patterns
    generated by hypht2.tex which are relevant to the russian
    language. then it could also be appended to hyph-ru.tex
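as a rough sketch of the "re-encode from koi8-r to utf-8, then append"
steps mentioned above (python, not part of either package): the function
names koi8r_to_utf8 and append_fragments are hypothetical, and the sketch
assumes the extra files contain plain patterns only. it does nothing
about macro-generated patterns such as those from hypht2.tex, which
would first have to be flattened by other means.

```python
def koi8r_to_utf8(raw: bytes) -> bytes:
    """Decode KOI8-R bytes and re-encode the same text as UTF-8."""
    return raw.decode("koi8_r").encode("utf-8")


def append_fragments(base: bytes, fragments: list) -> bytes:
    """Return base followed by each fragment, one after another.

    A newline is added after any part that does not already end with
    one, so that the pieces never run together on a single line.
    """
    parts = []
    for part in [base] + list(fragments):
        parts.append(part if part.endswith(b"\n") else part + b"\n")
    return b"".join(parts)


if __name__ == "__main__":
    # Assumption: in-memory stand-ins for the real files; the actual
    # contents of ruhyphal.tex and cyryoal.tex are not reproduced here.
    base_koi8 = "% base patterns\n".encode("koi8_r")
    extra_koi8 = "% extra patterns with ё".encode("koi8_r")

    merged = append_fragments(
        koi8r_to_utf8(base_koi8),
        [koi8r_to_utf8(extra_koi8)],
    )
    print(merged.decode("utf-8"))
```

the byte-level conversion is the easy part; whether simply appending is
right depends on how each file wraps its patterns, which the sketch does
not check.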
Best,
v.