[tex-hyphen] Customizing patterns

Mojca Miklavec mojca.miklavec.lists at gmail.com
Fri Apr 30 15:52:49 CEST 2010


Dear Javier,

I would like to make a few remarks about what you sent us, albeit
with some delay.

1.) When there are local differences between countries, I see no
reason for not using the same strategy as the Norwegians did. They have
a common file with "what holds for both variants of Norwegian" + two
separate files: one for Bokmål and the other one for Nynorsk. The two
are officially considered to be two languages, but if there are
differences between countries, that's a perfectly valid argument to
add a new language. The Germans did the same for Switzerland as well.

We would name them es_ES, es_MX etc. (lowercase).

Of course one would need some extra support in babel or somewhere else
(like a special package of your own) that would be able to select the
right patterns depending on the country chosen. I think that
Polyglossia, for example, does support that.

2.) I would like to avoid special macros inside the pattern files. If
you do want to add some trickery with extra patterns, I would put that
into loadhyph-es.tex. Extra patterns would then go to another folder
and loadhyph would load them according to some special algorithm
(whatever that would be - most probably suggested by you). The reason
for that is that I would like to keep the pattern files themselves
clean. We may put ugly tricks into loadhyph, but the hyph-xx.tex files
are now being automatically converted to a plain-text version of the
patterns (one pattern per file), and this would complicate things
enormously.

In particular, in cases like the one you showed us:

...
plan4c5t
\if\include{A}
2no.
\fi
4caca4
...

you would put 2no. into a separate file specialhyph-es-x-a.tex and
then loadhyph-es.tex would do

\if\include{A}
\include specialhyph-es-x-a.tex
\fi

instead of "spoiling" the original pattern file.
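
A minimal sketch of how such a loader could look in plain TeX, under
stated assumptions: \esvariant and \esvariantA are hypothetical names
(nothing of the sort exists yet), the real loadhyph-es.tex also has to
deal with encodings, which is left out here, and the selection logic is
only a placeholder for whatever algorithm you would suggest.

```tex
% loadhyph-es.tex (sketch only, not the actual loader)
% Assumption: \esvariant is a hypothetical macro that a local
% configuration defines before this file is read, e.g. \def\esvariant{A}.

\input hyph-es.tex % the unmodified base patterns

\def\esvariantA{A}% hypothetical helper, used only for the comparison
\ifx\esvariant\esvariantA
  % specialhyph-es-x-a.tex would contain only: \patterns{2no.}
  \input specialhyph-es-x-a.tex
\fi
```

If \esvariant is undefined, the \ifx test is simply false and only the
base patterns are loaded. This relies on iniTeX accepting several
\patterns calls for the same language before the first paragraph is
typeset, so the extra file only needs to hold the additional patterns.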

3.) Is there any special reason (apart from local differences between
countries) why you don't want to include ALL the exceptions (that
is, including chemistry, biology etc.) by default? I'm also asking
since we are planning to convert the same set of patterns into a form
that would be suitable for JavaScript, Apache, Perl etc., and there's
usually no possibility to do such adaptations outside the TeX world.

Personally, I don't see any reason against including all the
exceptions apart from maybe:
- backward compatibility (keeping exactly the same line breaks)
- memory consumption (not such a big issue nowadays)

4.) There's a chance that in luatex the patterns will be loaded at
runtime instead of at format generation. I have no idea in what way
this might affect the changes you want to make in luatex, so it would
make sense if you could test it (extensively) before the TL 2010
release.

Thanks a lot,
    Mojca

(An error report came back from your email earlier today, so I hope
that this will come through to you.)

On Fri, Feb 5, 2010 at 15:41, Javier Bezos wrote:
> Hi all,
>
> Hyphenation doesn't depend only on the language, but also
> on the field (Chemistry has particular needs), the style
> and, sometimes, the country (e.g., in Spanish, the group tl is
> -tl in México and t-l in Spain).
>
> At the request of a subscriber of es-tex (the mailing list of
> Spanish TeX) I was investigating how to customize the patterns
> somehow. With luatex it's trivial, but not so with
> standard TeX. Now, with the suggestions and bugs from
> Rodrigo, I'll give it a push forward.
>
> I think the simplest way to carry this out is to provide
> two "languages" -- namely, the standard (and default), with
> the full set of usual patterns, and a "local" set which can
> be configured locally and selected with, say, an option in
> babel or similar. There will be about 5 options and therefore
> considering all combinations as separate "languages" is
> impossible. But a single local file, so that the settings
> are written to the log file (for reference), could be feasible.
> Let me say I'm still investigating how to do it, but perhaps
> this system will be enough in most cases.
>
> Any thoughts?
>
> Javier


