[tex-hyphen] Finnish basic hyphenation rules
Arthur Reutenauer
arthur.reutenauer at normalesup.org
Thu Apr 16 22:08:06 CEST 2020
Hi Teemu,
Glad to know that you’re happy with the patterns.
On Thu, Apr 16, 2020 at 06:51:57PM +0300, Teemu Likonen wrote:
> I have been thinking and testing these new Finnish basic hyphenation
> patterns and it looks to me that they are ready. They do what anybody
> would expect. The current (old) Finnish patterns aim for good typography
> (it seems), on average, but give sometimes unexpected output if one
> wants to fully control hyphenation. Our new patterns give a nice
> alternative by giving expected basic hyphenation.
I have been wondering about the name. Calling it “basic” seems a
little too ... well, basic ;-) The main rule reminded me of the Swedish
“one-consonant principle” (https://sv.wikipedia.org/wiki/Avstavning#Enkonsonantsprincipen):
perhaps we could call it “onecons”? If we want the tag to be BCP 47-compliant
it has to be at most 8 characters long, hence fi-x-onecons would fit well.
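
As a rough sketch of that length constraint: RFC 5646 (the current BCP 47 document) defines a private-use subtag as 1 to 8 ASCII letters or digits following the "x" singleton. The helper names below (is_valid_private_subtag, private_use_tag) are hypothetical and only illustrate the check; they are not part of hyph-utf8 or any TeX tooling.

```python
import re

# RFC 5646: privateuse = "x" 1*("-" (1*8alphanum))
# Each subtag after "x" is 1 to 8 ASCII letters or digits.
_PRIVATE_SUBTAG = re.compile(r"[A-Za-z0-9]{1,8}")


def is_valid_private_subtag(subtag: str) -> bool:
    """Return True if `subtag` may follow "x-" in a BCP 47 tag."""
    return _PRIVATE_SUBTAG.fullmatch(subtag) is not None


def private_use_tag(language: str, subtag: str) -> str:
    """Build a tag such as "fi-x-onecons" (hypothetical helper).

    Only the private-use subtag is validated here; the language
    subtag is assumed to be valid already.
    """
    if not is_valid_private_subtag(subtag):
        raise ValueError(f"not a valid private-use subtag: {subtag!r}")
    return f"{language}-x-{subtag}"


if __name__ == "__main__":
    print(private_use_tag("fi", "onecons"))
    print(is_valid_private_subtag("basichyph"))
```

With this check, "onecons" (7 characters) passes, while a 9-character name such as "basichyph" would be rejected as a subtag.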
> There are a couple of ambiguous cases where automatic hyphenation
> can't know where the diphthong is in a certain three-letter series: "haku
> : ha-uissa" or "hauki : hau-issa". Both have "aui". Another is "ruko :
> ru-oissa" and "ruoka : ruo-issa". These will not be hyphenated between
> any of the vowels. Typography prefers not to break between vowels
> anyway. No problem.
Yes, that’s always to be expected, with any language.
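
The ambiguity can be made concrete with a small sketch. The two readings of each word are spelled identically, so a hyphenator that sees only the letters has to pick break points that are valid for both. The code below uses only the break points quoted above (it does not attempt a full Finnish syllabification), and the names CASES, break_positions and safe_breaks are made up for this illustration.

```python
def break_positions(hyphenated: str) -> set[int]:
    """Return the letter offsets at which a hyphenated spelling breaks."""
    positions = set()
    offset = 0
    for ch in hyphenated:
        if ch == "-":
            positions.add(offset)
        else:
            offset += 1
    return positions


# Surface form -> (base word, break point as quoted in the thread).
CASES = {
    "hauissa": [("haku", "ha-uissa"), ("hauki", "hau-issa")],
    "ruoissa": [("ruko", "ru-oissa"), ("ruoka", "ruo-issa")],
}


def safe_breaks(surface: str) -> set[int]:
    """Break points shared by every reading of an ambiguous spelling."""
    readings = CASES[surface]
    for _, hyphenated in readings:
        # Both readings must spell the same surface form.
        assert hyphenated.replace("-", "") == surface
    return set.intersection(*(break_positions(h) for _, h in readings))


if __name__ == "__main__":
    for word in CASES:
        print(word, sorted(safe_breaks(word)))
```

For both words the intersection of the quoted break points is empty, which matches the behaviour described: no break is placed between any of the vowels.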
> Maybe it's time to proceed. Can these new patterns be integrated into the
> TeX system? My powers and current skills are not enough for that.
> Polyglossia has language-specific options. Is it possible to utilise
> those for choosing a hyphenation variant? Something like this:
>
> \setdefaultlanguage[hyphenation=typographic]{finnish} % default
> \setdefaultlanguage[hyphenation=basic]{finnish} % new
>
> Everything but the hyphenation patterns would be the same.
>
> I found that in Babel there are modifiers like this:
>
> \usepackage[finnish.basichyph]{babel}
> \languageattribute{finnish}{basichyph}
>
> But what do I know about the practices and implementation. This is just
> a user thinking out loud.
Your opinion does matter. The exact names of the options and
attributes can change, but the above suggestions look sensible. I’ll
implement something like that in Polyglossia in the next week or two,
and will also add the patterns to hyph-utf8.
Best,
Arthur