[tex-hyphen] Accuracy of the hyphenation algorithm

Arthur Reutenauer arthur.reutenauer at normalesup.org
Wed Jul 29 19:49:36 CEST 2015


> I'm guessing the bad "p-neu-mo-ni-a" may be caused by missing support for
> LEFTHYPHENMIN and RIGHTHYPHENMIN in the implementation used.
> Off the top of my head, these are both at least 2 for English.

  2 and 3, to be precise, for all pattern sets for English, of which
there are three in current TeX distributions.
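
  To illustrate the effect of those two minimums, here is a minimal
sketch in Python.  It is not how TeX implements anything: the function
names and the list of raw break positions are hypothetical, chosen only
to reproduce the "p-neu-mo-ni-a" output quoted above.  A break at
position i splits the word into word[:i] and word[i:]; the filter keeps
a break only if at least left_min letters precede it and at least
right_min letters follow it.

```python
def filter_breaks(word, breaks, left_min=2, right_min=3):
    """Drop break positions that violate the minimum fragment lengths.

    left_min and right_min play the role of \\lefthyphenmin and
    \\righthyphenmin (2 and 3 for the English pattern sets).
    """
    return [i for i in sorted(breaks)
            if i >= left_min and len(word) - i >= right_min]


def show(word, breaks):
    """Render a word with '-' inserted at each break position."""
    parts, prev = [], 0
    for i in sorted(breaks):
        parts.append(word[prev:i])
        prev = i
    parts.append(word[prev:])
    return "-".join(parts)


# Hypothetical raw output of an implementation that ignores both minimums.
raw = [1, 4, 6, 8]

print(show("pneumonia", raw))                              # p-neu-mo-ni-a
print(show("pneumonia", filter_breaks("pneumonia", raw)))  # pneu-mo-nia
```

  With the minimums applied, the break after the initial "p" and the
one before the final "a" both disappear.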

  I think all sources of misunderstandings have been covered; I'd only
add that TeX finding hyphenation points such as gen·uine, while some
dictionary has gen·u·ine, is not actually incorrect: the algorithm finds
correct breakpoints, just not all of them; likewise for toothache.  It's
of course suboptimal, which is not really surprising for hyphen.tex,
which was compiled from a rather minimal set of words.  In practice
a word will be hyphenated in at most one place anyway!

  An actually incorrect breakpoint would of course be one that is not
present in the authoritative list, such as toot·hache, for example.
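
  The distinction between "incomplete" and "incorrect" amounts to a
subset test, which can be sketched in Python as follows.  The helper
names are hypothetical, and the reference hyphenation tooth·ache is an
assumption made for the sake of the example, not a quotation from any
particular dictionary.

```python
def breaks(dotted):
    """Return the set of break positions in a word written like 'gen·u·ine'."""
    pos, out = 0, set()
    for part in dotted.split("·")[:-1]:
        pos += len(part)
        out.add(pos)
    return out


def classify(found, reference):
    """Compare the breakpoints found with an authoritative hyphenation."""
    f, r = breaks(found), breaks(reference)
    if not f <= r:
        # At least one break is absent from the reference list.
        return "incorrect"
    return "complete" if f == r else "correct but incomplete"


print(classify("gen·uine", "gen·u·ine"))     # correct but incomplete
print(classify("toot·hache", "tooth·ache"))  # incorrect (reference assumed)
```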

	Best,

		Arthur

