[tex-hyphen] hyphenating overview
bnb at ams.org
Sat Jun 9 15:26:11 CEST 2018
Pablo Rodriquez inquires about the hyphenation of "overview":
> I have just checked [...], that “overview”
> isn’t hyphenated at all.
This is true with the basic hyphenation patterns and the original exception list.
But see below.
> English hyphenation is a mystery to me. But I still wonder why
> “overview” isn’t hyphenated and “over-heat” is (both in US English).
The patterns were generated from a word list (a very large dictionary).
If a particular string of letters, say "ervi", appeared in the word list
both with and without a hyphen, it would likely not be added to the
patterns with a hyphen.
The actual mechanism is explained in Frank Liang's dissertation,
"Word Hy-phen-a-tion by Com-put-er", posted as a pdf file on the TUG
website at https://www.tug.org/docs/liang/liang-thesis.pdf .
That said, "overview" itself *is* hyphenated in the print edition of
Webster's 3rd International Unabridged dictionary, which is the
authority used to check hyphenation. Exceptions are looked up there and
added to the list of exceptions in TUGboat, for which a supplement is
published periodically (one just appeared in TUGboat 39:1). The
cumulative list is posted on CTAN as tb0hyf.* (in both TeX and pdf form),
along with the hyphenex output, at usergrps/tug/tugboat/hyphenex .
I have just checked, and the most recent update on CTAN is from 2015.
The failure to update more recently is surely an oversight (I have on
my to-do list the task of updating the cumulative TeX list, getting it
processed with hyphenex, and posting it to CTAN). Even so, the word
"over-view" has been on the list since 2005, so I can't explain why it
apparently isn't in effect, if the ushyphex collection is compiled into
the version of (La)TeX that you're using. Maybe someone more
familiar with what hyphenation patterns *are* compiled into the
released TeX binaries can address that.
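In the meantime, a document can declare the exception itself and check
the result. What follows is only a minimal sketch, assuming a LaTeX
document set in US English; the article class and the test words are
illustrative choices, not part of the exception list machinery:

```latex
\documentclass{article}
% Local exception: applies to the language in effect at this point.
\hyphenation{over-view}
\begin{document}
% \showhyphens reports the hyphenation points TeX finds for each word
% on the terminal and in the log file.
\showhyphens{overview overheat}
\end{document}
```

Commenting out the \hyphenation line and rerunning shows what the
patterns and exceptions loaded into the format produce on their own.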
> Sorry, but I tend to think that there may be an error there.
> Would you be so kind to confirm my mistake in the assumption above (that
> one of the hyphenations is wrong)?
That is correct. This is a known exception.