From oinos at web.de Sat Jun 9 09:38:59 2018
From: oinos at web.de (Pablo Rodríguez)
Date: Sat, 9 Jun 2018 09:38:59 +0200
Subject: [tex-hyphen] hyphenating overview
Message-ID: <073d29e4-4f9d-f2b3-ac86-75e9a31e5d57@web.de>

Dear list,

I have just checked (at https://tex.mendelu.cz/en/), that "overview"
isn't hyphenated at all.

English hyphenation is a mystery to me. But I still wonder why
"overview" isn't hyphenated and "over-heat" is (both in US English).

Sorry, but I tend to think that there may be an error there.

Would you be so kind as to confirm my mistake in the assumption above
(that one of the hyphenations is wrong)?

Many thanks for your help,

Pablo
--
http://www.ousia.tk

From P.Taylor at Rhul.Ac.Uk Sat Jun 9 10:05:04 2018
From: P.Taylor at Rhul.Ac.Uk (Philip Taylor)
Date: Sat, 9 Jun 2018 09:05:04 +0100
Subject: [tex-hyphen] hyphenating overview
In-Reply-To: <073d29e4-4f9d-f2b3-ac86-75e9a31e5d57@web.de>
References: <073d29e4-4f9d-f2b3-ac86-75e9a31e5d57@web.de>
Message-ID:

Pablo Rodríguez wrote:

> Dear list,
>
> I have just checked (at https://tex.mendelu.cz/en/), that "overview"
> isn't hyphenated at all.
>
> English hyphenation is a mystery to me. But I still wonder why
> "overview" isn't hyphenated and "over-heat" is (both in US English).
>
> Sorry, but I tend to think that there may be an error there.
>
> Would you be so kind as to confirm my mistake in the assumption above
> (that one of the hyphenations is wrong)?

There may well be.
With British English patterns, "overview" (as the second or subsequent
word of a paragraph) is hyphenated as expected; with American English
patterns, it is not:

    \hsize = 0 pt

    \uselanguage {UKenglish}

    Overview overview overview

    \uselanguage {USenglish}

    Overview overview overview

    \end

Philip Taylor

From arthur.reutenauer at normalesup.org Sat Jun 9 13:02:38 2018
From: arthur.reutenauer at normalesup.org (Arthur Reutenauer)
Date: Sat, 9 Jun 2018 13:02:38 +0200
Subject: [tex-hyphen] hyphenating overview
In-Reply-To:
References: <073d29e4-4f9d-f2b3-ac86-75e9a31e5d57@web.de>
Message-ID: <20180609110238.GA3140237@phare.normalesup.org>

> There may well be. With British English patterns, "overview" (as the
> second or subsequent word of a paragraph) is hyphenated as expected;
> with American English patterns, it is not:

Confirmed. The word "overview" is hyphenated as expected (over-view)
with the British English patterns, but with none of the American English
sets, neither hyphen.tex nor the extended set labelled as usenglishmax
in language.dat. Obviously it is an over-sight -- pun intended ;-)

Barbara: maybe a candidate for the exceptions list?

Best,

Arthur

From bnb at ams.org Sat Jun 9 15:26:11 2018
From: bnb at ams.org (Barbara Beeton)
Date: Sat, 9 Jun 2018 13:26:11 +0000
Subject: [tex-hyphen] hyphenating overview
In-Reply-To: <073d29e4-4f9d-f2b3-ac86-75e9a31e5d57@web.de>
References: <073d29e4-4f9d-f2b3-ac86-75e9a31e5d57@web.de>
Message-ID:

Pablo Rodríguez inquires about the hyphenation of "overview":

> I have just checked [...], that "overview"
> isn't hyphenated at all.

This is true with the basic hyphenation patterns and the original
exception list. But see below.

> English hyphenation is a mystery to me. But I still wonder why
> "overview" isn't hyphenated and "over-heat" is (both in US English).

The patterns were generated from a word list (a very large dictionary).
If a particular string of letters, say "ervi", appeared both with and
without a hyphen, it would likely not be added to the patterns with a
hyphen. The actual mechanism is described/explained in Frank Liang's
dissertation, "Word Hy-phen-a-tion by Com-put-er", posted as a pdf file
on the TUG website at https://www.tug.org/docs/liang/liang-thesis.pdf .

That said, "overview" itself *is* hyphenated in the print edition of
Webster's 3rd International Unabridged dictionary, which is the
authority used to check hyphenation. Exceptions are found there and
added to the list of exceptions in TUGboat, for which a supplement is
published periodically (one just appeared in TUGboat 39:1), and the
cumulative list posted on CTAN as tb0hyf.* (in both TeX and pdf form)
along with the hyphenex output, at usergrps/tug/tugboat/hyphenex .

I have just checked, and the most recent update on CTAN is from 2015.
While the failure to update more recently is surely an oversight (and
I have on my to-do list the task of updating the cumulative TeX list,
getting it processed with hyphenex and posted to CTAN), the word
"over-view" has been on the list since 2005, so I can't explain why it
apparently isn't in effect if the ushyphex collection is compiled into
the version of (La)TeX that you're using.

Maybe someone more familiar with what hyphenation patterns *are*
compiled into the released TeX binaries can address that.

> Sorry, but I tend to think that there may be an error there.

> Would you be so kind as to confirm my mistake in the assumption above
> (that one of the hyphenations is wrong)?

That is correct. This is a known exception.
-- bb

From arthur.reutenauer at normalesup.org Sat Jun 9 15:50:49 2018
From: arthur.reutenauer at normalesup.org (Arthur Reutenauer)
Date: Sat, 9 Jun 2018 15:50:49 +0200
Subject: [tex-hyphen] hyphenating overview
In-Reply-To:
References: <073d29e4-4f9d-f2b3-ac86-75e9a31e5d57@web.de>
Message-ID: <20180609135049.GA3185066@phare.normalesup.org>

Barbara,

Thanks for checking.

> While the failure to update more recently is surely an oversight (and
> I have on my to-do list the task of updating the cumulative TeX list,
> getting it processed with hyphenex and posted to CTAN), the word
> "over-view" has been on the list since 2005, so I can't explain why it
> apparently isn't in effect if the ushyphex collection is compiled into
> the version of (La)TeX that you're using.

The list from ushyphex.tex isn't used at all. I'm surprised you
weren't aware of that. The list of patterns referred to as ushyphmax in
language.dat is Gerard Kuiken's extension of hyphen.tex with extra
*patterns* but actually no new hyphenation exceptions. I don't know if
Dr. Kuiken had a specific reason to not extend his file with the
hyphenation exceptions from ushyphex; possibly because exceptions do not
actually need to be dumped into the format and can thus be loaded at
runtime.

That was, at least, the situation as we found it ten years ago when we
took over the technical side of hyphenation patterns with Mojca.
Obviously we didn't want to change the actual patterns and exceptions
without someone explicitly instructing us to do so; and for most major
languages, it turned out, the patterns were essentially abandoned with
no one maintaining them (again, contentwise); two exceptions are German
and Spanish, handled in two different ways in hyph-utf8.
Best,

Arthur

From bnb at ams.org Sat Jun 9 16:09:20 2018
From: bnb at ams.org (Barbara Beeton)
Date: Sat, 9 Jun 2018 14:09:20 +0000
Subject: [tex-hyphen] hyphenating overview
In-Reply-To: <20180609135049.GA3185066@phare.normalesup.org>
References: <073d29e4-4f9d-f2b3-ac86-75e9a31e5d57@web.de>
 <20180609135049.GA3185066@phare.normalesup.org>
Message-ID: <0d5eceb90b7a4b94b0e9781802e29c18@EXC1.ams.org>

thanks for the information on ushyphex and ushyphmax, Arthur.

> > While the failure to update more recently is surely an oversight (and
> > I have on my to-do list the task of updating the cumulative TeX list,
> > getting it processed with hyphenex and posted to CTAN), the word
> > "over-view" has been on the list since 2005, so I can't explain why it
> > apparently isn't in effect if the ushyphex collection is compiled into
> > the version of (La)TeX that you're using.

> The list from ushyphex.tex isn't used at all. I'm surprised you
> weren't aware of that. The list of patterns referred to as ushyphmax in
> language.dat is Gerard Kuiken's extension of hyphen.tex with extra
> *patterns* but actually no new hyphenation exceptions. I don't know if
> Dr. Kuiken had a specific reason to not extend his file with the
> hyphenation exceptions from ushyphex; possibly because exceptions do not
> actually need to be dumped into the format and can thus be loaded at
> runtime.

at the time it was compiled, the kuiken enhancement of the patterns
was actually based on the then-current exception list. (somewhere in
my archives I believe I still have the relevant correspondence.)
but you're correct -- it's old.

> That was, at least, the situation as we found it ten years ago when we
> took over the technical side of hyphenation patterns with Mojca.
> Obviously we didn't want to change the actual patterns and exceptions
> without someone explicitly instructing us to do so; and for most major
> languages, it turned out, the patterns were essentially abandoned with
> no one maintaining them (again, contentwise); two exceptions are German
> and Spanish, handled in two different ways in hyph-utf8.

of course this is a reasonable decision. maybe someone can be found
to investigate this so an update can be considered. (I'll think hard
about it, and get in touch with you off-list if I come up with anything
that seems promising.)

in the meantime, I'll also reread the intro text for the exception
list to see whether it adequately deals with how to use what's already
offered to improve results.

-- bb
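
The runtime route Arthur mentions (exceptions need not be dumped into the
format) can be tried directly in a document. What follows is only a
minimal plain TeX sketch, assuming the US English patterns are the
current language; the commented-out \input line further assumes that
ushyphex.tex is installed on the TeX search path, which the thread does
not confirm.

```tex
% Sketch only: add the missing break as a runtime hyphenation exception.
% Assumption: US English is the current \language when this is read.
\hyphenation{over-view}  % exception is stored for the current language

% Assumption (unverified here): the cumulative exception list is installed.
% \input ushyphex

\showhyphens{overview}   % reports the break points TeX finds in the log

\bye
```

Running the file once with and once without the \hyphenation line, and
comparing what \showhyphens writes to the log, shows whether the
exception has taken effect.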