[tex-hyphen] Evolving usage for UK hyphenation patterns

Arthur Reutenauer arthur.reutenauer at normalesup.org
Tue Mar 20 01:57:28 CET 2018

	Dear Dominik,

  I’m really glad that you brought up this issue, because it’s something
I’ve meant to contact you about for many years, without knowing how to

  The breakpoints recommended by dictionaries published by Oxford
University Press have indeed changed over the years, but the change is
much older than 2014: according to my research, it dates from the
mid-1990s -- ironically around the time you were doing, with Graham
Toal, the good work that led to the current British English
(pre-change) patterns.

  I have over the years amassed a number of OUP spelling dictionaries
(and still am not convinced I got the full picture):

  * The Oxford Spelling Dictionary, 1st ed., 1986 (hardbound)
  * The Oxford Minidictionary of Spelling and Word Division, 1st ed., 1986 (paperback)
  * The Oxford Spelling Dictionary, 1st ed., 1990 (paperback)
  * The Oxford Spelling Dictionary, 2nd ed., 1995 (hardbound)
  * The Oxford Colour Spelling Dictionary, 1996 (paperback)
  * The Oxford Minidictionary of Spelling, 2nd ed., 1997 (paperback)
  * The Oxford Spelling Dictionary, 3rd ed., 2005 (hardback)
  * The Oxford Spelling Dictionary, 4th ed., 2014 (hardback)

  To be excruciatingly clear, the last but one entry that I call “3rd
ed., 2005” has a colophon saying “Second edition 1996 / This edition
2005”, which I can only interpret as meaning it’s a third edition, and
likewise the last entry says in the colophon “Second edition published
in 1996 / This edition published in 2005 / Reissued in 2014” which makes
nearly no sense, so that I think it’s a fair interpretation to call it a
fourth edition.

  By the way, I am not aware of a 1990 edition of the Minidictionary; I
suspect the copy you have is a 1990 reprint of the 1986 edition; mine is
a 1992 reprint (this may be relevant when trying to determine the
chronology of hyphenation).  Also, I gave my first copy of the 2005
edition to the wife of the chairman of the Board of Trustees of the
Nobel Foundation five years ago (although this is not actually relevant
at all to the discussion at hand ;-)

  In any case, there is a clear break (pun intended) in hyphenation
practice introduced with the second edition of the spelling dictionary
in 1995.  The preface even says in its penultimate paragraph, quoting
verbatim and in full:

	The recommended word divisions shown have been completely
	revised in the light of modern practice and represent an attempt
	to find the most unobtrusive solutions.  They are based on a
	combination of etymological and phonological considerations,
	since overstrict adherence to either principle can result in
	misleading or inelegant divisions, such as auto-nomous and
	lung-ing or profi-teer and overwa-ter.

  While completely unconvinced by the examples given, I wholeheartedly
agree with the principles enunciated, and consider that, generally, the
“new” hyphenations do seem to make sense.  I would like to call them
“more American” in that they seem to favour phonological breaks a little
bit more, and even deviate from etymology to an almost shocking point
with classic American-style hyphenations such as “biog-raphy” (but

  The 1995 2nd edition spelling dictionary is reproduced exactly in
paperback as the 1996 “Colour Spelling Dictionary”, which I actually
think you have, because you mentioned it on the XeTeX list several
years ago, which prompted me to buy it (I’m almost certain it was you).  In
any case, since you do seem to have a copy of the 2014 edition, you
should be able to use it as reference because I can testify that all the
words mentioned by your correspondent are hyphenated exactly the same
way as in the 1995 edition (except for “indicates”, which is not present
in 2014 -- but still in 1995 -- however it seems impossible that it
would be hyphenated any differently from “indicate”).

> I'm not sure what - if anything - to do about this.

  I know: since the breakpoints have been essentially unchanged for over
twenty years, it seems to me that it would be very possible to obtain a
list of words from any point in this time span, and produce patterns
with patgen the same way you did then.  I am willing to do much of the
legwork, but would very much appreciate your assistance.
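
  As a rough illustration of the word-list preparation that would come
before running patgen, here is a minimal sketch in Python.  It assumes
the list is available as plain strings with breakpoints marked by
hyphens, and that patgen is to be fed one lowercase word per line; the
function name to_patgen_entries is hypothetical, and the sample words
are simply strings quoted earlier in this message, not a statement of
what any dictionary recommends.

```python
def to_patgen_entries(raw_words):
    """Normalize hyphenated words into sorted, de-duplicated entries."""
    entries = set()
    for word in raw_words:
        word = word.strip().lower()
        # Skip blanks and anything that is not letters plus hyphens.
        if not word or not word.replace("-", "").isalpha():
            continue
        # Drop empty pieces left by stray leading, trailing or doubled hyphens.
        parts = [part for part in word.split("-") if part]
        entries.add("-".join(parts))
    return sorted(entries)


if __name__ == "__main__":
    # Illustrative input only; real input would be the full word list.
    sample = ["Biog-raphy", "biog-raphy ", "auto-nomous", "", "3rd"]
    for entry in to_patgen_entries(sample):
        print(entry)
```

  The exact dictionary format patgen accepts also depends on the
translate file used, so the output of such a script would still need to
be checked against that.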



P-S: Independently from the above, I do think your correspondent is
quite confused, as many of the hyphenations he quotes (chiefly the
second half) are identical to entries in American dictionaries.
Compare https://www.merriam-webster.com/ for example.
