[tex-hyphen] language tag for Serbian (Serbo-Croatian?) patterns

Arthur Reutenauer arthur.reutenauer at normalesup.org
Wed Jul 20 18:04:58 CEST 2011


	Hi Jonathan,

  Two issues here:

  1. There are several sets of hyphenation patterns for the different
languages of the Serbo-Croat diasystem; specifically: for Croatian in
the Latin script; for Serbian in the Latin script; for Serbo-Croatian in
the Latin script; and for Serbo-Croatian in the Cyrillic script.  The
language names are taken from the pattern files themselves, as we didn't
want to change the way their authors described them.  The Cyrillic
Serbo-Croatian pattern set is the oldest, dating back to 1990, and is
still maintained by its original author, Dejan Muhamedagić.  All the
other pattern sets derive from it to some extent, apart from the
Croatian ones.  While it is true that we could consider the
"Serbo-Croatian" patterns as a (slightly) different set of patterns
covering the same language as the "Serbian" ones (whatever we choose to
call the language), we didn't want to rename the former because there
were already patterns labelled as Serbian when Mojca and I took over the
hyphenation patterns in 2008 (also, Dejan suggested we kept the name,
both for historical reasons and because the patterns could be useful
for, say, Bosnian, which didn't have dedicated patterns then -- and still
doesn't today).

  2. The "sh" language code is indeed deprecated in ISO 639-1 if I'm not
mistaken, but it is still a valid BCP 47 language subtag (in fact, it
was deprecated for a while, but has been "un-deprecated" in the
meantime).  Since we want to follow BCP 47 rather than ISO 639, we can
validly use it (none of the existing ISO 639 parts would be any good for
us, since we need a greater level of detail than individual languages; I
understand that ISO 639-6, if it came into existence, would cover our
needs, but I read that its development had stalled).
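
  As a rough illustration of the level of detail meant here, the
following is a minimal Python sketch, assuming tags of the simple form
"language-Script".  The tags in the table are hypothetical labels for
the four pattern sets listed under point 1, not necessarily the ones
used in the pattern files, and split_tag is a helper made up for this
example; it does not validate tags against the BCP 47 registry.

```python
# Hypothetical labels for the four pattern sets; the exact tags
# used by the pattern files are an assumption in this sketch.
PATTERN_SETS = {
    "Croatian, Latin script": "hr-Latn",
    "Serbian, Latin script": "sr-Latn",
    "Serbo-Croatian, Latin script": "sh-Latn",
    "Serbo-Croatian, Cyrillic script": "sh-Cyrl",
}


def split_tag(tag):
    """Split a simple 'language[-Script]' tag into (language, script).

    Only the first two subtags are considered; the language subtag is
    lowercased and the script subtag is title-cased, following the
    usual BCP 47 casing conventions.
    """
    parts = tag.split("-")
    language = parts[0].lower()
    script = parts[1].title() if len(parts) > 1 else None
    return language, script


for name, tag in PATTERN_SETS.items():
    language, script = split_tag(tag)
    print(f"{name}: language={language}, script={script}")
```

  The point of the sketch is only that a language subtag alone ("sh")
cannot tell the Latin and Cyrillic pattern sets apart, whereas a
language subtag combined with a script subtag can.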

	Arthur

