[tex-hyphen] Names of files in OFFO
Arthur Reutenauer
arthur.reutenauer at normalesup.org
Fri Mar 11 10:59:53 CET 2016
On Fri, Mar 11, 2016 at 02:03:24AM +0100, Mojca Miklavec wrote:
> The whole discussion is about how
> to best support derived projects.
The way I see it, it's more about how external projects can support
the full set of languages and language variants we offer.
> From my understanding OFFO is targeting hyphenation of web pages.
It's first and foremost a set of hyphenation patterns for Apache's
XSL-FO processor.
> I thought that perhaps tags like mul-Ethi or la-x-classic would not be
> supported (they probably wouldn't be in a POSIX locale), but according
> to the links below that's apparently not the case and *exactly* the
> same standard is being used:
>
> https://www.w3.org/International/questions/qa-html-language-declarations#langvalues
> https://www.w3.org/International/articles/language-tags/
Yes, of course, BCP 47 is a standard of the IETF, the Internet
Engineering Task Force. But that's not directly relevant; what's more
important is how XSL tags languages, and it seems it only supports ISO 639,
with country and script in separate fields (following ISO 3166-1 and
ISO 15924 respectively); even for languages it only supports ISO 639-1
and 639-2, not -3: https://www.w3.org/TR/xsl/#language
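
To make the mismatch concrete, here is a minimal sketch in Python of how a
BCP 47 tag would have to be taken apart to fit separate language, script and
country fields. The function name split_tag is hypothetical, the parsing is
deliberately simplified (it ignores extended language subtags and numeric
region codes), and it is not taken from OFFO or from any XSL processor.

```python
def split_tag(tag):
    """Split a BCP 47-style tag into (language, script, country, rest).

    Simplified illustration only: extended language subtags and
    three-digit region codes are not handled.
    """
    subtags = tag.split("-")
    language = subtags[0].lower()
    script = None
    country = None
    i = 1
    # A four-letter alphabetic subtag right after the language is a script.
    if i < len(subtags) and len(subtags[i]) == 4 and subtags[i].isalpha():
        script = subtags[i].title()
        i += 1
    # A two-letter alphabetic subtag in the next position is a country.
    if i < len(subtags) and len(subtags[i]) == 2 and subtags[i].isalpha():
        country = subtags[i].upper()
        i += 1
    # Whatever is left (variants, private use) has no field of its own.
    rest = subtags[i:]
    return language, script, country, rest


for example in ("mul-Ethi", "sr-Cyrl-RS", "la-x-classic"):
    print(example, "->", split_tag(example))
```

With this split, mul-Ethi still fits (language "mul", script "Ethi"), but
the private-use part of la-x-classic ends up in the leftover list, which is
exactly the information a three-field scheme has no place for.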
> Claudio: a colleague of mine recently registered three language
> subtags. You can check
> http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry
> You'll find entries like "bohoric", "dajnko", "metelko". Those tags
> are way less important than Classical Latin and they were successfully
> registered without any problems.
But the language we call "Classical Latin" here is not very well
defined and I can guarantee you there will be problems.
> (I'm sorry if I was a bit [too] harsh. That wasn't my intention, but I
> really wanted to encourage you to try to officially register a new
> subtag or alternatively at least help us write the justification /
> explanation / proposal. The next question might arise whether we also
> need a special ISO 3166 country code or subtag for the Roman Empire :)
That would be fun :-) but is outside the scope of ISO 3166 (part 3,
for formerly used names of countries, only defines codes for countries
that had been in part 1 and were deleted).
> (PS: I still wonder why we keep using all-lowercase names for a
> standard where capitalization matters; just to help confuse other
> users even more.)
No, it doesn't matter. BCP 47 is and has always been
case-insensitive; that's precisely why I chose to write the tags
all-lowercase. The different ISO standards it uses have different
conventions (lowercase for 639, uppercase for 3166, titlecase for
15924), but even those use case-insensitive matching as far as I know.
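
As an illustration, here is a small Python sketch of case-insensitive
comparison together with the conventional casing just described. The names
same_tag and conventional_case are hypothetical, and the casing rule is a
position-based approximation (two-letter subtags uppercase, four-letter
subtags titlecase, except at the start of the tag or after a singleton such
as "x"); it does not consult the subtag registry.

```python
def same_tag(a, b):
    """Compare two language tags ignoring case."""
    return a.lower() == b.lower()


def conventional_case(tag):
    """Return the tag in the customary mixed-case presentation.

    Approximation based on subtag length and position only.
    """
    result = []
    after_singleton = False
    for index, subtag in enumerate(tag.lower().split("-")):
        if index > 0 and not after_singleton and subtag.isalpha():
            if len(subtag) == 2:
                subtag = subtag.upper()   # country convention (ISO 3166)
            elif len(subtag) == 4:
                subtag = subtag.title()   # script convention (ISO 15924)
        if len(subtag) == 1:
            # Everything after a singleton stays lowercase.
            after_singleton = True
        result.append(subtag)
    return "-".join(result)


print(same_tag("mul-ethi", "MUL-Ethi"))
print(conventional_case("mul-ethi"))
print(conventional_case("la-x-classic"))
```

Both spellings of a tag compare equal, so the all-lowercase file names lose
nothing; the mixed-case form can be regenerated whenever it is wanted for
display.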
Best,
Arthur