[tex-hyphen] Names of files in OFFO
claudio.beccari at gmail.com
Sat Mar 12 22:36:14 CET 2016
OK. You say you drove me where you wanted. I am glad you did, but at the
same time I am glad that the hyphenation problem has been cleared up
and that the two sets of hyphenation patterns are so different that they
cannot be merged into one file. What their names are is completely
indifferent to me, except as far as maintenance is concerned, since when I
upload upgrades I have to know what to call them.
On 12/03/2016 21:51, Arthur Reutenauer wrote:
> On Sat, Mar 12, 2016 at 01:24:51PM +0100, Claudio Beccari wrote:
>> Well, the linguistic differences are minimal as well as the differences
>> between American and British are minimal
> Hurrah! I got you exactly where I wanted. Maieutics works after all :-)
> What you just replied to are of course the exact words of the email I sent on
> Thursday evening, up to a "that", an "and", and a comma. You reacted quite
> strongly to that, and I'm sorry that I offended you; that wasn't my intention.
> I'm also sorry that I seemed to ignore many of the interesting comments you
> made in the emails you sent in the meantime, but I had to stay focused and
> drive my point home. I'll now come back to them.
> On Thursday evening I was trying to point out that there were several layers
> in the way Latin is currently supported in TeX, and you're absolutely right
> that there is a similar situation for English, except that for English it's
> even more complicated. I'll try to explain the situation in reply to Barbara's
> email, but for the time being I have to stick to Latin; there is already a lot
> to say.
> By now we have very well established that the sets of hyphenation patterns
> you created are best referred to as "phonetical" and "etymological" rather than
> "modern or medieval" and "classical". It's also essential to point out that
> they have been created with specific use cases in mind, the three I
> described in my email from Friday 13:24 UTC.
> The latter point is important because it defines the real-life scenarios that
> are supported by the current setup. It would be insane to try and support all
> possible combinations of the different options that are available.
> With that in mind, what should we call the different options we have?
> The naming scheme we use for hyphenation patterns is the IETF's BCP 47
> standard (https://tools.ietf.org/html/bcp47), which is both strictly defined and
> flexible enough to distinguish between all the different variants we need to
> label: we can use codes for languages (according to ISO 639), writing systems,
> also known as scripts (ISO 15924), and countries (ISO 3166). There are also a
> number of specially defined subtags to distinguish other variants, for example
> the different types of German spellings, or the polytonic and monotonic
> orthographies of Modern Greek (the full list of all subtags is maintained at
> http://www.iana.org/assignments/language-subtag-registry). Finally, it also
> allows us to define private subtags (prefixed by 'x'), like we have done for
> your newest Latin hyphenation patterns.
> I think that the two sets of patterns we have should actually be tagged using
> private subtags within a namespace, for example la-x-hyphtex-phonetic and
> la-x-hyphtex-etymological. The shaping engine HarfBuzz uses the same approach.
> Of course, we need to retain the simple tag "la" for the former pattern set, so
> we should ideally have an aliasing system, which is not available at the
> moment, so I'm not suggesting we make any change now.
> The actual language variants (classical, medieval, modern) are another
> problem. On the face of it, they're three successive stages of the evolution
> of Latin, differing in pronunciation, vocabulary, morphology, syntax. There
> is, however, a major complication because they're only known through their
> written form and are hard to define. In practice, the only issues that matter
> for typography are differences in spelling (u/v, i/j, etc.). It thus seems that
> in order to name these different variants it would be best to actually stick to
> the orthographical features that define them, rather than use chronological
> qualifiers such as "classical" or "modern". I don't actually have a concrete
> solution to this problem at the moment; however, the discussions I've had to
> attempt to classify the variants of Latin for typesetting have led me to
> believe that this is the best approach.
> This of course doesn't change the fact that the end user should
> actually only see simple, "top-level" options such as "classical" or "modern".
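
The namespaced private-use tags and the aliasing idea in the quoted message can be sketched roughly as follows. This is only an illustration under assumptions: the alias table and the helper names split_private_use and resolve are hypothetical and do not exist in any TeX hyphenation tooling, and the code does not check that the tags are well-formed BCP 47.

```python
# Hypothetical alias table: the bare tag "la" stands for the phonetic
# pattern set, as suggested in the quoted message. Nothing like this is
# currently available; it is an assumption made for illustration.
ALIASES = {"la": "la-x-hyphtex-phonetic"}


def split_private_use(tag):
    """Split a tag at the first 'x' subtag.

    Returns (public part, list of private-use subtags). No validation
    of the subtags is attempted.
    """
    subtags = tag.lower().split("-")
    if "x" in subtags:
        i = subtags.index("x")
        return "-".join(subtags[:i]), subtags[i + 1:]
    return "-".join(subtags), []


def resolve(tag, aliases=ALIASES):
    """Map a simple tag to its full form; unknown tags pass through lowercased."""
    key = tag.lower()
    return aliases.get(key, key)


if __name__ == "__main__":
    for t in ("la", "la-x-hyphtex-etymological"):
        full = resolve(t)
        print(t, "->", full, split_private_use(full))
```

With such a table the end user could keep writing "la" while the pattern files carry the longer, namespaced names.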