[tex-hyphen] Names of files in OFFO

Barbara Beeton bnb at ams.org
Thu Mar 17 20:01:36 CET 2016


On Thu, 17 Mar 2016, Arthur Reutenauer wrote:

    On Thu, Mar 17, 2016 at 02:20:02PM -0400, Barbara Beeton wrote:
    > okay.  then there *are* two entries for
    > every possibility

      Yes, indeed.  I never said anything to the contrary.  There can even
    be more than two entries if there are several characters with oxia-tonos
    in the pattern (rare).  We do the same for some variants of sigma
    (generating even non-final sigmas in final position, strange).

      I did moot the idea of enhancing the hyphenation algorithm with an
    equivalence table (I came across several use cases that would benefit
    from that), but someone would need to work on it, of course.
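
Illustration of the point above about multiple entries per pattern: a minimal
sketch, not the actual tex-hyphen generator. It assumes a pattern is a plain
string and that only the seven lowercase vowels listed are affected; the names
TONOS_TO_OXIA and expand_variants are hypothetical. Each accented character
can appear in either encoding, so a pattern with n such characters yields 2**n
entries.

```python
from itertools import product

# Assumed, partial mapping: tonos code point (block starting at U+0370)
# to the corresponding oxia code point (U+1F00 range).
TONOS_TO_OXIA = {
    "\u03ac": "\u1f71",  # alpha
    "\u03ad": "\u1f73",  # epsilon
    "\u03ae": "\u1f75",  # eta
    "\u03af": "\u1f77",  # iota
    "\u03cc": "\u1f79",  # omicron
    "\u03cd": "\u1f7b",  # upsilon
    "\u03ce": "\u1f7d",  # omega
}


def expand_variants(pattern):
    """Return every tonos/oxia spelling of a pattern string."""
    # For each character, list the encodings it may take.
    choices = [
        (ch, TONOS_TO_OXIA[ch]) if ch in TONOS_TO_OXIA else (ch,)
        for ch in pattern
    ]
    # The Cartesian product enumerates all combinations.
    return ["".join(combo) for combo in product(*choices)]
```

An equivalence table inside the hyphenation algorithm, as mooted above, would
make this expansion unnecessary, since one entry could then match both
encodings.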

there's always more work for "someone".

    >                    although only the
    > ones with oxia would be needed for
    > "properly encoded" classical greek

      Sorry, no.  Ancient Greek is no more "properly encoded" using the
    characters in the U+1F00 range than with the ones in the block starting
    at U+0370.  According to everything I've heard on the subject, encoding
    two series of characters for that diacritic, whatever we call it, was
    simply a mistake and they should really be considered completely
    equivalent (that's why they're canonically equivalent in Unicode, of
    course).
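
The canonical equivalence mentioned above can be checked with Python's
standard unicodedata module. This is only a sketch using alpha as the example
character; the helper name canonically_equivalent is an assumption made for
illustration.

```python
import unicodedata

ALPHA_TONOS = "\u03ac"  # GREEK SMALL LETTER ALPHA WITH TONOS
ALPHA_OXIA = "\u1f71"   # GREEK SMALL LETTER ALPHA WITH OXIA


def canonically_equivalent(a, b):
    """True if two strings have the same canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Both characters decompose to alpha followed by a combining acute accent.
print(canonically_equivalent(ALPHA_TONOS, ALPHA_OXIA))
# NFC maps the oxia form onto the tonos form.
print(unicodedata.normalize("NFC", ALPHA_OXIA) == ALPHA_TONOS)
```

Normalizing input text before pattern matching is one way to make the two
series interchangeable, at the cost of altering the text being hyphenated.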

hmmm.  okay.  (i believe i can find some
instances of purported "canonical equivalence"
for math characters that really aren't, but
that's a project for another day.)  the
unicode guys do a great job, but they have
some blind spots.  (probably less so for
languages than for math, but still ...  and
there's still the problem of putting diacritics
properly on math symbols; even markup with
mathml doesn't solve all problems.)

    > i've had it bookmarked for over a week,
    > ever since i got an inquiry regarding
    > the source of several symbols in the
    > "miscellaneous symbols" block.  i'll
    > go back and reread the discussion.

      If you can find it, I'm a little curious :-)

to satisfy your curiosity, i think this
is probably the best place to join the
thread:

  http://unicode.org/pipermail/unicode/2016-March/003462.html

the threading is good on this topic, and
both "previous" and "next" pointers work
well as the subject text keeps changing.
(i suspect you really don't care about
unicode code points for go counters, but
that's what launched the subsequent questions
about gaps in blocks and where to find
equivalence information.  nobody has brought up
a problem that i think still exists,
namely that one of the gaps alluded to
is one position too "short" so that an
attempt to identify equivalences is
doomed to failure.)
					-- bb


