[tex-hyphen] Names of files in OFFO
claudio.beccari at gmail.com
Thu Mar 17 19:58:56 CET 2016
The example of Greek is a good one, but as far as the TeX system is
concerned it is a bad one.
When Unicode/UTF-8 engines are used, the Unicode-encoded patterns are
available because Apostolos Syropoulos created them several years ago
(since these engines have been available), and I suppose they are OK.
At the moment the pattern files for 8-bit engines (in practice pdftex
and Knuthian tex) with LGR-encoded Greek fonts deal only with the Latin
transliteration and do not deal with directly input, UTF-8 encoded Greek
text. I prepared the necessary extensions to cope with the LICR encoding
created by Günter Milde, the current maintainer of the pdftex+babel
related files (greek.ldf, textalpha.sty, alphabeta.sty, and several
other ones), and with the direct UTF-8 input of the three varieties of
Greek: monotoniko, polytoniko, ancient. About 18 months ago I sent the
new pattern files to some Greek TeXies for the necessary checks, but
up to now I have not received any feedback.
Tonos is the only accent used in monotoniko. It generally has the
same shape as an acute/oxia, but in some instances it is an
"unslanted acute", a straight stroke over the vowel. But Unicode does not
deal directly with the shapes of single glyphs; it deals with the names
and gives a sample shape in order to make clear what the name refers to.
Obviously the tonos and the oxia may be identical in shape in most
fonts, but in some others they are different; and this may be so
both in combining glyphs and in preaccented ones. Unicode has to
deal with them as two distinct glyphs.
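
To make the distinction concrete, here is a minimal sketch using Python's
standard unicodedata module. It only illustrates how the two preaccented
alphas are named and how they relate under normalization; the variable
names are mine, and nothing here is taken from the pattern files. Note
that the two code points are distinct, yet canonical normalization (NFC)
folds the oxia form onto the tonos form.

```python
import unicodedata

# Two preaccented small alphas: one from the Greek block, one from
# the Greek Extended block.
TONOS_ALPHA = "\u03ac"
OXIA_ALPHA = "\u1f71"

for ch in (TONOS_ALPHA, OXIA_ALPHA):
    # Print the code point and the character name recorded by Unicode.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# As raw input the two characters are different code points...
distinct = TONOS_ALPHA != OXIA_ALPHA

# ...but U+1F71 has a canonical decomposition to U+03AC, so NFC
# normalization maps the oxia form onto the tonos form.
same_after_nfc = unicodedata.normalize("NFC", OXIA_ALPHA) == TONOS_ALPHA
```

An engine that compares input to patterns code point by code point, without
normalizing first, therefore sees two different letters.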
Perhaps the hyphenation patterns for polytoniko may be considered a
superset of those for monotoniko, but the patterns for ancient Greek are
different, not only because there is a different lexicon, but also
because the hyphenation rules for ancient Greek are more etymological
than for modern Greek.
Therefore we have a situation similar to the one we discussed not long
ago for modern, medieval, classical, and ecclesiastical Latin.
On 17/03/2016 19:20, Barbara Beeton wrote:
> On Thu, Mar 17, 2016 at 01:55:27PM -0400, Barbara Beeton wrote:
> > that's all very well, and i understand
> > how *unicode* works. what i'd really
> > like to see is how this equivalence
> > is determined in a (la)tex source file.
> In the case of Greek hyphenation, by making as many copies of the
> patterns containing an oxia-tonos as is necessary. That's very
> pedestrian, but works; it's done by a script, of course.
> okay. then there *are* two entries for
> every possibility (although only the
> ones with oxia would be needed for
> "properly encoded" classical greek).
> > there has been
> > a discussion on the unicode discussion
> > list to the effect that the NamesList
> > file should *not* be used for this
> > sort of analysis.
> Well, the authoritative data is UnicodeData.txt, and it's just as easy
> to parse (easier, in fact), so that's what should be used. Do you have
> a pointer to the discussion?
> i've had it bookmarked for over a week,
> ever since i got an inquiry regarding
> the source of several symbols in the
> "miscellaneous symbols" block. i'll
> go back and reread the discussion.
> -- bb
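
The duplication of patterns mentioned in the quoted exchange can be
sketched as follows. This is not the script actually used for the Greek
patterns; it is a hypothetical illustration in Python, and the function
names and the sample patterns are invented. It derives the oxia/tonos
pairs from the singleton canonical decompositions in Python's bundled
Unicode database (the same data published in UnicodeData.txt) and, as a
simplification, emits only an all-tonos and an all-oxia copy of each
pattern rather than every mixed combination.

```python
import unicodedata


def oxia_to_tonos_map():
    """Map each Greek Extended letter 'with oxia' to its tonos twin."""
    mapping = {}
    for cp in range(0x1F00, 0x2000):  # Greek Extended block
        ch = chr(cp)
        name = unicodedata.name(ch, "")
        decomp = unicodedata.decomposition(ch)
        # Keep letters whose decomposition is a single code point
        # (no space, no <compatibility> tag).
        is_singleton = bool(decomp) and " " not in decomp \
            and not decomp.startswith("<")
        if "OXIA" in name and ch.isalpha() and is_singleton:
            mapping[ch] = chr(int(decomp, 16))
    return mapping


def with_both_accents(patterns):
    """Return the patterns plus their tonos and oxia variants."""
    pairs = oxia_to_tonos_map()
    to_tonos = str.maketrans(pairs)
    to_oxia = str.maketrans({v: k for k, v in pairs.items()})
    out = []
    for p in patterns:
        for variant in (p, p.translate(to_tonos), p.translate(to_oxia)):
            if variant not in out:  # avoid emitting duplicates
                out.append(variant)
    return out


# Hypothetical sample patterns, not taken from any real pattern file.
expanded = with_both_accents(["1\u03ac", "1\u03b1"])
```

A pattern without an accented vowel passes through unchanged, so only the
patterns that actually contain an oxia or tonos are doubled.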