[tex-hyphen] More on hyphenating Ancient Greek.
claudio.beccari at gmail.com
Thu Nov 13 11:24:50 CET 2014
The problem is not LaTeX, but the program used for transforming XML
source into LaTeX code.
The additional characters and the scholarly emendations are dealt with
by LaTeX by means of the teubner package. As far as I can tell, the
teubner macros do not interfere with hyphenation any more or less than
any other macro, in the sense that any macro may interfere with
hyphenation when the text sent to the hyphenation algorithm still
contains unexpandable tokens. With Latin script, for example, and the
default OT1 font encoding, writing \`a (or even à when using the suitable
option to the inputenc package) remains as \accent18a when the text is
sent to the hyphenation algorithm. This algorithm considers as a valid
word only something made up of character tokens with a positive lccode:
\accent, 1, and 8 do not have a positive lccode, therefore the "LaTeX
word" stops before the accented letter, and the rest of the word string
is discarded for hyphenation until a new valid word start is encountered.
This does not depend on which typesetting engine is used (pdftex,
xetex, etc.); it depends on the hyphenation algorithm, explained in
Appendix H of the TeXbook.
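A minimal way to observe this, assuming a plain pdfLaTeX run with the
default OT1 encoding, is to ask TeX to report the hyphenation points it
finds. The two test words are only illustrative, and the commented-out
fontenc line is an assumption added for comparison: in T1 the accented
letters are single glyphs rather than \accent constructions.

```latex
\documentclass{article}
% Uncomment the next line to compare with an 8-bit encoding in which
% accented letters are single glyphs instead of \accent constructions.
% \usepackage[T1]{fontenc}
\begin{document}
% \showhyphens typesets nothing; it writes the hyphenation points that
% TeX finds for its argument to the terminal and the log file.
\showhyphens{pr\'ef\'erence universit\`a}
\end{document}
```

The hyphenation points show up in the log as part of an underfull \hbox
message; comparing the two runs shows how much of each word the
algorithm actually examined.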
As for Greek, your problem probably persists even if you use OpenType
fonts instead of the LGR-encoded ones; with the latter, the round and
angle brackets are mapped to other characters and interfere with
hyphenation. With OpenType fonts it is possible that, after assigning a
positive \lccode to the round and angle brackets, hyphenation is still
possible, but with unexpected results.
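As an experiment only, not a recommended fix, the \lccode idea could be
tried along the following lines. This sketch assumes XeLaTeX with its
default OpenType font and uses made-up Latin test words, so that it does
not depend on any particular Greek font or pattern set. Depending on the
engine and on how the format loaded its patterns (for instance if
e-TeX's \savinghyphcodes was in effect at that time), the assignments may
have no visible effect at all.

```latex
% Experimental sketch: compile with XeLaTeX and inspect the log.
\documentclass{article}
\begin{document}
\begingroup
  % Give each bracket a positive \lccode (here simply its own character
  % code), so that it no longer ends the word seen by the hyphenator.
  % The assignments are local to this group.
  \lccode`\(=`\(  \lccode`\)=`\)
  \lccode`\<=`\<  \lccode`\>=`\>
  % Illustrative test words with interpolated brackets.
  \showhyphens{inter(pola)tion supple<men>tary}
\endgroup
\end{document}
```

Since the patterns know nothing about brackets, any break points
reported this way should be checked by eye before being trusted.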
On 13/11/2014 10:39, Philip Taylor wrote:
> In the work in progress, various stretches of ancient Greek text have
> additional characters interpolated into them to indicate (the nature
> of) scholarly emendations made. For example, the XML input :
> <Other_Notes>f.≈Br:<image status="active" source="L40.2-G5-[B1]"
> callout="Other_Notes"></image> “<foreign language="Greek">Σωσον
> Κ<expan>ύρι</expan>ε τῶν λα<expan>ον</expan> σου καὶ ευλογησον τὴν
> κλ<supplied>η</supplied>ρονομια<supplied>ν</supplied> σου νίκας τῆς
> βα<supplied>σιλεύσι</supplied></foreign>”; ... </Other_Notes>
> will, after TeX's macro expansion, yield (in part) :
> f.(kern)Br: Σωσον Κ(ύρι)ε τῶν λα(ον) σου καὶ ευλογησον τὴν
> κλ<η>ρονομια<ν> ...
> Empirically, it would seem that the presence of the interpolated round
> and angle brackets affects TeX's ability to hyphenate such stretches
> of text; could the hyphenation experts suggest a work-around, please ?
> ** Phil.