[XeTeX] hyphenating words with a hyphen

Alexey Kryukov anagnost at yandex.ru
Sat Jan 31 19:08:11 CET 2009


Hi,

I would like to raise a question already asked here about two years
ago (see http://www.tug.org/pipermail/xetex/2006-November/005435.html).
As everybody knows, to get words containing a hyphen hyphenated, one
has to assign a non-zero lccode to the hyphen-minus character and set
the \hyphenchar primitive to another (alternate) character code.
However, this solution has an undesired side effect: if a hyphenation
break occurs exactly where there is already a hyphen, TeX breaks the
word and adds one more hyphen character. In standard (pdf)LaTeX this
effect is worked around by adding a pseudo-ligature (hyphen-minus +
hyphenchar -> hyphenchar) to fonts which support an alternate
hyphenchar glyph. AFAIK, XeTeX currently doesn't handle this situation
in a reasonable way, which is a pity.
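
The setup described above can be sketched in plain XeTeX roughly as
follows. This is only an illustration, not a tested recipe: the font
file name is an assumption, and the font is assumed to contain a glyph
for U+2010 HYPHEN to serve as the alternate hyphenchar.

```tex
% Plain XeTeX sketch (compile with `xetex`).
% Assumption: this font is installed and has a glyph at U+2010.
\font\body="[lmroman10-regular.otf]" at 10pt
\body

% A non-zero lccode makes TeX treat the hyphen-minus as a letter,
% so a compound word is hyphenated as a single word.
\lccode`\-=`\-

% Use U+2010 instead of the hyphen-minus for inserted break hyphens.
\hyphenchar\body="2010

\hsize=3cm % narrow measure to provoke hyphenation
A sentence with several hyphen-containing compound-word examples.
\bye
```

With such a setup, a break that the patterns place right after the
explicit hyphen-minus is where the doubled hyphen can appear.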

The most obvious solution seems to be adding the following mappings
to tex-text:

U+002D + U+2010 => U+2010
U+002D + U+00AD => U+00AD

Of course, changes at the engine level might also be reasonable:

1) XeTeX might allow breaks after a hyphen-minus, no matter what its
lccode is, as it currently does for the em dash and en dash if
\XeTeXdashbreakstate is enabled;

2) or it might allow breaking compound words at places other than
the hyphen by default, thus making an alternate hyphenchar
unnecessary. This behavior might be toggled, e.g., by setting
\XeTeXlinebreaklocale to a value different from "en".
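
For reference, the two existing primitives mentioned in these proposals
are set as shown below. The fragment only illustrates their current
syntax; the behavior suggested in 1) and 2) is a proposal and is not
something these settings provide today.

```tex
% Existing XeTeX primitives, current syntax only.
\XeTeXdashbreakstate=1      % allow line breaks after en and em dashes
\XeTeXlinebreaklocale "en"  % locale used for XeTeX's line-break rules
```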

-- 
Regards,
Alexey Kryukov <anagnost at yandex dot ru>

Moscow State University
Historical Faculty

