[XeTeX] How do mapping files affect hyphenation?

Zdenek Wagner zdenek.wagner at gmail.com
Fri Feb 24 14:32:48 CET 2012

2012/2/24 Ulrike Fischer <news3 at nililand.de>:
> On Fri, 24 Feb 2012 13:40:53 +0100, Zdenek Wagner wrote:
>>> From an old discussion on the mailing list
>>> (http://tug.org/mailman/htdig/xetex/2005-November/002842.html) I got
>>> the impression that mappings are "invisible" to the hyphenation
>>> routine and so I would have expected the transliterated text to be
>>> hyphenated according to the original Russian rules but actually it is
>>> not hyphenated at all:
>> Do the Russian hyphenation patterns contain patterns for the Latin
>> script?
> Well I would say, obviously not.
> But my question is if "...font mappings (unlike traditional
> TFM-based ligatures) are completely invisible to TeX's line-breaking
> process", how come the hyphenation routine *knows* that chars
> from a Latin script are involved? Why doesn't it handle the
> hyphenation with the input chars?
See my previous message. The characters are transliterated when they
enter the horizontal list, and the resulting horizontal list is sent to
the paragraph-breaking algorithm. At that point TeX sees only the Latin
characters and does not know that they were entered as Cyrillic.

One unrelated note: since the mapping is applied after macro
expansion, you can store the text in a macro and typeset it twice,
once as entered and once transliterated. The following will therefore
work:

\def\sometext{Москва — столица Российской Федерации, город
федерального значения,
административный центр Центрального федерального округа и центр
Московской области, в состав которой не входит. Крупнейший по
численности населения город России и Европы (население на 1 января
2012 года — 11 629 116 человек), по этому показателю входит в
десятку крупнейших городов мира. Центр Московской городской


\transrussian \sometext
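
For completeness, here is a minimal self-contained sketch of the same
idea with a shorter text. It is only an illustration under a few
assumptions: the font name (DejaVu Serif) stands for any installed font
that has both Cyrillic and Latin glyphs, the mapping name cyr-lat-iso9
assumes that a TECkit mapping of that name is installed on your system,
and the \transrussian definition is my guess at how such a font switch
could be set up with fontspec, not necessarily the one used earlier in
this thread.

```latex
% Compile with xelatex. Font and mapping names are assumptions.
\documentclass{article}
\usepackage{fontspec}

% Assumed: any font covering both Cyrillic and Latin will do.
\setmainfont{DejaVu Serif}

% Assumed: a TECkit mapping called cyr-lat-iso9 is available.
% The mapping belongs to the font, so it takes effect only when
% the characters are typeset, i.e. after macro expansion.
\newfontfamily\transrussian[Mapping=cyr-lat-iso9]{DejaVu Serif}

\begin{document}

% The macro stores the Cyrillic input unchanged.
\def\sometext{Москва}

% First pass: typeset as entered (Cyrillic).
\sometext

% Second pass: the same macro, typeset through the mapped font.
{\transrussian \sometext}

\end{document}
```

The braces around the second use keep the font switch local, so text
after it goes back to the unmapped font.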

