[XeTeX] hyphenation in Ethiopian languages

Mojca Miklavec mojca.miklavec.lists at gmail.com
Fri May 6 19:03:25 CEST 2011

On Thu, Nov 4, 2010 at 14:42, Adam McCollum wrote:
> Dear list members,
> I've recently drawn up a short document in Ge`ez (classical Ethiopic) using
> Polyglossia and I see that the hyphenation is wrong. As some of you know,
> languages that use the Ethiopic script, including Ge`ez and Amharic, place a
> word divider—it looks somewhat like a thick colon—between each word and two
> of these dividers side by side between sentences; see some Amharic examples
> here. That being the case, a word may be broken at any syllable (the script
> is a syllabary, not an alphabet) at the end of a line, but there is nothing
> corresponding to a hyphen. An additional matter of importance is that no
> line should begin with the single or double word divider. How should this be
> fixed?

Dear Adam,

We submitted Ethiopic hyphenation patterns to CTAN (and TL) a
while ago, so once you update your TeX Live, it should work out of the
box.

However, there is a nasty limitation in XeTeX: words hyphenate only up
to 64 characters, so unless somebody fixes XeTeX, you need other
tricks and workarounds. The code below inserts a breakable space
after every word separator (and thus allows XeTeX to start breaking
the next word from scratch). In addition to that, you also need to make
sure that:
- there is no hyphenation character at the end of a line
- lines are properly aligned
- you might want (or not) some extra space around word and sentence delimiters

Together with Arthur we created the following working example, but it
would be great if François would include some of that code into

If you want to have space around word delimiters, you need to create
some non-breakable space in front of the delimiter and some breakable
space after the delimiter. The amount of space might need to be
configurable. My estimates (0.4 +/- 0.1 em) might not be the best
ones, so feel free to adjust them to the most suitable values. Apart
from that, you might want to have both spaces of equal size (I wasn't
sure how to achieve that).
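
One possible way to make the amount configurable is to keep it in a
dimen register. This is only a sketch: \ethisep is a hypothetical name,
the \ethiletter and \ethispace classes are assumed to be defined as in
the example below, and the \bgroup/\egroup pair of that example is left
out here for brevity.

    % \ethisep is a made-up name; set it while the Ethiopic font is
    % current, because em is measured in the font active at this point
    \newdimen\ethisep
    \ethisep=.4em
    % fixed, non-breakable space in front of the delimiter
    \XeTeXinterchartoks \ethiletter \ethispace = {\kern\ethisep}
    % breakable, slightly stretchable space after the delimiter
    \XeTeXinterchartoks \ethispace \ethiletter = {\hskip\ethisep plus .1em minus .1em}

Both spaces then start from the same natural width, but only the one
after the delimiter can stretch or shrink, so they are equal only on
lines that need no adjustment.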

\newfontfamily\amharicfont[Script = Ethiopic, Scale = 1.3]{Abyssinica SIL}

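\newXeTeXintercharclass \ethiletter
\newXeTeXintercharclass \ethispace

% Assumption: letters are the syllables U+1200..U+135A and the
% delimiters are the punctuation marks U+1361..U+1368
\newcount\tmp
\tmp="1200
\loop
    \XeTeXcharclass\tmp=\ethiletter
    \advance\tmp by 1
\ifnum\tmp<"135B \repeat
\tmp="1361
\loop
    \XeTeXcharclass\tmp=\ethispace
    \advance\tmp by 1
\ifnum\tmp<"1369 \repeat
\XeTeXinterchartokenstate=1
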
\XeTeXinterchartoks \ethispace \ethiletter = {\egroup\hskip.4em plus .1em minus .1em}
\XeTeXinterchartoks \ethiletter \ethispace = {\kern.4em\bgroup}

\title{Sample in Gǝ`ǝz}

% \hsize=8cm




Please let us know if that works the way you want it to work. If you
need a LuaTeX solution, please let us know as well.


PS: You could also simply use
    \XeTeXinterchartoks \ethiletter \ethiletter = {\hskip0pt}
and thus avoid the need for any hyphenation patterns at all.
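
A minimal sketch of that variant follows. It rests on assumptions that
are mine, not tested against your document: the syllables are taken to
be U+1200..U+135A, only the word space (U+1361) and the full stop
(U+1362) are treated as delimiters, and \tmp is just a scratch counter.

    \documentclass{article}
    \usepackage{fontspec}
    \setmainfont[Script = Ethiopic]{Abyssinica SIL}

    \newXeTeXintercharclass \ethiletter
    \newXeTeXintercharclass \ethispace

    % assumed range of Ethiopic syllables: U+1200..U+135A
    \newcount\tmp
    \tmp="1200
    \loop
        \XeTeXcharclass\tmp=\ethiletter
        \advance\tmp by 1
    \ifnum\tmp<"135B \repeat
    % word space and full stop
    \XeTeXcharclass"1361=\ethispace
    \XeTeXcharclass"1362=\ethispace

    % zero-width glue is a legal break point and prints no hyphen
    \XeTeXinterchartoks \ethiletter \ethiletter = {\hskip0pt}
    % allow a break after a delimiter; nothing is inserted in front
    % of it, so a line cannot start with one
    \XeTeXinterchartoks \ethispace \ethiletter = {\hskip0pt}
    \XeTeXinterchartokenstate=1

    \begin{document}
    % Ethiopic text goes here
    \end{document}

Note that this glue has no stretch, so for properly aligned lines you
would still need some stretchable space, for example after the
delimiters as in the main example.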
