[XeTeX] Turning off polyglossia's "bidi" algorithm.

Khaled Hosny khaledhosny at eglug.org
Sun Jan 12 09:48:15 CET 2014


On Fri, Jan 10, 2014 at 05:43:10PM -0500, C. Scott Ananian wrote:
> Now both final lines display as "24) April .(2008". That is, the
> parentheses and period have been reordered, even when I explicitly
> request LTR mode.  What's going on here?  How do I turn this
> (mis)feature off?

That seems to be an effect of using an Arabic (Script=Arabic) font,
though I can’t immediately explain why it is happening. Here is a
minimal LaTeX file that demonstrates it:

\documentclass{minimal}
\usepackage{fontspec}
\begin{document}
\fontspec[Script=Arabic]{Amiri}
Williams، Richard (24 April 2008).
\end{document}

In your polyglossia example, this is a result of calling
\defaultlanguage{arabic}, which will use \arabicfont as the main document
font (which is not something you want for non-Arabic text anyway; notice
the different period you get with Amiri when the script is set to Arabic).
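
If the document is mostly non-Arabic, one way to avoid this is to keep a
non-Arabic main language and switch to Arabic only where needed. A
minimal, untested sketch, assuming Amiri is installed and using
“english” as a stand-in for whatever your main language is (the Arabic
word is just sample text):

\documentclass{article}
\usepackage{polyglossia}
\setmainlanguage{english}   % assumption: the main text is not Arabic
\setotherlanguage{arabic}
% \arabicfont is then used only for text marked as Arabic
\newfontfamily\arabicfont[Script=Arabic]{Amiri}
\begin{document}
Williams, Richard (24 April 2008). \textarabic{مرحبا}
\end{document}

The main text is then set in the default font without Script=Arabic, so
the parentheses there should not be affected.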

An even more minimal plain TeX file also demonstrates this, so it is an
engine issue:

\font\amiri="Amiri"\amiri 
Williams، Richard (24 April 2008). 

\font\amiri="Amiri:script=arab"\amiri
Williams، Richard (24 April 2008).
\bye

My wild guess is that, based on the “arab” script, we consider runs like
“(24” and “2008)” that do not have any characters with strong
directionality to have a right-to-left base direction, so the
parentheses end up moved to the other side.

Minutes later: I checked the code and this is indeed the case. I don’t
know why this was done, but it has been like that since the first XeTeX
commit I can track.

One ugly workaround for this specific case is to use no-break spaces
(U+00A0) inside the parentheses, so the whole group ends up processed as
one “word”, but of course you lose the ability to break the line there.

\font\amiri="Amiri:script=arab"\amiri
% the two spaces inside the parentheses must be U+00A0 (NO-BREAK SPACE),
% not ordinary spaces
Williams، Richard (24 April 2008).
\bye

Regards,
Khaled

