[XeTeX] Khmer: ligatures break if XeTeXlinebreaklocale is turned on

Jonathan Kew jfkthame at gmail.com
Tue Apr 10 21:02:15 CEST 2018


On 10/04/2018 16:55, Jo Hund wrote:
> Hi there,
> 
> I was told by David Carlisle on https://tex.stackexchange.com that the 
> xetex mailing list may be a good place to ask my question:
> 
> When generating documents in Khmer, we noticed that some ligatures do 
> not work. We found that turning off XeTeXlinebreaklocale fixes all 
> ligatures, however this causes problems with linebreaks, and lines 
> extend past the right margin if there aren't any zero width spaces 
> between words. (In Khmer words within the same sentence or phrase are 
> generally run together with no spaces between them.)
> 
> Our objective is to have all ligatures work, and to use 
> XeTeXlinebreaklocale at the same time.
> 
> Please note that we are not looking for a manual workaround for this 
> particular example. We are looking for a fix to the root cause that we 
> can then use in our automated system where we cannot fix individual 
> instances of this problem.

The problem arises because activating XeTeXlinebreaklocale effectively
makes xetex insert something like \penalty0 or \hskip0pt (depending on
the settings of \XeTeXlinebreakpenalty and \XeTeXlinebreakskip) at each
potential line-break position, so that the normal TeX line-breaking
algorithm will be able to find and use these breaks.

But the inserted penalty and/or skip interrupts the sequence of 
characters that is passed to the OpenType shaping engine, and so 
features like ligatures will not work across the boundary.
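
For reference, the settings involved look roughly like the fragment
below. This is only a sketch: the locale string "km" and the glue value
are assumptions for illustration, not taken from the original document.

```latex
% Sketch only: the locale string and the skip value are assumed for illustration.
\XeTeXlinebreaklocale "km"           % turn on locale-aware break detection
\XeTeXlinebreakskip = 0pt plus 1pt   % glue used at each potential break position
\XeTeXlinebreakpenalty = 0           % penalty used at each potential break position
```

Whatever xetex inserts at these positions is what splits the character
run seen by the shaper.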

A possible workaround would be to set \XeTeXinterwordspaceshaping=2 in 
your document. This will cause xetex to re-shape runs of text after 
line-breaking, and at this point your ligatures should work.
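
A minimal test file for trying the workaround might look like the
following. It is a sketch under assumptions: the font name "Khmer OS" is
a placeholder for whatever Khmer OpenType font is installed, the locale
string "km" and the glue value are illustrative, and the sample phrase
is not the text from the original question.

```latex
% Compile with xelatex.
% Assumptions: the font name, locale string and sample text are placeholders.
\documentclass{article}
\usepackage{fontspec}
\setmainfont{Khmer OS}[Script=Khmer]  % substitute an installed Khmer font

\XeTeXlinebreaklocale "km"            % locale-aware line-break positions
\XeTeXlinebreakskip = 0pt plus 1pt    % allow a little stretch at those positions
\XeTeXinterwordspaceshaping = 2       % re-shape text runs after line-breaking

\begin{document}
ខ្ញុំស្រឡាញ់ភាសាខ្មែរ
\end{document}
```

Comparing the output with and without the \XeTeXinterwordspaceshaping
line should show whether the ligatures in question are restored for a
given font.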

There are some caveats: in particular you'll notice if you try this that 
your red coloring of the example text fragment gets lost. This is 
because the \special{}s that \color inserts will be moved out of the run 
of text that is now being shaped as a unit. But depending on the needs 
of your documents, this may be an acceptable trade-off.

Oh, by the way: you can change the \XeTeXinterwordspaceshaping setting 
within the document if you like, but its effect does not respect the 
usual TeX scoping rules; if I remember correctly, 
\XeTeXinterwordspaceshaping=2 basically operates on a whole-page basis, 
so what matters is the value at the time the page is completed.
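
If that recollection is right, the practical consequence is to set the
value once, globally, rather than inside a group. A sketch of the two
patterns, assuming the whole-page behaviour described above:

```latex
% Sketch only; assumes the setting is read when the page is completed.
% Probably not dependable: the group may already have ended by then.
%   {\XeTeXinterwordspaceshaping=2 ... Khmer text ... }
% Simpler: set it once in the preamble so every page sees the same value.
\XeTeXinterwordspaceshaping = 2
```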

HTH,

JK

