[XeTeX] xetex and the unicode bidirectional algorithm.

C. Scott Ananian cscott at cscott.net
Thu Dec 5 13:48:25 CET 2013

Can anyone point me to docs on TeX--XeT?  A Google search the other day
failed to turn up anything useful.

Also: polyglossia appears to be doing some amount of LTR/RTL directionality
switching based on the character block.  Can anyone offer advice on how to
avoid fighting with that, if I'm implementing my own bidi algorithm?
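
For concreteness, here is a minimal sketch (in Python) of the sort of
preprocessing pass in question. In an RTL paragraph it wraps each embedded
LTR run in \beginL ... \endL, and lets neutral characters (such as spaces)
that sit between two strong-LTR characters follow the LTR run, which is what
keeps the words of an English title in their original order. It is not the
full Unicode bidi algorithm: digits, explicit embedding controls and
mirroring are ignored, and the input is assumed to be plain text with no TeX
markup. The names strong_direction and mark_ltr_runs are made up for
illustration, and the output assumes \TeXXeTstate=1 is in effect.

```python
import unicodedata

STRONG_LTR = {"L"}
STRONG_RTL = {"R", "AL"}


def strong_direction(ch):
    """Return 'L' or 'R' for strong characters, None for weak/neutral ones."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class in STRONG_LTR:
        return "L"
    if bidi_class in STRONG_RTL:
        return "R"
    return None


def mark_ltr_runs(text):
    """Wrap embedded LTR runs of an RTL paragraph in \\beginL ... \\endL.

    Simplification (an assumption of this sketch): a run starts at a
    strong-LTR character and extends to the last strong-LTR character
    seen before the next strong-RTL character, so neutrals between two
    LTR words stay inside the run while trailing neutrals stay outside.
    """
    out = []
    i = 0
    n = len(text)
    while i < n:
        if strong_direction(text[i]) != "L":
            out.append(text[i])
            i += 1
            continue
        last_l = i
        j = i + 1
        while j < n and strong_direction(text[j]) != "R":
            if strong_direction(text[j]) == "L":
                last_l = j
            j += 1
        # The empty groups stop TeX from swallowing a space after the
        # control word.
        out.append("\\beginL{}" + text[i:last_l + 1] + "\\endL{}")
        i = last_l + 1
    return "".join(out)


if __name__ == "__main__":
    print(mark_ltr_runs("\u05e9\u05dc\u05d5\u05dd Hello World \u05e2\u05d5\u05dc\u05dd"))
```

A converter could run something like this over the text of each RTL element
before TeX escaping, leaving purely RTL text untouched.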

Finally: any advice on using CJK languages with polyglossia?  Embedded CJK
is quite common.  Should I be writing gloss-ja etc. files to set the right
directionality and font and get the appropriate CJK support packages loaded?

On Dec 5, 2013 5:42 AM, "Jonathan Kew" <jfkthame at googlemail.com> wrote:

> On 4/12/13 13:24, C. Scott Ananian wrote:
>> The goal is to match the Unicode bidi algorithm, because that is how the
>> web page displays and thus how the original author saw the text as they
>> wrote.
> This would be a nice enhancement, but would require a significant amount
> of work (or in other words, it's not likely to get implemented quickly, if
> at all).
> Currently, typesetting bidi text with xetex requires correct use of the
> TeX--XeT bidi commands (\beginR, \endR, \beginL, \endL) to mark up the text
> direction. These could be used directly, or via higher-level markup that's
> tagging script and language, but you definitely need them to be present in
> some way.
> Sorry, that's not what you want to hear, but it's how things are. At this
> point, I think the most practical way forward in your situation is probably
> to implement this as part of whatever tool is taking the wikipedia content
> and converting it to (Xe)LaTeX markup - that tool could inspect the content
> of each element it's processing, and add any necessary direction controls
> for XeTeX.
> JK
>> Guessing the proper language tag to use is likely infeasible;
>> note that the example given contains titles in Turkish as well as
>> English.  The safest option is probably to treat embedded LTR text in an
>> RTL context as 'exotic' and not to attempt hyphenation.
>> I've heard it said that LuaTeX has "better bidi support".  What does
>> that mean, exactly? Should I be considering switching?
>>    --scott
>> On Dec 4, 2013 4:08 AM, "Keith J. Schultz" <schultzk at uni-trier.de
>> <mailto:schultzk at uni-trier.de>> wrote:
>>     Hi Scott,
>>     Am 03.12.2013 um 19:42 schrieb C. Scott Ananian <cscott at cscott.net
>>     <mailto:cscott at cscott.net>>:
>>      >
>>      > But in the XeLaTeX/polyglossia/bidi output, the "soft space" weak
>>      > directionality of the Unicode BiDi algorithm doesn't seem to be
>>      > honored (or implemented?) and so the English article titles appear
>>      > with the individual words in RTL order, which is a mess.  Manually
>>      > tagging the language of the article title is probably the Right
>>      > thing, but infeasible for the entire wikipedia.
>>              Well, without proper tagging you cannot expect any system to
>>              work properly or as expected!
>>              For most entries a simple script should do the trick to add
>>              the language tags to the article titles.
>>     Hope this helps
>>              regards
>>                      Keith.
>>     --------------------------------------------------
>>     Subscriptions, Archive, and List information, etc.:
>>     http://tug.org/mailman/listinfo/xetex