[luatex] Vital RTL issues in LuaTeX

Yannis Haralambous yharalambous at me.com
Fri Apr 10 13:41:08 CEST 2009

On 10 Apr 09, at 13:29, وفا خلیقی wrote:

> I, personally, don't think we need to code anything at the engine
> level; this just makes things more complex and less flexible. Let's take
> the example Taco pointed out: if we applied the bidi algorithm to 3.5
> (chapter 3, section 5) it'll stay 3.5 regardless of text direction;
> however, it should be 5.3 in RTL, and some people might disagree and say
> it should be 3.5 in all cases. If BiDi was hard-coded in the engine, it'll
> just make things more complex, and one will need to put a Unicode 'right
> to left mark' before the dot. The same can be said for numbered lists
> etc.
>
> You picked up the very wrong example. That is all I can say. No,
> absolutely the same cannot be said about numbers etc.

The bidi algorithm is certainly not perfect, but it handles most cases
correctly and it gives you the possibility to force a given
directional behaviour, if needed.
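
To make the "3.5" example concrete, here is a minimal sketch in Python (my choice of language, not something LuaTeX uses) that prints the Bidi_Class of each character with the standard unicodedata module. The helper name bidi_classes is made up for illustration. The digits are EN (European number) and the full stop is CS (common separator); rule W4 turns a single CS between two EN into EN, which is why "3.5" stays together as one number. Putting U+200F RIGHT-TO-LEFT MARK before the dot means the dot no longer sits directly between two EN, so W4 no longer applies to it.

```python
import unicodedata

def bidi_classes(text):
    """Return the Bidi_Class of every code point in text (hypothetical helper)."""
    return [unicodedata.bidirectional(ch) for ch in text]

# Plain "3.5": European number, common separator, European number.
print(bidi_classes("3.5"))         # ['EN', 'CS', 'EN']

# Same string with U+200F RIGHT-TO-LEFT MARK inserted before the dot.
print(bidi_classes("3\u200f.5"))   # ['EN', 'R', 'CS', 'EN']
```

This only inspects character classes; it does not run the full reordering algorithm.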

The BIG advantage of the bidi algorithm is that it is a standard.
Every conforming piece of software will display the same Unicode string
in the same way. For example, if you type your text in a bidi-compatible
text editor, then you will get the same output as what you see on your
screen. And once you get used to the behaviour of text with respect to
the five bidi special characters, then you can type RTL languages in
the same way everywhere. That's an improvement over previous input
methods.
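
For reference, a small Python sketch listing what I take the "five bidi special characters" to be, namely the explicit embedding and override codes U+202A..U+202E (that reading is an assumption; the marks U+200E and U+200F are separate). The names describe and embed_rtl are hypothetical helpers, not part of any library.

```python
import unicodedata

# Assumed to be the five characters meant above: LRE, RLE, PDF, LRO, RLO.
EXPLICIT_CODES = ["\u202a", "\u202b", "\u202c", "\u202d", "\u202e"]

def describe(ch):
    """Return (code point, Bidi_Class, Unicode name) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.bidirectional(ch), unicodedata.name(ch))

def embed_rtl(text):
    """Wrap text in RLE ... PDF, i.e. force a right-to-left embedding."""
    return "\u202b" + text + "\u202c"

for ch in EXPLICIT_CODES:
    print(*describe(ch))
```

A bidi-aware editor and a bidi-aware typesetting engine should both react to embed_rtl("3.5") in the same way, which is the point about the standard made above.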

I'm convinced that it is a good thing to implement bidi, *but* it may
not be trivial since bidi needs information coming from higher-level
protocols about the main direction in the context of the document. For
example, in the case of XHTML you have a dir attribute. Here is the
corresponding text from Unicode Standard Annex #9:

HL1.	Override P3, and set the paragraph embedding level explicitly.
A higher-level protocol may set the paragraph level explicitly and
ignore P3. This can be done on the basis of the context, such as on a
table cell, paragraph, document, or system level.
HL2.	Override W2, and set EN or AN explicitly.
A higher-level process may reset characters of type EN to AN, or vice  
versa, and ignore W2. For example, style sheet or markup information  
can be used within a span of text to override the setting of EN text  
to always be AN, or vice versa.
HL3.	Emulate directional overrides or embedding codes.
A higher-level protocol can impose a directional override or embedding  
on a segment of structured text. The behavior must always be defined  
by reference to what would happen if the equivalent explicit codes as  
defined in the algorithm were inserted into the text. For example, a  
style sheet or markup can set the embedding level on a span of text.
HL4.	Apply the Bidirectional Algorithm to segments.
The Bidirectional Algorithm can be applied independently to one or  
more segments of structured text. For example, when displaying a  
document consisting of textual data and visible markup in an editor, a  
higher-level process can handle syntactic elements in the markup  
separately from the textual data.
HL5.	Provide artificial context.
Text can be processed by the Bidirectional Algorithm as if it were  
preceded by a character of a given type and/or followed by a character  
of a given type. This allows a piece of text that is extracted from a  
longer sequence of text to behave as it did in the larger context.
HL6.	Additional mirroring.
Characters with a resolved directionality of R that do not have the  
Bidi_Mirrored property can also be depicted by a mirrored glyph in  
specialized contexts. Such contexts include, but are not limited to,  
historic scripts and associated punctuation, private-use characters,  
and characters in mathematical expressions. (See Section 6, Mirroring.)
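
To illustrate why the higher-level information matters, here is a rough Python sketch of how the paragraph embedding level could be chosen: by default from the first strong character (rules P2 and P3), or explicitly when a higher-level protocol supplies it (HL1). The function name paragraph_level and its override parameter are invented for this sketch, and it is simplified: it ignores the directional isolates that later versions of the algorithm skip over in P2.

```python
import unicodedata

def paragraph_level(text, override=None):
    """Return 0 (LTR) or 1 (RTL) as the paragraph embedding level.

    override stands for a level imposed by a higher-level protocol (HL1),
    for instance derived from an XHTML dir attribute. Hypothetical API.
    """
    if override is not None:
        if override not in (0, 1):
            raise ValueError("override must be 0 or 1")
        return override                 # HL1: ignore P3 entirely
    for ch in text:                     # P2: look for the first strong character
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            return 0
        if cls in ("R", "AL"):
            return 1                    # P3: R or AL gives level one
    return 0                            # P3: no strong character, level zero

print(paragraph_level("3.5"))               # 0
print(paragraph_level("3.5", override=1))   # 1
```

A bare "3.5" contains no strong character, so without the override it would be treated as left-to-right. This is the kind of case where the engine needs to be told the main direction from the document context.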