[XeTeX] xetex and the unicode bidirectional algorithm.
zdenek.wagner at gmail.com
Mon Dec 9 17:15:20 CET 2013
2013/12/9 <mskala at ansuz.sooke.bc.ca>:
> On Mon, 9 Dec 2013, Khaled Hosny wrote:
>> > U+E0001 U+E0065 U+E006E U+0073 U+0061 U+006E U+0067
>> And it is a kind of tagging, so beyond the scope of identifying the
>> language of *untagged* text (which is the claim that spurred all this
> The claim was "A properly encoded utf-8 string should contain everything
> you need!". If you forbid using Unicode tag characters, then you're
> saying "It is impossible to encode language in Unicode when you're not
> allowed to use the features designed for that purpose," which is not
> an interesting statement.
> Yes, of course some kind of tagging is needed. Keith seems to think that
> the tagging will magically come from "proper" UTF-8, and of course he's
> wrong. I think language tagging would be possible in pure Unicode, as the
> string above demonstrates, but that's not a good way to do it. The really
> original question had to do with RTL versus LTR detection, not language
> detection, and that's a different issue.
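For concreteness, the tagged string quoted above can be taken apart mechanically. What follows is only a minimal sketch in Python, assuming the tag starts with U+E0001 and that each following tag character maps to ASCII by subtracting 0xE0000; the function name split_language_tag is made up for illustration and is not part of any library.

```python
TAG_OFFSET = 0xE0000      # tag characters mirror ASCII at this offset
LANGUAGE_TAG = 0xE0001    # U+E0001 LANGUAGE TAG introduces the tag


def split_language_tag(text):
    """Return (language, rest); language is None if no leading tag is found.

    Hypothetical helper: it only handles a tag at the very start of the string.
    """
    if not text or ord(text[0]) != LANGUAGE_TAG:
        return None, text
    letters = []
    i = 1
    # Collect tag characters U+E0020..U+E007E and map them back to ASCII.
    while i < len(text) and 0xE0020 <= ord(text[i]) <= 0xE007E:
        letters.append(chr(ord(text[i]) - TAG_OFFSET))
        i += 1
    return "".join(letters), text[i:]


# The string from the quoted message: U+E0001 U+E0065 U+E006E, then "sang".
sample = "\U000E0001\U000E0065\U000E006E\u0073\u0061\u006E\u0067"
print(split_language_tag(sample))  # ('en', 'sang')
```

The tag carries the language ("en") but says nothing about direction, which is the separate issue raised in the quoted paragraph.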
> Unicode specifies a way to detect RTL versus LTR, such that in many cases
> it doesn't require tagging. Unicode's way of doing it may or may not be a
> good one, but we cannot reasonably pretend that it doesn't exist. The
> Unicode bidi algorithm does exist. XeTeX does not implement the Unicode
> bidi algorithm. The interesting remaining question is whether XeTeX
> should implement it. I tend to think not - because if we implement it,
> people will blame us for its failings. It'd also be a lot of work, break
> compatibility with the rest of the TeX world, STILL require tagging in
> many cases, and so on.
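The untagged detection mentioned above starts from a "first strong character" rule for the paragraph direction. Here is a rough sketch of just that one step in Python, using the standard unicodedata module; it ignores explicit embedding and isolate controls and everything else the full algorithm does, and the function name base_direction is only illustrative.

```python
import unicodedata


def base_direction(text, default="ltr"):
    """Guess paragraph direction from the first strongly directional character.

    Simplified sketch: explicit directional controls are not handled.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":             # strong left-to-right
            return "ltr"
        if bidi_class in ("R", "AL"):     # strong right-to-left (Hebrew, Arabic)
            return "rtl"
    return default                        # no strong character found


print(base_direction("XeTeX"))                   # ltr
print(base_direction("\u06A9\u0648 \u0633\u0628"))  # rtl (Urdu letters)
# A TeX macro in front of Urdu text starts with Latin letters, so the
# whole paragraph is guessed as LTR:
print(base_direction("\\textbf{\u06A9\u0648}"))  # ltr
```

The last call shows one of the cases where detection alone gives a result the author probably did not want, so some tagging is still needed.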
A bit off topic: do you know a good Linux text editor with a properly
implemented bidi algorithm so that I could type multilingual texts?
Even the combination of Urdu and TeX macros is a pain; it is not easy
کو سب کچھ کیا۔}
I am not able to type it on a single line; gedit, Kate and even Gmail
and Facebook get confused and create garbage if I mix LTR and RTL
scripts. I can only use a commercial XML editor that allows me to
combine text in a Latin script with texts in Hindi and Urdu.
> Matthew Skala
> mskala at ansuz.sooke.bc.ca People before principles.