[XeTeX] xetex and the unicode bidirectional algorithm.

Khaled Hosny khaledhosny at eglug.org
Mon Dec 9 15:29:17 CET 2013


On Mon, Dec 09, 2013 at 08:16:03AM -0600, mskala at ansuz.sooke.bc.ca wrote:
> On Mon, 9 Dec 2013, Philip Taylor wrote:
> > Keith -- could you possibly supply an example of
> > "a properly encoded utf-8 string" from which it
> > can be unambiguously determined whether the string
> > "sang" is an English word (the past tense of "sing")
> 
> I'll probably regret pointing this out, and the characters involved have
> been deprecated since Unicode 5, but:
> 
>    U+E0001 U+E0065 U+E006E U+0073 U+0061 U+006E U+0067

And that is itself a kind of tagging, so it is beyond the scope of
identifying the language of *untagged* text (which is the claim that
spurred all this discussion).
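
To make the quoted sequence concrete, here is a minimal Python sketch of
how it decodes. It assumes only that U+E0001 is LANGUAGE TAG and that the
tag characters U+E0020..U+E007E mirror ASCII 0x20..0x7E at an offset of
0xE0000; the function name split_language_tag is made up for this
illustration and is not part of XeTeX or any library.

```python
LANGUAGE_TAG = 0xE0001                   # U+E0001 LANGUAGE TAG
TAG_FIRST, TAG_LAST = 0xE0020, 0xE007E   # tag characters mirroring ASCII 0x20-0x7E
TAG_OFFSET = 0xE0000


def split_language_tag(text):
    """Hypothetical helper: return (language, rest) for a string.

    language is None when the string does not start with U+E0001,
    i.e. when the text carries no in-band language tag at all.
    """
    if not text or ord(text[0]) != LANGUAGE_TAG:
        return None, text
    i = 1
    lang = []
    # Collect the tag characters that follow and map them back to ASCII.
    while i < len(text) and TAG_FIRST <= ord(text[i]) <= TAG_LAST:
        lang.append(chr(ord(text[i]) - TAG_OFFSET))
        i += 1
    return "".join(lang), text[i:]


# The sequence quoted above.
codepoints = [0xE0001, 0xE0065, 0xE006E, 0x0073, 0x0061, 0x006E, 0x0067]
tagged = "".join(chr(cp) for cp in codepoints)

print(split_language_tag(tagged))   # ('en', 'sang')
print(split_language_tag("sang"))   # (None, 'sang')
```

The second call is the untagged case: the string alone yields no
language, which is the point at issue.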

Regards,
Khaled

