[XeTeX] xetex and the unicode bidirectional algorithm.

Philip Taylor P.Taylor at Rhul.Ac.Uk
Mon Dec 9 14:32:08 CET 2013

Keith -- could you possibly supply an example of
"a properly encoded utf-8 string" from which it
can be unambiguously determined whether the string
"sang" is an English word (the past tense of "sing")
or a Vietnamese word meaning "to", "posh" or "knowingly"
in English ?  Could you also paste that string into
Richard Ishida's Unicode String Analyser :


and let us know what information it returns ?
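
To make the question concrete, here is a minimal Python sketch, assuming
the Vietnamese word is written without diacritics exactly as quoted above.
The helper name describe() is hypothetical, and the sketch makes no claim
about what any particular analyser tool reports; it only lists what the
encoded string itself contains.

```python
import unicodedata


def describe(text: str) -> list[str]:
    """Return one 'U+XXXX NAME' entry per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# Two strings that a human reader might assign to different languages.
english = "sang"      # past tense of "sing"
vietnamese = "sang"   # assumed spelling, no diacritics

# The UTF-8 byte sequences are identical: nothing in them records a language.
english_bytes = english.encode("utf-8")
vietnamese_bytes = vietnamese.encode("utf-8")
same_bytes = english_bytes == vietnamese_bytes

if __name__ == "__main__":
    print(english_bytes.hex(" "))
    for entry in describe(english):
        print(entry)
    print("identical bytes:", same_bytes)
```

Each entry names a code point in the Latin script; language would have
to be supplied by something outside the string, such as markup.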

Philip Taylor

Keith J. Schultz wrote:

> Unfortunately, for efficiency reasons, utf-8 strings are not properly
> encoded and programs assume a particular language, to save space.
