[XeTeX] xetex and the unicode bidirectional algorithm.

Khaled Hosny khaledhosny at eglug.org
Mon Dec 9 10:38:16 CET 2013


On Mon, Dec 09, 2013 at 09:22:10AM +0100, Keith J. Schultz wrote:
> Hi Khaled,
> 
> your question cannot be serious!

No, it is.

> It is pretty much in the standard! 

No.

> True enough that for most Western languages (American, English, Spanish,
> German, Austrian, etc.) this is somewhat difficult. Yet these are not causing the problems.

You can’t identify the language of a Unicode string just by examining
the Unicode properties of the characters in that string, simply because
no such Unicode property exists. Language identification involves
a fair amount of statistical analysis[1]. You can identify scripts using
Unicode properties quite reliably, though.
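
To make the distinction concrete, here is a rough Python sketch (not
anything XeTeX itself does). Python's standard library does not expose
the Unicode Script property, so the hypothetical helper rough_script()
below approximates it from the first word of each character's Unicode
name; that shortcut is an assumption for illustration only. The bidi
class, on the other hand, is a real per-character property.

```python
import unicodedata
from collections import Counter

def rough_script(ch):
    """Approximate a character's script from the first word of its Unicode name.

    Assumption: this stands in for the real Script property, which the
    standard library does not provide.
    """
    if not ch.isalpha():
        return None  # skip spaces, digits and punctuation
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else None

def scripts_in(text):
    """Count letters per (approximate) script."""
    return Counter(s for s in map(rough_script, text) if s)

english = "the quick brown fox"
german = "der schnelle braune Fuchs"
arabic = "مرحبا بالعالم"
persian = "سلام دنیا"

for sample in (english, german, arabic, persian):
    # Report the scripts found and the bidi class of the first character.
    print(sorted(scripts_in(sample)), unicodedata.bidirectional(sample[0]))
```

The English and German samples both come out as Latin, and the Arabic
and Persian samples both come out as Arabic: the character properties
give you the script (and the bidi class), but nothing that separates
one language from another within the same script.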

1. https://en.wikipedia.org/wiki/Language_identification#Statistical_approaches

Regards,
Khaled

> regards
> 	Keith.
> 
> Am 05.12.2013 um 09:46 schrieb Khaled Hosny <khaledhosny at eglug.org>:
> 
> > On Thu, Dec 05, 2013 at 09:41:04AM +0100, Keith J. Schultz wrote:
> >> Hi Scott,
> >> 
> >> We are talking Unicode here, right? What is there to guess? 
> > 
> > And how do you, using Unicode, tell in what language this line is
> > written?
> > 