[XeTeX] xetex and the unicode bidirectional algorithm.

Wed Dec 11 03:37:39 CET 2013

On Wed, Dec 11, 2013 at 04:36:31AM +0200, Khaled Hosny wrote:
> On Tue, Dec 10, 2013 at 11:11:27AM -0500, C. Scott Ananian wrote:
> > On Tue, Dec 10, 2013 at 6:09 AM, Zdenek Wagner <zdenek.wagner at gmail.com> wrote:
> > > 2013/12/10 Keith J. Schultz <keithjschultz at web.de>:
> > >> I will repeat: I do not know Vietnamese, so I cannot give you
> > [...]
> > >> Now, if "sang" is true Vietnamese and not a latinized form, I stand corrected! Though I have
> > [...]
> > > Yes, it is a true Vietnamese word. I do not know Vietnamese, I could
> > 
> > https://www.google.com/search?q=sang+site%3Avi.wikipedia.org
> > 
> > ...which is indeed the issue I am attempting to deal with (trying to
> > put the discussion back on track) -- a bunch of user-authored content
> > which looks correct to a native speaker when using the Unicode bidi
> > algorithm (implemented in the browser).  Language tags are only
> > applied sporadically when needed to correct some obvious issue --
> > although the future Visual Editor project at Wikimedia hopes to make
> > language tagging a more integrated part of the editing process.
> > 
> > Language tagging uses the HTML <span lang="...." dir="...."> standard.
> >  Directionality tagging uses <bdo> and <bdi> where necessary.  But
> > again, the point of the bidi algorithm is to avoid the necessity of
> > manual tagging in many cases.
> > 
> > Ultimately, Wikipedia's goal is to give the largest number of
> > individual authors the ability to create encyclopedic content in their
> > language as easily as possible.  Our greatest challenge is the "as
> > easily as possible" part.  We can't impose language tagging as a
> > barrier to entry when it is not necessary for the author's text to be
> > readable and useful to the public.
> 
> There is a big difference between (barely) readable text and
> typographically correct text. If your goal is only the former,
> language tagging can be skipped (and you can forget about hyphenation,
> too, except for the main document language, which is, hopefully,
> already known).
> 
> This leaves you with the BiDi algorithm, for which there exist many
> implementations that you might be able to use while processing your text
> before generating TeX files. There is even a TeX pre-processor that
> can apply the BiDi algorithm to TeX documents, which you might be able
> to use or adapt (I have never used it myself, and it was written for
> e-TeX, but XeTeX's RTL model is essentially the same, so it should work
> in theory).

http://biditex.sourceforge.net/
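
To give a rough idea of what such a pre-processing step might look like,
here is a minimal Python sketch. It is not biditex's method and not the
full Unicode BiDi algorithm: it only groups strong right-to-left
characters (bidi classes R and AL), together with any neutral characters
between them, and wraps each group in the e-TeX \beginR ... \endR
primitives. The function name wrap_rtl_runs is made up for illustration,
the output assumes \TeXXeTstate=1 has been set in the document, and
numbers, nested embeddings and mirrored brackets are all ignored.

```python
import unicodedata

# Strong right-to-left bidi classes (Hebrew, Arabic, and so on).
RTL = {"R", "AL"}


def wrap_rtl_runs(text):
    """Wrap runs of strong RTL characters in {\\beginR ... \\endR}.

    Simplified illustration only, not the full Unicode BiDi algorithm.
    """
    out = []      # finished output pieces
    run = []      # characters of the RTL run being collected
    pending = []  # neutrals seen after an RTL character, not yet assigned

    def flush():
        # Emit the collected RTL run (if any); trailing neutrals stay outside.
        if run:
            out.append("{\\beginR " + "".join(run) + "\\endR}")
        out.extend(pending)
        run.clear()
        pending.clear()

    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls in RTL:
            # Neutrals sandwiched between RTL characters join the run.
            run.extend(pending)
            pending.clear()
            run.append(ch)
        elif run and cls != "L":
            # Neutral or weak character after RTL text: decide later.
            pending.append(ch)
        else:
            # Strong LTR character, or anything outside an RTL run.
            flush()
            out.append(ch)
    flush()
    return "".join(out)


if __name__ == "__main__":
    print(wrap_rtl_runs("abc \u05e9\u05dc\u05d5\u05dd \u05e2\u05d5\u05dc\u05dd def"))
```

Plain left-to-right text such as "sang" passes through unchanged, so a
filter like this could be run over untagged content without touching the
parts that need no help.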

Regards,
Khaled