[luatex] Indic scripts

maxwell maxwell at umiacs.umd.edu
Fri Mar 23 20:58:15 CET 2018


I am just finishing up a project in which we are typesetting texts in 
various languages, and outputting in a separate XML file the bounding 
boxes corresponding to each line of text in the PDF.  It may be possible 
to do this in other varieties of *TeX, but LuaTeX made it easy to use a 
few Lua functions to output this information to the XML.

Unfortunately, we discovered at the very end of this project that LuaTeX 
is mis-rendering Tamil text.  In particular, vowel signs often do not 
appear in the correct position: they are encoded in the underlying 
Unicode text to the right of a consonant, which corresponds to the 
pronunciation order, but should sometimes appear visually to the left of 
that consonant.  LuaTeX does not perform that visual re-ordering.  I am 
also told (by someone who knows Tamil) that LuaTeX is mis-rendering some 
vowel signs that ought to combine with the preceding consonant letters, 
but I'd be hard-pressed to come up with good examples.  (Fact is, I 
don't know much about these scripts, which is why I didn't discover the 
problem until the end of the project...)
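
As a rough illustration of the re-ordering described above, here is a 
minimal Python sketch.  It is not what any TeX engine or font shaper 
actually does; the function names are made up for this example, and it 
covers only the Tamil vowel signs E, EE and AI (U+0BC6 to U+0BC8), plus 
the two-part vowel signs that decompose into one of those followed by a 
right-hand piece.  A real shaper also has to consult the font's OpenType 
tables.

```python
import unicodedata

# Tamil vowel signs that are stored after the consonant but drawn to its left.
PREBASE_VOWEL_SIGNS = {"\u0BC6", "\u0BC7", "\u0BC8"}  # E, EE, AI


def is_tamil_consonant(ch):
    # Simplifying assumption: treat the block range KA..HA as consonants.
    return "\u0B95" <= ch <= "\u0BB9"


def visual_order(text):
    """Return text with pre-base Tamil vowel signs moved before their consonant."""
    # NFD splits two-part vowel signs (e.g. U+0BCA -> U+0BC6 U+0BBE), so the
    # left-hand piece can be moved while the right-hand piece stays in place.
    out = []
    for ch in unicodedata.normalize("NFD", text):
        if ch in PREBASE_VOWEL_SIGNS and out and is_tamil_consonant(out[-1]):
            out.insert(len(out) - 1, ch)  # place the sign before the consonant
        else:
            out.append(ch)
    return "".join(out)


def codepoints(text):
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    logical = "\u0B95\u0BC6"  # KA followed by VOWEL SIGN E, as stored
    print(codepoints(logical), "->", codepoints(visual_order(logical)))
```

The stored (logical) order is consonant then vowel sign; the output of 
visual_order() is the order in which the glyphs should appear on the 
page, which is the step that appears to be missing from my LuaTeX 
output.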

Tamil and Bengali are only two of the scripts for which this problem 
arises; most Indic scripts (and other complex scripts, like Burmese) 
would probably have the same problems.  There's a discussion of 
the proper rendering of Indic scripts in 
http://www.unicode.org/versions/latest/ch12.pdf (Tamil is in section 
12.6).

Rendering of Indic scripts in LuaTeX appears to be a known issue, e.g. 
the following thread in this mailing list from 2011: 
http://tug.org/pipermail/luatex/2011-February/002554.html.  Also this 
posting from 2016/2017: 
https://tex.stackexchange.com/questions/285527/how-can-bengali-v-2-bng2-be-used-to-typeset-bengali-with-xetex. 
(The topic was Bengali, but the points are relevant to other Indic 
scripts.  There's a comment down near the bottom from yours truly.)

Is this on someone's radar?  It effectively makes it impossible to 
render text in complex scripts using LuaTeX.  It works correctly in 
XeTeX, presumably because XeTeX uses a different rendering engine to 
create the PDF.

    Mike Maxwell
    University of Maryland

