[luatex] Indic scripts
maxwell
maxwell at umiacs.umd.edu
Fri Mar 23 20:58:15 CET 2018
I am just finishing up a project in which we are typesetting texts in
various languages, and outputting in a separate XML file the bounding
boxes corresponding to each line of text in the PDF. It may be possible
to do this in other varieties of *TeX, but LuaTeX made it easy to use a
few Lua functions to output this information to the XML.
Unfortunately, we discovered at the very end of this project that LuaTeX
is mis-rendering Tamil text. In particular, vowel signs often do not
appear in the correct position: they are encoded in the underlying
Unicode text to the right of a consonant, which corresponds to the
pronunciation order, but should sometimes appear visually to the left of
that consonant. LuaTeX does not perform that visual re-ordering. I am
also told (by someone who knows Tamil) that LuaTeX is mis-rendering some
vowel signs that ought to combine with the preceding consonant letters,
but I'd be hard-pressed to come up with good examples. (Fact is, I
don't know much about these scripts, which is why I didn't discover the
problem until the end of the project...)
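To make the re-ordering concrete, here is a minimal Python sketch of the
difference between the encoded (logical) order and the displayed (visual)
order. It is not how LuaTeX, XeTeX, or any real shaping engine works; the
names visual_order, describe, and PRE_BASE_SIGNS are hypothetical, and the
sketch assumes only simple consonant + vowel sign sequences (no conjuncts
or other contextual shaping).

```python
import unicodedata

# Tamil vowel signs that are drawn to the LEFT of the consonant they follow
# in the encoded text: VOWEL SIGN E, EE and AI.
PRE_BASE_SIGNS = {"\u0bc6", "\u0bc7", "\u0bc8"}


def is_tamil_consonant(ch):
    # Assumption for this sketch: anything from U+0B95 (KA) to U+0BB9 (HA)
    # counts as a consonant; unassigned gaps in that range are ignored.
    return "\u0b95" <= ch <= "\u0bb9"


def visual_order(text):
    """Return the code points in a naive left-to-right display order."""
    # NFD splits the two-part vowel signs O, OO and AU into a left part
    # (E or EE) and a right part (AA or AU LENGTH MARK).
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    for ch in decomposed:
        if ch in PRE_BASE_SIGNS and out and is_tamil_consonant(out[-1]):
            # Move the sign in front of the consonant it follows.
            out.insert(len(out) - 1, ch)
        else:
            out.append(ch)
    return "".join(out)


def describe(text):
    """List the code points of text with their Unicode names."""
    return [f"U+{ord(c):04X} {unicodedata.name(c, '?')}" for c in text]


if __name__ == "__main__":
    encoded = "\u0b95\u0bc6"  # KA followed by VOWEL SIGN E
    print("encoded:", describe(encoded))
    print("visual: ", describe(visual_order(encoded)))
```

A renderer that skips this step draws the vowel sign to the right of the
consonant, which is the symptom described above.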
Tamil and Bengali are only two of the scripts for which this problem
arises; most Indic scripts (and probably other complex scripts, such as
Burmese) are likely to have the same problems. There's a discussion of
the proper rendering of Indic scripts in
http://www.unicode.org/versions/latest/ch12.pdf (Tamil is in section
12.6).
Rendering of Indic scripts in LuaTeX appears to be a known issue; see,
for example, the following thread on this mailing list from 2011:
http://tug.org/pipermail/luatex/2011-February/002554.html. Also this
posting from 2016/2017:
https://tex.stackexchange.com/questions/285527/how-can-bengali-v-2-bng2-be-used-to-typeset-bengali-with-xetex.
(The topic was Bengali, but the points are relevant to other Indic
scripts. There's a comment down near the bottom from yours truly.)
Is this on someone's radar? As it stands, the problem effectively makes
it impossible to render text in complex scripts using LuaTeX. The same
text renders correctly in XeTeX, presumably because XeTeX uses a
different rendering engine to create the PDF.
Mike Maxwell
University of Maryland