[lltx] adapting unicode-letters to luatex

Khaled Hosny khaledhosny at eglug.org
Sat Jun 12 19:54:51 CEST 2010

On Sat, Jun 12, 2010 at 07:34:20PM +0200, Manuel Pégourié-Gonnard wrote:
> Hi,
> I'm following up on the recent discussion on the Luatex mailing list concerning
> hyphenation in Greek not working properly with Lualatex, due to the various
> \lccode's not being set properly.
> I can see that in Xetex-based Plain and Latex formats, the file
> unicode-letters.tex is \input-ed and takes care of such things. This file is
> generated by the Perl script unicode-char-prep.pl shipped with the Xetex sources
> and taking Unicode data files as its inputs (available from
> <http://unicode.org/Public/UNIDATA/>).
> As is, unicode-letters.tex cannot be usefully \input-ed by LuaTeX, since it
> makes use of primitives (\XeTeXmathcode) and even concepts (intercharclass and
> interchartoks) specific to Xetex. However, a big part of it (uccodes, lccodes,
> sfcodes, catcodes, maybe mathcodes too, by just adapting the primitive name) could
> be shared with Luatex.
> I see at least two possible ways to handle this:
> 1. Adapt unicode-char-prep.pl so that the generated file can be \input'ed by
> both Xetex and Luatex.
> 2. Make a Luatex-specific version.
> I've got no strong opinion on 1 vs 2, but it seems to me that it's probably
> easier to have a separate Luatex version, to avoid bothering you, Jonathan,
> every time something needs to be changed for LuaTeX.
> What do you guys think?
> Either way, I'm interested in working on it (in case the only problem is to find
> someone to do it).

Since it is a generated file, I see no benefit in sharing it; adapting
the Perl script for LuaTeX would cause fewer conflicts.

A loosely related issue that I was considering is having a Unicode
database in luatex-base, something like char-def.lua in ConTeXt, but
auto-generated for maintainability (Unicode is ever growing). It would be
very useful for people writing code that deals with character
properties (my Lua bidi code, for example). If we ever had such a
database, we could easily generate the required TeX file from it; I can
also imagine an option for doing it on the fly if one really wants.
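
To make the idea concrete, here is a minimal sketch (in Python, purely for
illustration; it is not the existing Perl script and not a proposed
luatex-base API). It builds a small property table from records in the
UnicodeData.txt format and then emits \lccode assignments from it. The
function names, the table layout, and the choice to map caseless letters to
themselves are all assumptions made for this example.

```python
# A few sample records in the semicolon-separated UnicodeData.txt format.
# Field 2 = general category, 4 = bidi class, 12 = simple uppercase mapping,
# 13 = simple lowercase mapping.
SAMPLE = """\
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041
0391;GREEK CAPITAL LETTER ALPHA;Lu;0;L;;;;;N;;;;03B1;
03B1;GREEK SMALL LETTER ALPHA;Ll;0;L;;;;;N;;;0391;;0391
05D0;HEBREW LETTER ALEF;Lo;0;R;;;;;N;;;;;
"""


def parse_unicode_data(text):
    """Return a dict mapping code point -> property record (hypothetical layout)."""
    db = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split(";")
        codepoint = int(fields[0], 16)
        db[codepoint] = {
            "name": fields[1],
            "category": fields[2],
            "bidi": fields[4],
            "upper": int(fields[12], 16) if fields[12] else None,
            "lower": int(fields[13], 16) if fields[13] else None,
        }
    return db


def tex_lccode_lines(db):
    """Emit one \\lccode assignment per letter, in code point order.

    Assumption for this sketch: a letter without a lowercase mapping gets
    itself as its \\lccode, so that it still counts as a letter when
    hyphenating.
    """
    lines = []
    for codepoint in sorted(db):
        entry = db[codepoint]
        if not entry["category"].startswith("L"):
            continue
        lower = entry["lower"] if entry["lower"] is not None else codepoint
        lines.append('\\lccode"%04X="%04X' % (codepoint, lower))
    return lines


db = parse_unicode_data(SAMPLE)
for line in tex_lccode_lines(db):
    print(line)
```

The same table could also answer property queries directly (for example
db[0x05D0]["bidi"] for bidi code), which is the reason for building the
database first and deriving the TeX file from it rather than the other way
around. The \uccode, \sfcode and \catcode assignments would follow the same
pattern.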

What do you think?


