[lltx] adapting unicode-letters to luatex

Reinhard Kotucha reinhard.kotucha at web.de
Sat Jun 12 23:59:32 CEST 2010

On 12 June 2010 Khaled Hosny wrote:

 > On Sat, Jun 12, 2010 at 07:34:20PM +0200, Manuel Pégourié-Gonnard wrote:
 > > Hi,
 > > 
 > > I'm following up on the recent discussion on the Luatex mailing
 > > list concerning hyphenation in Greek not working properly with
 > > Lualatex, due to the various \lccode's not being set properly.
 > > 
 > > I can see that in Xetex-based Plain and Latex formats, the file
 > > unicode-letters.tex is \input-ed and takes care of such
 > > things. This file is generated by the Perl script
 > > unicode-char-prep.pl shipped with the Xetex sources and taking
 > > Unicode data files as its inputs (available from
 > > <http://unicode.org/Public/UNIDATA/>).
 > > 
 > > As is, unicode-letters.tex cannot be usefully \input-ed by
 > > LuaTeX, since it makes use of primitives (\XeTeXmathcode) and
 > > even concepts (intercharclass and interchartoks) specific to
 > > Xetex. However, a big part of it (uccodes, lccodes, sfcodes,
 > > catcodes, maybe mathcodes too, by just adapting the primitive name)
 > > could be shared with Luatex.
 > > 
 > > I see at least two possible ways to handle this:
 > > 1. Adapt unicode-char-prep.pl so that the generated file can be
 > > \input'ed by both Xetex and Luatex.
 > > 2. Make a Luatex-specific version.
 > > 
 > > I've got no strong opinion on 1 vs 2, but it seems to me that
 > > it's probably easier to have a separate Luatex version, to avoid
 > > bothering you, Jonathan, every time something needs to be changed
 > > for LuaTeX.
 > > 
 > > What do you guys think?
 > > 
 > > Either way, I'm interested in working on it (in case the only
 > > problem is to find someone to do it).
 > Since it is a generated file, I see no benefit in sharing it;
 > adapting the Perl script for LuaTeX would cause fewer conflicts.

I think that this is exactly what Manuel suggested.  Sure, it doesn't
make sense to change the generated file.  But the question is whether
it's easy enough to add support for LuaTeX.  Since Jonathan wrote the
script with XeTeX in mind, I don't know whether it's easy to provide
another back-end.  If someone is going to re-write the script, it
makes sense to have a generic internal representation (a hash in Perl
or a table in Lua) from which one can create different output formats.
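
To illustrate what I mean by a generic internal representation, here is
a minimal sketch.  It is written in Python purely for illustration (the
real script is Perl, and a rewrite might just as well be Lua).  The
function names and the three sample records are my own assumptions and
are not taken from unicode-char-prep.pl; the only thing borrowed from
Unicode is the UnicodeData.txt field layout.  The rule "a letter without
a case mapping maps to itself" is likewise an assumption about what the
generated file should contain.

```python
# Three sample records in the UnicodeData.txt layout
# (15 semicolon-separated fields; 12 = uppercase, 13 = lowercase mapping).
SAMPLE = [
    "0391;GREEK CAPITAL LETTER ALPHA;Lu;0;L;;;;;N;;;;03B1;",
    "03B1;GREEK SMALL LETTER ALPHA;Ll;0;L;;;;;N;;;0391;;0391",
    "0030;DIGIT ZERO;Nd;0;EN;;0;0;0;N;;;;;",
]


def hex_or_none(field):
    """Convert a hexadecimal field to int; empty fields become None."""
    return int(field, 16) if field else None


def parse_unicode_data(lines):
    """Build the engine-neutral table: code point -> properties."""
    table = {}
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) != 15:
            continue  # skip anything that is not a complete record
        table[int(fields[0], 16)] = {
            "category": fields[2],
            "upper": hex_or_none(fields[12]),
            "lower": hex_or_none(fields[13]),
        }
    return table


def emit_tex(table):
    """Back-end 1: \\lccode/\\uccode assignments for letters only."""
    out = []
    for code in sorted(table):
        entry = table[code]
        if not entry["category"].startswith("L"):
            continue
        # Assumption: a letter without a mapping maps to itself.
        lower = code if entry["lower"] is None else entry["lower"]
        upper = code if entry["upper"] is None else entry["upper"]
        out.append('\\lccode"%04X="%04X' % (code, lower))
        out.append('\\uccode"%04X="%04X' % (code, upper))
    return "\n".join(out)


def emit_lua(table):
    """Back-end 2: the same data written out as a Lua table."""
    rows = []
    for code in sorted(table):
        entry = table[code]
        parts = ['category = "%s"' % entry["category"]]
        for key in ("upper", "lower"):
            if entry[key] is not None:
                parts.append("%s = 0x%04X" % (key, entry[key]))
        rows.append("  [0x%04X] = { %s }," % (code, ", ".join(parts)))
    return "return {\n" + "\n".join(rows) + "\n}"


table = parse_unicode_data(SAMPLE)
print(emit_tex(table))
print(emit_lua(table))
```

Both back-ends read the same table, so engine-specific output (for
instance anything involving \XeTeXmathcode) would live in its own
emitter and would not affect the other one.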

 > A loosely related issue that I was considering, is to have a
 > Unicode database in luatex-base, something like char-def.lua in
 > ConTeXt, but auto-generated for maintainability (Unicode is ever
 > growing). It would be very useful for people writing code that
 > deals with character properties (my lua bidi for example). If we
 > ever had such a database, we could generate the required TeX file
 > from it easily; I can also imagine an option for doing it on the
 > fly if one really wants.

I agree with you that it makes sense to use the Unicode databases.
But I'm not convinced that it should be done on-the-fly.


Reinhard Kotucha			              Phone: +49-511-3373112
Marschnerstr. 25
D-30167 Hannover	                      mailto:reinhard.kotucha at web.de
Microsoft isn't the answer. Microsoft is the question, and the answer is NO.
