[luatex] documentation on builtin utf library
taco at elvenkind.com
Tue Feb 16 08:07:31 CET 2010
Khaled Hosny wrote:
> On Fri, Feb 05, 2010 at 08:11:40AM +0100, Arthur Reutenauer wrote:
>>> could you please point me to some documentation on the selene utf
>>> library that is mentioned in the luatex reference?
>> I don't think there is any, but the test file that comes with the library
>> sources should provide some indications, as well as the general idea that the
>> unicode.ascii, unicode.utf8 and unicode.grapheme each provide the same
>> functions as the standard string library (i.e., unicode.utf8.gmatch is a global
>> match function for UTF-8 strings just like string.gmatch is for ASCII strings).
> I was wondering, since luatex defaults to utf-8 everywhere, why the
> built-in non-unicode compliant string library isn't overridden by the
> unicode library? So instead of having two libraries, make the standard
> one unicode compliant and get rid of the separate unicode library. This
> would decrease the confusion that exists right now and avoid bugs
> caused by code not aware of this important fact. Has this ever been
> considered? Maybe there are technical difficulties?
Not technical difficulties, but practical ones. The normal string
library allows arbitrary bytes to be handled, which is very useful
functionality. And as the string library has to stay, it may as well
stay in its normal form, as that avoids confusion.
I have thought about replacing the selene unicode library with
something else, but that is a low-priority task.
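
To illustrate the byte-versus-character distinction discussed above, here is a
minimal sketch. The names sample, blob and lengths are made up for this example.
The call to unicode.utf8.len assumes the selene unicode library is loaded, as it
is in luatex; under a plain Lua interpreter that branch is skipped and only the
byte count is reported.

```lua
-- "naïve" in UTF-8: the ï is the two bytes 0xC3 0xAF (decimal 195 175).
sample = "na\195\175ve"

-- Hypothetical helper: returns the byte length of s and, if the unicode
-- library is available (assumed to be the case inside luatex), the number
-- of UTF-8 characters. The second value is nil when the library is absent.
function lengths(s)
  local bytes = string.len(s)          -- standard library: counts bytes
  local chars = nil
  if type(unicode) == "table" and unicode.utf8 then
    chars = unicode.utf8.len(s)        -- unicode library: counts characters
  end
  return bytes, chars
end

print(lengths(sample))                 -- the byte count is 6

-- The standard library also handles data that is not valid UTF-8 at all,
-- which is why it cannot simply be replaced by a UTF-8 aware version.
blob = string.char(0, 255, 128)
print(string.len(blob), string.byte(blob, 1, -1))
```

Code that reads or writes binary data (font files, images, and so on) depends
on the byte-oriented behaviour shown in the last two lines.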