[luatex] Behavior of slnunicode.utf8.match().

Wed Aug 10 08:28:07 CEST 2011

Quoting Stephan Hennig <mailing_list at arcor.de>:

> Manuel Pégourié-Gonnard wrote:
> > My personal, certainly flawed, recollection of the design principle
> > is that for length and counting, the unit is always the byte, whereas
> > for the rest the working unit is the (possibly multibyte) character.
>
> Sounds good (and more general), except that I'd replace 'length' by
> 'position', because I think unicode.utf8.len is indeed UTF-8 aware (but
> I didn't check now).

I see. That doesn't seem consistent to me, but at least it explains some
things ... except that "slnunicode.sub("éî", 2, 2)" returns "î" and not the
second byte of "é", so obviously there are exceptions. (Or did I get something
wrong again?)
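
A minimal sketch of the comparison, meant to be run with texlua. It assumes
the slnunicode functions are reachable as unicode.utf8.* (so "slnunicode.sub"
above is read as shorthand for unicode.utf8.sub); the variable name s is only
for illustration. The string is spelled with decimal byte escapes so the
result does not depend on the encoding of the file.

```lua
-- "éî" encoded as UTF-8: é = C3 A9 (195 169), î = C3 AE (195 174)
local s = "\195\169\195\174"

-- Plain Lua string functions always work on bytes.
print(#s)                                   -- 4 (bytes)
print(string.byte(string.sub(s, 2, 2)))     -- 169, the second byte of "é"

-- Assumption: slnunicode is loaded as the global table `unicode`,
-- as it is in LuaTeX. Outside LuaTeX this part is simply skipped.
if unicode and unicode.utf8 then
  -- Character-based counterparts; indices 2, 2 select the second
  -- character (reported above as "î"), not the second byte.
  print(unicode.utf8.len(s))
  print(unicode.utf8.sub(s, 2, 2))
end
```

If the last line prints "î", then the positions passed to sub are counted in
characters, which is the exception to the "bytes for positions" rule
mentioned above.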

> To cite the last sentence from the current paragraph:
>
> > The slnunicode library will be replaced by an internal Unicode
> > library in a future LuaTeX version.
>
> Taco, can you give a guess about the number of that "future LuaTeX version"?

I'm waiting for that too!

Best,
Paul