[luatex] problem with slnunicode's find

Manuel Pégourié-Gonnard mpg at elzevir.fr
Tue Mar 2 22:32:54 CET 2010


By the way, I re-read the "documentation" of slnunicode and found interesting
things.

Patrick Gundlach wrote:
> That would break every program that uses unicode.utf8 as a replacement for
> string, which is what it is meant for.

The documentation says: "ascii or latin1 can be used as locale-independent
string replacement." While it doesn't formally say that unicode.utf8 can *not*
be used as a string replacement, that is strongly implied by its not being
mentioned.

> And what would you expect in this case:
> 
> my_three_numbers = "\97\195\182"
> unicode.utf8.find(my_three_numbers, "\182")
> 
Here the fun begins :-) According to the doc:

-- Any byte not part of such a sequence is treated as it's (Latin-1) value.

Doesn't this mean that "\182" should be treated as "\194\182" (the proper UTF-8
encoding of U+00B6, aka ¶)? If so, then indeed the return value should be nil,
since the subject "\97\195\182" decodes to "a" followed by U+00F6 and contains
no U+00B6.
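To make that reading concrete, here is a rough sketch in plain Lua that does not use slnunicode at all. The helpers decode and find_cps are hypothetical names of mine, not library functions, and the decoder is simplified (no overlong or surrogate checks for longer sequences). It only illustrates how I read the sentence in the doc, assuming a stray byte stands for the code point with the same Latin-1 value:

```lua
-- Hypothetical helper (not part of slnunicode): decode a byte string into
-- code points. Any byte that is not part of a well-formed UTF-8 sequence
-- is taken as its Latin-1 value, following the quoted doc sentence.
local function decode(s)
  local cps, i, n = {}, 1, #s
  while i <= n do
    local b = s:byte(i)
    local len, cp
    if b < 0x80 then len, cp = 1, b
    elseif b >= 0xC2 and b <= 0xDF then len, cp = 2, b - 0xC0
    elseif b >= 0xE0 and b <= 0xEF then len, cp = 3, b - 0xE0
    elseif b >= 0xF0 and b <= 0xF4 then len, cp = 4, b - 0xF0
    else len, cp = 1, b            -- stray continuation or invalid lead byte
    end
    if len > 1 then
      local ok = i + len - 1 <= n  -- enough bytes left for the sequence?
      for k = 1, len - 1 do
        local c = ok and s:byte(i + k)
        if not c or c < 0x80 or c > 0xBF then
          ok = false
          break
        end
        cp = cp * 64 + (c - 0x80)
      end
      if not ok then len, cp = 1, b end  -- fall back to the Latin-1 value
    end
    cps[#cps + 1] = cp
    i = i + len
  end
  return cps
end

-- Hypothetical helper: index of the first occurrence of the code point
-- list `needle` in `hay`, or nil if there is none.
local function find_cps(hay, needle)
  for i = 1, #hay - #needle + 1 do
    local match = true
    for j = 1, #needle do
      if hay[i + j - 1] ~= needle[j] then
        match = false
        break
      end
    end
    if match then return i end
  end
  return nil
end

local subject = "\97\195\182"      -- "a" followed by U+00F6
local stray   = "\182"             -- lone byte, read as Latin-1 U+00B6

print(string.find(subject, stray, 1, true))              -- byte level: 3 3
print(find_cps(decode(subject), decode(stray)))          -- nil
print(find_cps(decode("\97\194\182"), decode(stray)))    -- 2
```

So a byte-level search finds "\182" at position 3, while a search on code points under the Latin-1 reading finds nothing in "\97\195\182" and only matches when the subject really contains "\194\182".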

(Note that, as a matter of personal taste, I tend to disagree with interpreting
an out-of-sequence byte as a latin1 character, but the point here is that the
function just doesn't do what the doc says: precisely what I usually call a bug.)

Manuel.


