[luatex] problem with slnunicode's find

Stephan Hennig mailing_list at arcor.de
Tue Mar 2 18:58:32 CET 2010


On 02.03.2010 16:11, Patrick Gundlach wrote:
>> is a bit misleading, since just unicode.utf8.find is again not
>> Unicode-aware.
>
>
> This is not true. It just returns the position in bytes. What would
> you suggest the following statement returns?
>
> str="aö" unicode.utf8.find(str,"\182")  -- (ö's utf8 values are 195
> and 182)

nil, or better still, raise an error, since the second argument is not
valid UTF-8.  Do you think 3 is a sensible result?

This can lead to subtle errors, e.g., when a programme represents its
strings in UTF-8, but a user accidentally enters a pattern in, say,
Latin-1.  A wrong match might go undiscovered for a long time, whereas a
non-match would catch the user's attention at once.
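
To make the failure mode concrete, here is a minimal sketch.  It uses
stock Lua's byte-oriented string.find as a stand-in for the byte-wise
behaviour described above (an assumption made so that the snippet runs
without slnunicode); the variable names are made up for illustration.

```lua
-- UTF-8 subject "aö": bytes 97, 195, 182.
local subject = "a\195\182"

-- What a Latin-1 user sends when typing the pilcrow sign: the single byte 182.
local latin1_pilcrow = "\182"
-- What a Latin-1 user sends when typing "ö": the single byte 246.
local latin1_o_umlaut = "\246"

-- Plain byte search; the fourth argument switches off pattern magic.
local false_hit = string.find(subject, latin1_pilcrow, 1, true)
local miss = string.find(subject, latin1_o_umlaut, 1, true)

print(false_hit, miss)
```

The first search reports position 3, i.e. a pilcrow "found" in the
middle of the ö, while the second returns nil.  Only the second outcome
would make the user notice that something is wrong with the encoding.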

Is there a function in slnunicode that checks a string for UTF-8 compliance?
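
In case there is none, a check along the following lines could be done
in plain Lua.  This is only a sketch: is_valid_utf8 is a hypothetical
helper, not an slnunicode function, and it assumes the well-formedness
rules of the Unicode standard (no overlong forms, no surrogates, nothing
above U+10FFFF).

```lua
-- Hypothetical helper, not part of slnunicode.
-- Returns true if s is well-formed UTF-8, otherwise false plus the
-- byte offset of the first offending sequence.
local function is_valid_utf8(s)
  local i, n = 1, #s
  while i <= n do
    local b = s:byte(i)
    -- len: number of continuation bytes expected after the lead byte
    -- lo, hi: allowed range of the first continuation byte
    local len, lo, hi = 0, 0x80, 0xBF
    if b < 0x80 then len = 0                       -- ASCII
    elseif b >= 0xC2 and b <= 0xDF then len = 1    -- two-byte sequence
    elseif b == 0xE0 then len, lo = 2, 0xA0        -- excludes overlong forms
    elseif b == 0xED then len, hi = 2, 0x9F        -- excludes surrogates
    elseif b >= 0xE1 and b <= 0xEF then len = 2    -- three-byte sequence
    elseif b == 0xF0 then len, lo = 3, 0x90        -- excludes overlong forms
    elseif b == 0xF4 then len, hi = 3, 0x8F        -- nothing above U+10FFFF
    elseif b >= 0xF1 and b <= 0xF3 then len = 3    -- four-byte sequence
    else return false, i                           -- stray continuation or invalid lead byte
    end
    for k = 1, len do
      local c = s:byte(i + k)                      -- nil if the string ends early
      if not c or c < lo or c > hi then return false, i end
      lo, hi = 0x80, 0xBF                          -- later bytes use the full range
    end
    i = i + len + 1
  end
  return true
end

print(is_valid_utf8("a\195\182"))  -- UTF-8 "aö"
print(is_valid_utf8("\182"))       -- lone continuation byte
```

A find wrapper could run such a check on the pattern first and raise an
error if it fails, instead of returning a byte position from the middle
of a character.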

Best regards,
Stephan Hennig

