[luatex] problem with slnunicode's find

Stephan Hennig mailing_list at arcor.de
Thu Mar 4 22:07:38 CET 2010


On 04.03.2010 21:24, Khaled Hosny wrote:
> On Thu, Mar 04, 2010 at 08:08:56PM +0000, Jonathan Fine wrote:
>> Yes, Stephan, they do start at zero.  So for Unicode in Lua, Python
>> would be a good example to study and perhaps follow.
>
> Actually, python < 3.0 is a horrible mess Unicode-wise and I'd never
> ever try to follow it. Python is my favorite programming language, but
> I'd never call its Unicode support a "good example".

But I think Jonathan's Python example indicates that slnunicode does
things wrong[1], or at least that it doesn't comply with conventions,
even if its behaviour is documented.

In Python, len and index always stay consistent with each other about
what the length of the string is, whether the string is treated as a
Unicode string or as a byte sequence.

>> '\xc3\xa4b' äb 3 2
>> '\xc3\xb6\xc3\xa4b' öäb 5 4
>> u'\xe4b' äb 2 1
>> u'\xf6\xe4b' öäb 3 2
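
For reference, a minimal Python 3 sketch of the same check. Jonathan's
output above comes from Python 2; I read its last two columns as len(s)
and the index of 'b', which is an assumption about how it was produced.
The helper name length_and_index is made up for illustration.

```python
def length_and_index(s, needle):
    """Return (len(s), s.index(needle)); both count in the same unit."""
    return len(s), s.index(needle)


for text in ("\u00e4b", "\u00f6\u00e4b"):  # "äb" and "öäb"
    raw = text.encode("utf-8")  # the same string as a UTF-8 byte sequence

    # Byte sequence: length and index are both counted in bytes.
    print(repr(raw), length_and_index(raw, b"b"))

    # Unicode string: length and index are both counted in code points.
    print(repr(text), length_and_index(text, "b"))
```

In both views the index of the last character is len - 1, so the two
functions never disagree about the unit they count in.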

Best regards,
Stephan Hennig

[1] I know that a sample of one language makes for a rather weak
argument.  But I have yet to see a language that supports slnunicode's
behaviour, or a use case for it that doesn't qualify as a programming
error.

