[luatex] problem with slnunicode's find
mailing_list at arcor.de
Thu Mar 4 22:07:38 CET 2010
On 04.03.2010 21:24, Khaled Hosny wrote:
> On Thu, Mar 04, 2010 at 08:08:56PM +0000, Jonathan Fine wrote:
>> Yes, Stephan, they do start at zero. So for Unicode in Lua, Python
>> would be a good example to study and perhaps follow.
> Actually, Python < 3.0 is a horrible mess Unicode-wise and I'd never
> ever try to follow it. Python is my favorite programming language, but
> I'd never call its Unicode support a "good example".
But I think Jonathan's Python example indicates that slnunicode does
things wrong, or at least that it doesn't comply with conventions, even
if its behaviour is documented.
In Python, len and index always stay consistent about what the length of
the string is, whether the string is treated as a Unicode string or as a
byte sequence.
>> string               printed  len  index of 'b'
>> '\xc3\xa4b'          äb       3    2
>> '\xc3\xb6\xc3\xa4b'  öäb      5    4
>> u'\xe4b'             äb       2    1
>> u'\xf6\xe4b'         öäb      3    2
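For anyone who wants to reproduce the numbers quoted above, here is a minimal sketch. It assumes Python 3, where bytes plays the role of the Python 2 str and str plays the role of unicode; the helper name length_and_index is made up for this illustration and is not part of any library.

```python
# Assumption: Python 3, where bytes corresponds to the Python 2 str
# and str corresponds to the Python 2 unicode type.

def length_and_index(s, needle):
    """Hypothetical helper: return (len(s), s.index(needle)).

    Works for both str and bytes, as long as needle has the same type as s.
    """
    return len(s), s.index(needle)


if __name__ == "__main__":
    for text in ("\u00e4b", "\u00f6\u00e4b"):  # "äb" and "öäb"
        raw = text.encode("utf-8")
        # Byte view: length and index are both counted in bytes.
        print(repr(raw), length_and_index(raw, b"b"))
        # Text view: length and index are both counted in code points.
        print(repr(text), length_and_index(text, "b"))
```

In each view the index of the final 'b' is exactly one less than the length, so the two functions never disagree about the unit they count in.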
I know that a sample of one language is a rather weak argument. But I
have yet to see a language that supports slnunicode's behaviour, or a
use case for it that doesn't qualify as a programming error.