[luatex] problem with slnunicode's find
Jonathan Fine
jfine at pytex.org
Thu Mar 4 21:08:56 CET 2010
Stephan Hennig wrote:
> On 04.03.2010 09:15, Jonathan Fine wrote:
>> Stephan Hennig wrote:
>>>
>>> > >texlua slnunicode-find.lua
>>> > line = äb
>>> > len(line) = 2
>>> > character 'b' at position 3
>>> >
>>> > line = öäb
>>> > len(line) = 3
>>> > character 'b' at position 5
>>>
>>> I would expect the positions of 'b' to be 2 and 3, respectively, since
>>> those are the lengths of the strings as returned by unicode.utf8.len.
>>
>>
>> Stephan: Is this what you want (except of course in Lua)?
>>
>> $ python
>> Python 2.6.2 (release26-maint, Apr 19 2009, 01:58:18)
>> [GCC 4.3.3] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> data = 'äb', 'öäb', u'äb', u'öäb'
>> >>> data
>> ('\xc3\xa4b', '\xc3\xb6\xc3\xa4b', u'\xe4b', u'\xf6\xe4b')
>> >>> for s in data: print repr(s), s, len(s), s.index('b')
>> ...
>> '\xc3\xa4b' äb 3 2
>> '\xc3\xb6\xc3\xa4b' öäb 5 4
>> u'\xe4b' äb 2 1
>> u'\xf6\xe4b' öäb 3 2
>>
>> In the above we have the two strings, first as 8-bit (UTF-8 encoded)
>> byte strings and then as Unicode strings.
>
> If strings start with index zero in Python, then yes, the second variant
> is what I'm after.
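Yes, Stephan, Python strings are indexed from zero, so the Unicode
results above (1 and 2) correspond to the 1-based positions 2 and 3
that you expect. So for Unicode in Lua, Python would be a good example
to study and perhaps follow.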
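
To make the byte-versus-character distinction concrete, here is a rough
sketch in Python 3 (not the 2.6 session quoted above). The helper name
`positions` is made up for illustration, and I am assuming the positions
texlua reported are 1-based byte offsets, which is what the numbers
suggest.

```python
# Sketch: compare character offsets with UTF-8 byte offsets of a substring.
# `positions` is a hypothetical helper, not part of any library.

def positions(text, needle="b"):
    """Return (char_len, char_index, byte_len, byte_index), indices 0-based."""
    raw = text.encode("utf-8")
    return (
        len(text),                          # length in characters
        text.index(needle),                 # 0-based character offset
        len(raw),                           # length in bytes
        raw.index(needle.encode("utf-8")),  # 0-based byte offset
    )

# Escapes are used so the example does not depend on the file's encoding:
# "\u00e4" is a-umlaut, "\u00f6" is o-umlaut.
for s in ("\u00e4b", "\u00f6\u00e4b"):
    char_len, char_idx, byte_len, byte_idx = positions(s)
    # Adding 1 converts Python's 0-based offsets to Lua-style 1-based ones.
    print(ascii(s), "chars:", char_len, "char pos:", char_idx + 1,
          "bytes:", byte_len, "byte pos:", byte_idx + 1)
```

With 1-based counting, the character positions come out as 2 and 3 (what
Stephan expects), while the byte positions come out as 3 and 5 (what the
texlua run printed).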
Jonathan