[luatex] problem with slnunicode's find

Jonathan Fine jfine at pytex.org
Thu Mar 4 21:08:56 CET 2010


Stephan Hennig wrote:
> On 04.03.2010 09:15, Jonathan Fine wrote:
>> Stephan Hennig wrote:
>>>
>>>   >  >texlua slnunicode-find.lua
>>>   >  line = äb
>>>   >  len(line) = 2
>>>   >  character 'b' at position 3
>>>   >
>>>   >  line = öäb
>>>   >  len(line) = 3
>>>   >  character 'b' at position 5
>>>
>>> I would expect the positions of 'b' to be 2 and 3, respectively, as
>>> those are the lengths of the strings as returned by unicode.utf8.len.
>>
>>
>> Stephan: Is this what you want (except of course in Lua)?
>>
>> $ python
>> Python 2.6.2 (release26-maint, Apr 19 2009, 01:58:18)
>> [GCC 4.3.3] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>>   >>>  data = 'äb', 'öäb', u'äb', u'öäb'
>>   >>>  data
>> ('\xc3\xa4b', '\xc3\xb6\xc3\xa4b', u'\xe4b', u'\xf6\xe4b')
>>   >>>  for s in data: print repr(s), s, len(s), s.index('b')
>> ...
>> '\xc3\xa4b' äb 3 2
>> '\xc3\xb6\xc3\xa4b' öäb 5 4
>> u'\xe4b' äb 2 1
>> u'\xf6\xe4b' öäb 3 2
>>
>> In the above we have the two strings, first as 8-bit (UTF-8 encoded)
>> byte strings and then as Unicode strings.
> 
> If strings start with index zero in Python, then yes, the second variant 
> is what I'm after.

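Yes, Stephan, Python indices do start at zero, so the Unicode results 1
and 2 above correspond to positions 2 and 3 in Lua's 1-based counting.
For Unicode in Lua, Python would be a good example to study and perhaps
follow.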
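
To make the byte versus character distinction concrete, here is a small
sketch in Python 3 (not the Python 2.6 used in the session quoted above).
The helper name positions() is made up for this illustration, and the
"+ 1" is only my assumption about how one would map Python's 0-based
indices onto Lua's 1-based ones. It says nothing about how slnunicode is
implemented.

```python
# Python 3 sketch: byte offsets versus code-point offsets for the same text.
# positions() is a hypothetical helper, not part of any library.

def positions(text, needle="b"):
    """Return (byte_len, byte_pos, char_len, char_pos).

    Positions are converted to 1-based values, as Lua would count them.
    """
    raw = text.encode("utf-8")                        # the 8-bit form
    byte_pos = raw.index(needle.encode("utf-8")) + 1  # offset counted in bytes
    char_pos = text.index(needle) + 1                 # offset counted in code points
    return len(raw), byte_pos, len(text), char_pos

# Escapes are used so the result does not depend on the source file encoding.
samples = ["\u00e4b", "\u00f6\u00e4b"]  # "äb" and "öäb"

for s in samples:
    print(s, positions(s))
```

For the two sample strings the byte positions come out as 3 and 5, which
is what was reported for find, and the character positions come out as 2
and 3, which is what Stephan expected.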


Jonathan


