[luatex] problem with slnunicode's find

Stephan Hennig mailing_list at arcor.de
Mon Mar 1 23:17:48 CET 2010


On 01.03.2010 19:42, Patrick Gundlach wrote:

>> I would expect the positions of 'b' to be 2 and 3, respectively,
>> since those are the lengths of the strings as returned by
>> unicode.utf8.len.  However, unicode.utf8.find seems to have a
>> different notion of the length of a string.
>
> It is documented. (Well, sort of: you need to download the slnunicode
> library and look into 'unittest'.)

Thanks for the pointer!


> --	NOTE: find positions are in bytes for all ctypes!
> --	use ascii.sub to cut found ranges!

Hmm, I neither want to cut anything nor do I have a range available.
I just want to count characters.  Attached is my attempt at a UTF-8
aware find function based on the UTF-8 aware parts of slnunicode.
Comments and improvements are welcome!
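
To illustrate the byte-versus-character issue, here is a minimal sketch
in plain Lua.  It is not the attached utf8_find.lua; the helper names
byte_to_char_pos and utf8_find_plain are hypothetical, it assumes valid
UTF-8 input, and it only handles plain (non-pattern) searches.  The idea
is to let string.find report byte positions and then count the UTF-8
lead bytes up to each position.

```lua
-- Hypothetical helper: convert a 1-based byte position into a
-- 1-based character position, assuming s is valid UTF-8.
local function byte_to_char_pos(s, byte_pos)
  local count = 0
  for i = 1, byte_pos do
    local b = s:byte(i)
    -- Bytes 0x80..0xBF are continuation bytes; every other byte
    -- starts a new character.
    if b < 0x80 or b >= 0xC0 then
      count = count + 1
    end
  end
  return count
end

-- Hypothetical wrapper: plain-text find that returns character
-- positions instead of byte positions.
local function utf8_find_plain(s, needle)
  local first, last = string.find(s, needle, 1, true)
  if not first then
    return nil
  end
  return byte_to_char_pos(s, first), byte_to_char_pos(s, last)
end

-- "\195\164" is a-umlaut and "\195\182" is o-umlaut in UTF-8.
print(string.find("\195\164b", "b", 1, true))    -- byte positions: 3 and 3
print(utf8_find_plain("\195\164b", "b"))         -- character positions: 2 and 2
print(utf8_find_plain("\195\164\195\182b", "b")) -- character positions: 3 and 3
```

Inside LuaTeX, the counting loop could presumably be replaced by
calling unicode.utf8.len on the prefix that ends at the found byte
position.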


> --	this is a) faster b) more reliable

But leaves this simple case uncovered. :/

Best regards,
Stephan Hennig
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: utf8_find.lua
URL: <http://tug.org/pipermail/luatex/attachments/20100301/6c7e6373/attachment-0002.pl>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: words.utf8
URL: <http://tug.org/pipermail/luatex/attachments/20100301/6c7e6373/attachment-0003.pl>

