[luatex] problem with slnunicode's find
Stephan Hennig
mailing_list at arcor.de
Mon Mar 1 23:17:48 CET 2010
On 01.03.2010 19:42, Patrick Gundlach wrote:
>> I would expect the positions of 'b' to be 2 and 3, respectively, as
>> those are the lengths of the strings as returned by unicode.utf8.len.
>> However, unicode.utf8.find seems to have a different notion of the
>> length of a string.
>
> It is documented: (Well, sort of, you need to download the slnunicode
> library and look into 'unittest'.)
Thanks for the pointer!
> -- NOTE: find positions are in bytes for all ctypes!
> -- use ascii.sub to cut found ranges!
Hmm, I neither want to cut anything nor do I have a range available.
I just want to count. Attached is my attempt at a UTF-8-aware find
function based on the UTF-8-aware parts of slnunicode. Comments and
improvements are welcome!
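For illustration only, here is a minimal sketch of one way to turn the byte
positions reported by find into character positions. It is not the attached
utf8_find.lua. The names byte_to_char_pos and utf8_find are hypothetical, it
uses only plain Lua string functions rather than slnunicode, and it assumes
the input is valid UTF-8 and that the pattern does not match part of a
multi-byte character.

```lua
-- Hypothetical helper (not part of slnunicode): convert a 1-based byte
-- position in a UTF-8 string into a 1-based character position.
local function byte_to_char_pos(s, byte_pos)
  local count = 0
  for i = 1, byte_pos do
    local b = string.byte(s, i)
    -- Continuation bytes look like 10xxxxxx (0x80..0xBF); every other
    -- byte starts a new character, so only those are counted.
    if b < 0x80 or b >= 0xC0 then
      count = count + 1
    end
  end
  return count
end

-- Hypothetical wrapper: same arguments as string.find (init is still a
-- byte offset), but the returned positions are counted in characters.
local function utf8_find(s, pattern, init, plain)
  local first, last = string.find(s, pattern, init, plain)
  if not first then
    return nil
  end
  return byte_to_char_pos(s, first), byte_to_char_pos(s, last)
end

-- "\195\164" is the UTF-8 encoding of a-umlaut (two bytes).
print(string.find("\195\164b", "b", 1, true))  --> 3  3  (bytes)
print(utf8_find("\195\164b", "b", 1, true))    --> 2  2  (characters)
```

Counting lead bytes keeps the sketch independent of any library, at the cost
of rescanning the string from the start on every call.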
> -- this is a) faster b) more reliable
But it leaves this simple case uncovered. :/
Best regards,
Stephan Hennig
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: utf8_find.lua
URL: <http://tug.org/pipermail/luatex/attachments/20100301/6c7e6373/attachment-0002.pl>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: words.utf8
URL: <http://tug.org/pipermail/luatex/attachments/20100301/6c7e6373/attachment-0003.pl>