[luatex] problem with slnunicode's find
luigi scarso
luigi.scarso at gmail.com
Wed Mar 3 11:52:27 CET 2010
> I wanted to mail you off-list, anyway. It was just late yesterday. Here is
> an example:
>
> str = "ä#Ö"
> print("str: ", str)
>
> -- This considers 'Ö' a single upper-case letter, i.e.,
> -- 'Ö' is one (character) long.
> print('match("%u"): ', unicode.utf8.match(str, "(%u)"))
> -- Like len does.
> print('len("Ö"): ', unicode.utf8.len("Ö"))
>
> -- This returns the byte position of 'Ö' in the string, i.e.,
> -- it considers the length of 'ä' as two (bytes).
> print('match("()%u"): ', unicode.utf8.match(str, "()%u"))
> -- Unlike len.
> print('len("ä"): ', unicode.utf8.len("ä"))
>
>> >texlua empty.lua
>> str: ä#Ö
>> match("%u"): Ö
>> len("Ö"): 1
>> match("()%u"): 4
>> len("ä"): 1
>
> Note that the empty capture () doesn't return the matched text but the
> position in the string where the match occurs, similar to find. So it is no
> surprise that it returns byte positions. One can argue, though, whether that
> is documented behaviour.
Good -- I will check it today
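
If a character index is needed rather than a byte offset, the position reported by the empty capture can be converted by counting the UTF-8 lead bytes up to it. A minimal sketch in plain Lua, assuming valid UTF-8 input; the helper name byte_to_char_pos is hypothetical and not part of slnunicode:

```lua
-- Hypothetical helper (not part of slnunicode): convert a 1-based byte
-- position into a 1-based character position, assuming s is valid UTF-8.
local function byte_to_char_pos(s, byte_pos)
  local chars = 0
  for i = 1, byte_pos do
    local b = s:byte(i)
    -- Count every byte that starts a character, i.e. skip UTF-8
    -- continuation bytes, which lie in the range 0x80..0xBF.
    if b < 0x80 or b >= 0xC0 then
      chars = chars + 1
    end
  end
  return chars
end

-- "ä#Ö" written as explicit UTF-8 bytes, so the result does not depend
-- on the encoding of this source file.
local str = "\195\164#\195\150"
print(byte_to_char_pos(str, 4)) --> 3
```

For the example string this maps the byte position 4 reported by match("()%u") to character position 3.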
--
luigi