[luatex] problem with slnunicode's find

luigi scarso luigi.scarso at gmail.com
Wed Mar 3 11:52:27 CET 2010


> I wanted to mail you off-list, anyway.  It was just late yesterday. Here is
> an example:
>
> str = "ä#Ö"
> print("str: ", str)
>
> -- This considers 'Ö' a single upper-case letter, i.e.,
> -- 'Ö' is one (character) long.
> print('match("%u"): ', unicode.utf8.match(str, "(%u)"))
> -- Like len does.
> print('len("Ö"): ', unicode.utf8.len("Ö"))
>
> -- This returns the byte position of 'Ö' in the string, i.e.,
> -- it considers the length of 'ä' as two (bytes).
> print('match("()%u"): ', unicode.utf8.match(str, "()%u"))
> -- Unlike len.
> print('len("ä"): ', unicode.utf8.len("ä"))
>
>> >texlua empty.lua
>> str:    ä#Ö
>> match("%u"):    Ö
>> len("Ö"):      1
>> match("()%u"):  4
>> len("ä"):      1
>
> Note that the empty capture () doesn't return the matched text, but the
> position within the string where the match occurred, similar to find.  So it
> is no surprise that it returns byte positions.  But one can argue about
> whether that is documented behaviour.
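
If a character position is wanted instead, the byte position reported by the empty capture can be converted by hand. What follows is only a minimal sketch in plain Lua; byte_to_char_pos is a hypothetical helper, not part of slnunicode, and it assumes the string is valid UTF-8. The test string is written with decimal escapes so the snippet does not depend on the encoding of the source file.

```lua
-- Hypothetical helper (not part of slnunicode): convert a 1-based byte
-- position in a valid UTF-8 string to a 1-based character position.
local function byte_to_char_pos(s, byte_pos)
  local chars = 0
  for i = 1, math.min(byte_pos, #s) do
    local b = s:byte(i)
    -- Continuation bytes have the form 10xxxxxx (0x80..0xBF);
    -- every other byte starts a new character.
    if b < 0x80 or b >= 0xC0 then
      chars = chars + 1
    end
  end
  return chars
end

-- "ä#Ö" spelled out as UTF-8 bytes: ä = C3 A4, # = 23, Ö = C3 96.
local str = "\195\164#\195\150"

-- 4 is the byte position reported by match("()%u") in the output above.
print(byte_to_char_pos(str, 4))
```

For the example string this maps byte position 4 to character position 3, which agrees with the character-based counting that len uses.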

Good -- I will check it today


-- 
luigi


