[luatex] Behavior of slnunicode.utf8.match().

Patrick Gundlach patrick at gundla.ch
Mon Aug 8 10:11:45 CEST 2011


Hi Paul,

this is how _I_ understand slnunicode:

"." always matches a single byte, because the string functions (below) also work with arbitrary binary data. 


	• find
	• match
	• gmatch
	• gsub

In slnunicode these functions have different character classes, which follow the Unicode general categories (http://www.unicode.org/Public/4.0-Update1/UCD-4.0.1.html#General_Category_Values)

So you should use %a or a similar class to match a whole UTF-8 encoded character. Or you can use unicode.utf8.sub(str,n,n) to get the nth UTF-8 character.
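
A minimal sketch of the byte versus character distinction, not taken from slnunicode itself. The first two matches use only the standard Lua string library, where "." is certainly a single byte. The byte-level pattern in utf8_char is a common idiom I am swapping in for illustration (it assumes valid UTF-8 input without NUL bytes), and the variable names are made up. The last part only runs where the slnunicode table `unicode` exists, e.g. under texlua.

```lua
-- "äöü" written with decimal escapes, so the encoding of this file does not matter.
-- Each of the three characters is two bytes in UTF-8.
local s = "\195\164\195\182\195\188"

-- Standard string library: "." is one byte, so this is only half of "ä".
local first_byte = string.match(s, ".")
print(#first_byte)   -- 1

-- Byte-level pattern for one whole UTF-8 sequence:
-- an ASCII or lead byte, followed by any continuation bytes.
local utf8_char = "[\1-\127\194-\244][\128-\191]*"
local first_char = string.match(s, utf8_char)
print(#first_char)   -- 2

-- slnunicode (only available where `unicode` is defined, e.g. in LuaTeX):
-- sub() counts in characters, so this asks for the second character, "ö".
if unicode and unicode.utf8 then
  print(unicode.utf8.sub(s, 2, 2) == "\195\182")
end
```

In plain Lua the guarded block is simply skipped, so the sketch runs with or without LuaTeX.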

So I'd consider this correct behavior, but I have had some discussion on this before where I am pretty alone with my opinion... :) 

http://tug.org/pipermail/luatex/2010-March/thread.html#1242


Patrick



