[luatex] problem with slnunicode's find
luigi scarso
luigi.scarso at gmail.com
Wed Mar 3 01:57:38 CET 2010
On Tue, Mar 2, 2010 at 9:28 PM, Khaled Hosny <khaledhosny at eglug.org> wrote:
> What you are saying is absolute nonsense; a UTF-8-aware function
> that acts on byte sequences is broken, period. There is nothing
> to argue here.
#> man perlre
\C Match a single C char (octet) even under Unicode.
NOTE: breaks up characters into their UTF-8 bytes,
so you may end up with malformed pieces of UTF-8.
Unsupported in lookbehind.
# perl -e 'use utf8; binmode(STDOUT, ":utf8"); $s="\x{0110}";
  $s =~ m/(\C)(\C*)/; print "$s: 1=",length($1)," 2=",length($2),"\n"'
Đ: 1=1 2=1
#
I would argue that it is not nonsense: byte-level matching inside a
Unicode-aware function has its uses, but only for experts.
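For readers without Perl at hand, here is a rough sketch of the same point in Python (not Perl or Lua, and not slnunicode's API). It splits the UTF-8 encoding of U+0110 after its first byte, which is roughly what the two \C groups do above, and checks that neither piece decodes on its own. The helper name split_first_byte is made up for this illustration.

```python
def split_first_byte(text):
    """Split the UTF-8 encoding of text into (first byte, remaining bytes).

    Hypothetical helper for illustration only; it mimics matching one
    octet and then the rest, ignoring character boundaries.
    """
    encoded = text.encode("utf-8")
    return encoded[:1], encoded[1:]


def is_valid_utf8(data):
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


s = "\u0110"  # LATIN CAPITAL LETTER D WITH STROKE, two bytes in UTF-8
first, rest = split_first_byte(s)

# One character, but two octets once encoded.
print(len(s), len(first), len(rest))

# Each half is a malformed piece of UTF-8; only the whole is valid.
print(is_valid_utf8(first), is_valid_utf8(rest), is_valid_utf8(first + rest))
```

So the byte-level split is well defined and sometimes useful, but the caller has to know that the pieces are octets, not characters.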
--
luigi