On Wed, Mar 03, 2010 at 01:57:38AM +0100, luigi scarso wrote:
> On Tue, Mar 2, 2010 at 9:28 PM, Khaled Hosny <khaledhosny at eglug.org> wrote:
> > What you are saying is absolute nonsense;
> > a UTF-8-aware function that acts on byte sequences is broken,
> > period. There is nothing to argue here.
> #> man perlre
>   \C  Match a single C char (octet) even under Unicode.
>                NOTE: breaks up characters into their UTF-8 bytes,
>                so you may end up with malformed pieces of UTF-8.
>                Unsupported in lookbehind.

But, AFAICS, \C is a documented special case, so it doesn't count here.
unicode.utf8.find() has only a bytes mode, and IMNSHO that makes it broken.
(And we already have string.find() for this kind of functionality.)
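
To make the byte-versus-character distinction concrete, here is a small
sketch. It is in Python, not Lua, and it does not call unicode.utf8.find()
or string.find(); the sample string and the helper is_valid_utf8() are
made up purely for illustration.

```python
# Illustration only: Python stands in for the Lua functions under discussion.
text = "na\u00efve caf\u00e9"   # "naïve café", two non-ASCII characters
needle = "caf\u00e9"

# Character-oriented search: the index counts code points.
char_index = text.find(needle)

# Byte-oriented search on the UTF-8 encoding: the index counts octets.
data = text.encode("utf-8")
byte_index = data.find(needle.encode("utf-8"))


def is_valid_utf8(chunk: bytes) -> bool:
    """Return True if chunk decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        chunk.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The two indices differ because "ï" occupies two bytes in UTF-8.
print(char_index, byte_index)

# Cutting at a byte offset that falls inside "ï" leaves malformed UTF-8,
# which is the hazard the perlre note on \C warns about.
print(is_valid_utf8(data[:3]), is_valid_utf8(data[:4]))
```

A caller who takes the byte offset and uses it as a character position
(or the other way round) ends up in the wrong place as soon as the text
before the match contains a multi-byte character.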


