[luatex] [OT] The consumption of an input string.
luigi scarso
luigi.scarso at gmail.com
Tue Jun 18 14:18:36 CEST 2013
On Tue, Jun 18, 2013 at 12:41 PM, Paul Isambert <zappathustra at free.fr> wrote:
> But then “abc” should be represented as “[ϵ][a][ϵ][b][ϵ][c][ϵ]” and
> “string.gsub("abc", ".*", "(%0)")” should return “()(abc)” or
> something like that? I’ll admit I can’t really get my head around it.
>
abc =
target = [ϵ][a][ϵ][b][ϵ][c][ϵ]
pattern = ϵ | [^ϵ]+ (not sure about \n for Lua)
ϵaϵbϵc = abc = target[1]&target[2]&...&target[6] (low-level loop, greedy) =>
(ϵaϵbϵc) = (abc)
ϵ = target[7] => (ϵ) = ()
Why not the ϵ after c?
From http://perldoc.perl.org/perlre.html
"By default, a quantified subpattern is "greedy", that is, it will match as
many times as possible (given a particular starting location) while still
allowing the rest of the pattern to match"
or, as in the section
Repeated Patterns Matching a Zero-length Substring:
"The lower-level loops are *interrupted* (that is, the loop is broken) when
Perl detects that a repeated expression matched a zero-length substring. "
Here the lower-level loops are those associated with the greedy
quantifiers *, + and {}.
In this case the ϵ after c is the zero-length substring and the string is
finished, so the first match is abc.
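To make the bookkeeping concrete, here is a small Python sketch of the ϵ model used above. It is only an illustration: epsilon_slots is a name made up for this example, and Python lists are 0-based, so target[1]..target[6] in the notation above becomes the slice target[0:6] here. The last lines ask Python's re module for all matches of .* in "abc"; it is used as a stand-in, not as a statement about Lua or Perl internals.

```python
import re

def epsilon_slots(s):
    """Interleave the characters of s with empty strings: [ϵ][a][ϵ]...[ϵ]."""
    slots = [""]
    for ch in s:
        slots.extend([ch, ""])
    return slots

target = epsilon_slots("abc")      # 7 slots: ϵ a ϵ b ϵ c ϵ

# The greedy low-level loop consumes the first six slots in one go ...
first_match = "".join(target[:6])  # "abc"
# ... and the seventh slot is the empty string left over at the end.
second_match = target[6]           # ""

# Python's own matcher reports the same two matches for ".*" on "abc".
matches = re.findall(".*", "abc")

print(target)
print(repr(first_match), repr(second_match))
print(matches)
```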
The global switch /g is the higher level loop:
"The higher-level loops preserve an additional state between iterations:
whether the last match was zero-length. To break the loop, the following
match after a zero-length match is prohibited to have a length of zero.
This prohibition interacts with backtracking (see
Backtracking<http://perldoc.perl.org/perlre.html#Backtracking>),
and so the *second best* match is chosen if the *best* match is of zero
length."
which seems to be the case of the last ϵ.
(at least for m; needless to say, I am reading the manual every time...)
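A minimal Python sketch of the higher-level loop, as I read the perlre passage quoted above; it is not Perl's implementation. The function name gsub_perl_like and the %0 placeholder are made up for the illustration. Instead of backtracking to a "second best" match, the sketch simply treats a prohibited zero-length match as a failure at that position, which is enough for simple patterns like .* and a*.

```python
import re

def gsub_perl_like(pattern, template, text):
    """Global substitution where a zero-length match may not directly
    follow another zero-length match (assumed reading of perlre)."""
    rx = re.compile(pattern)
    out = []
    pos = 0
    last_was_empty = False  # state kept between iterations of the /g loop
    while pos <= len(text):
        m = rx.match(text, pos)
        is_empty = m is not None and m.end() == pos
        if m is not None and not (is_empty and last_was_empty):
            # Admissible match: emit the replacement and continue after it.
            out.append(template.replace("%0", m.group(0)))
            last_was_empty = is_empty
            pos = m.end()
        else:
            # No admissible match here: copy one character and move on.
            if pos < len(text):
                out.append(text[pos])
            pos += 1
            last_was_empty = False
    return "".join(out)

print(gsub_perl_like(".*", "(%0)", "abc"))  # (abc)()
print(gsub_perl_like("a*", "ITEM", ";a;"))  # ITEM;ITEMITEM;ITEM
```

The empty match after c is accepted because the match before it (abc) was not zero-length; only the attempt after that one is refused, which ends the loop.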
> Well, here things become strange. The regex in Vim script is based on
> Perl, as far as I can tell. Now
>
> substitute('abc', '.*', '(\0)', 'g')
>
> returns the not-Perl-like “(abc)” instead of “(abc)()”, however
>
> substitute(';a;', 'a*', 'ITEM', 'g')
>
> returns the Perl-like “ITEM;ITEMITEM;ITEM” instead of the expected
> not-Perl-like “ITEM;ITEM;ITEM”.
>
> Well, I think I’ll write to the Vim list! :)
>
> I've tried with \zs\ze .. no luck.
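For comparison, here is a Python sketch of one rule that reproduces the two results Paul calls not-Perl-like, “(abc)” and “ITEM;ITEM;ITEM”: an empty match is refused whenever it starts exactly where the previous match ended. This is only an assumption chosen to fit those two outputs; I am not claiming it is what Vim or Lua actually implement. The name gsub_no_adjacent_empty and the %0 placeholder are invented for the example.

```python
import re

def gsub_no_adjacent_empty(pattern, template, text):
    """Global substitution that refuses an empty match starting where the
    previous match (empty or not) ended. Hypothetical rule, for comparison."""
    rx = re.compile(pattern)
    out = []
    pos = 0
    prev_end = -1  # end position of the last accepted match, -1 if none yet
    while pos <= len(text):
        m = rx.match(text, pos)
        is_empty = m is not None and m.end() == m.start()
        if m is not None and not (is_empty and pos == prev_end):
            # Accepted match: emit the replacement and continue after it.
            out.append(template.replace("%0", m.group(0)))
            prev_end = m.end()
            pos = m.end()
        else:
            # Refused or no match: copy one character and move on.
            if pos < len(text):
                out.append(text[pos])
            pos += 1
    return "".join(out)

print(gsub_no_adjacent_empty(".*", "(%0)", "abc"))  # (abc)
print(gsub_no_adjacent_empty("a*", "ITEM", ";a;"))  # ITEM;ITEM;ITEM
```

Under this rule the ϵ after c is never matched on its own, because it sits right at the end of the match abc.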
--
luigi