[luatex] [OT] The consumption of an input string.
Paul Isambert
zappathustra at free.fr
Mon Jun 17 23:42:42 CEST 2013
Dirk Laurie <dirk.laurie at gmail.com> wrote:
> 2013/6/17 Paul Isambert <zappathustra at free.fr>:
>
> > This is not really a LuaTeX question, but I ask it here anyway since a
> > lot of knowledgeable people read this list.
> >
> > I’ve been surprised to discover that
> >
> > print(string.gsub('abc', '.*', '(%0)'))
> >
> > returns
> >
> > (abc)()
> >
> > (similarly, “string.gmatch('abc', '.*')” returns two matches). I’d
> > expect
> >
> > (abc)
> >
> > since the string is completely consumed after the first match and
> > there’s no reason to try matching any further. I thought it was a Lua
> > quirk but then in Ruby
> >
> > puts 'abc'.gsub(/.*/, '(\0)')
> >
> > returns the same thing. On the other hand, “(abc)” is returned as
> > expected (by me) with
> >
> > echo substitute('abc', '.*', '(\0)', 'g')
> >
> > in Vim script and
> >
> > import re
> > print re.sub(re.compile('(.*)'), '(\\1)', 'abc')
> >
> > in Python and
> >
> > echo "abc" | sed 's/.*/(\0)/g'
> >
> > with sed (I’m not familiar with Python and sed, so the last two examples
> > are only tentative).
>
> In my opinion this is a case of an early implementation of regular
> expressions (possibly of Perl) becoming a de facto standard. Nobody
> realized at the time that there is an ambiguity, and it is too late
> to change now.
>
> Perl has since spelt it out, casting in concrete the behaviour you
> (and I) consider counter-intuitive, but many other languages just
> leave the issue vague.
>
> LuaTeX does it that way because Lua does it that way. There was a
> discussion on this very topic on the Lua users list about a month
> ago, people weighed in with arguments on both sides, and nothing
> will change.
Thank you Dirk for the explanation. I find the whole thing terribly
counter-intuitive. The following:
local c = 0
for match in string.gmatch("a,b,c", "[^,]*") do
  c = c + 1
  print(c, match)
end
results in 6 matches!
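On a Lua that reports 6 matches here, the extra three are empty strings found at the positions where "a", "b" and "c" end. Below is a minimal sketch of two ways to get the three fields one would expect, assuming the goal is simply to split on commas. The helper names split_nonempty and split_keep_empty are hypothetical, introduced only for illustration; they are not part of Lua or LuaTeX.

```lua
-- Hypothetical helpers, not part of Lua's standard library.

-- Variant 1: use "+" instead of "*", so the pattern can never match
-- the empty string. Empty fields are skipped.
local function split_nonempty(s)
  local fields = {}
  for field in string.gmatch(s, "[^,]+") do
    fields[#fields + 1] = field
  end
  return fields
end

-- Variant 2: append a trailing separator and make the comma part of
-- the pattern, so every match consumes at least one character.
-- Empty fields are kept as empty strings.
local function split_keep_empty(s)
  local fields = {}
  for field in string.gmatch(s .. ",", "([^,]*),") do
    fields[#fields + 1] = field
  end
  return fields
end

print(#split_nonempty("a,b,c"))    -- 3
print(#split_keep_empty("a,b,c"))  -- 3
print(#split_nonempty("a,,c"))     -- 2
print(#split_keep_empty("a,,c"))   -- 3
```

The first variant is enough when empty fields cannot occur; the second is probably the safer choice when they can, since it keeps them.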
For those interested in the discussion mentioned (and actually
launched) by Dirk, here it is:
http://lua-users.org/lists/lua-l/2013-04/msg00812.html
Best,
Paul