[luatex] String manipulation in Lua.

Philipp Gesang pgesang at ix.urz.uni-heidelberg.de
Fri Dec 3 00:22:33 CET 2010


On 2010-12-02 <20:59:51>, Paul Isambert wrote:
> local sub, gsub = string.sub, string.gsub
> function isub (str, pattern, replace, index, num)
>   -- Extract the suffix starting at given index
>   local s1 = sub(str, index)
>   -- Make the replacement on the suffix
>   local s2 = gsub(s1, pattern, replace, num)
>   -- Replace the suffix in the string with it modified version
>   return gsub(str, s1 .. "$", s2)
> end
> 
> It works, but I find the solution an overkill for what seems to be a
> basic operation. So, as I like to ask: have I missed something?

Hi Paul,

“string.sub” takes an optional third argument (the end index), so you
can cut off the untouched prefix, run gsub on the rest only, and glue
the two parts back together:

···8<····························································

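local sub, gsub = string.sub, string.gsub
function isub (str, pattern, replace, index, num)
  -- Everything before index is kept as is
  local left  = sub(str, 1, index-1)
  -- Only the part from index onwards goes through gsub; assigning to a
  -- single local keeps the string and drops gsub's match count
  local right = sub(str,index):gsub(pattern, replace, num)
  return left .. right
end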

print(isub("abc)abc%def-def",   "b",   "B", 4))
print(isub("abc)abc%def-def", "def", "FED", 1, 1))

···8<····························································
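
As a quick check, this is what I would expect the two calls to print.
The definition is repeated only so that the snippet runs on its own;
the expected strings are worked out by hand from the split at index.

```lua
local sub, gsub = string.sub, string.gsub

-- Same approach as above: keep the prefix, substitute in the suffix only.
function isub (str, pattern, replace, index, num)
  local left  = sub(str, 1, index - 1)
  local right = gsub(sub(str, index), pattern, replace, num)
  return left .. right
end

-- Only the "b" at or after position 4 is touched.
print(isub("abc)abc%def-def",   "b",   "B", 4))     --> abc)aBc%def-def
-- Starting at position 1, with at most one replacement.
print(isub("abc)abc%def-def", "def", "FED", 1, 1))  --> abc)abc%FED-def
```

Note that the subject string is never used as a pattern here, so the
“)” and “%” in it cause no trouble.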

If you run into problems with magic characters in patterns,
the ConTeXt helper libs offer some assistance:
http://wiki.contextgarden.net/String_Manipulation#string.escapedpattern.28string.29_.7C_string.partialescapedpattern.28string.29
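
If the helper libs are not at hand, a minimal sketch in plain Lua could
look like the following. The name escape_pattern is made up for this
example and is not the ConTeXt function; it simply puts a “%” in front
of every punctuation character, which Lua patterns accept as an escape
for any non-alphanumeric character.

```lua
-- Hypothetical helper (not part of Lua or ConTeXt): escape every
-- punctuation character so that the string matches itself literally.
function escape_pattern (str)
  -- "%p" matches punctuation; "%%%0" is a literal "%" plus the match.
  -- The parentheses drop gsub's second return value (the match count).
  return (str:gsub("%p", "%%%0"))
end

local subject = "abc)abc%def-def"

-- Unescaped, "%d" means "a digit", so the literal text is not found.
print(subject:find("%def"))                  --> nil
-- Escaped, the pattern becomes "%%def" and matches literally.
print(subject:find(escape_pattern("%def")))  --> 8  11
```

This only covers the pattern argument; a “%” in the replacement string
of gsub has its own meaning and would need separate care.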

Regards, Philipp


PS: What’s the problem with lpeg, anyways?

> (Yes, probably LPeg, but I don't want to go into that for the
> moment.)

···8<····························································

local lpeg = require "lpeg"
local Cmt, Cs, P, V = lpeg.Cmt, lpeg.Cs, lpeg.P, lpeg.V

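local function lpeg_gsub (str, pattern, replacement, threshold, limit)
    local idx = 1                      -- current position in str
    local threshold = threshold or 0   -- first position where replacing is allowed
    local sub_cnt = 0                  -- replacements made so far

    local peg = P{
        [1] = "initial",

        -- Cs rebuilds the subject, substituting whatever "p" captured
        initial = Cs((V"p" + V"other")^0),

        -- Match-time capture: returning false makes "p" fail at this
        -- position, so "other" copies one character instead
        p = Cmt(P(pattern), function (_,_, matched)
                if idx >= threshold and
                   (limit == nil or sub_cnt < limit)
                   then
                    idx = idx + #matched
                    sub_cnt = sub_cnt + 1
                    return true
                end
                return false
            end) / replacement,

        -- Any other character is kept; only the position is advanced
        other = Cmt(1, function () idx = idx + 1 return true end)
    }

    --peg:print()
    return peg:match(str)
end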


local test = "abcdefg abcdefg abcdefg abcdefg"

print(lpeg_gsub(test, "def", "FED", 10))
print(lpeg_gsub(test, "def", "FED", 07, 2))
print(lpeg_gsub(test, "def", "FED", 15))

io.write("\n")

print(lpeg_gsub(test,   "a",   "A", 1, 2))
print(lpeg_gsub(test,   "b",   "B", 4, 1))
print(lpeg_gsub(test,   "c",   "C", 9, 0))
print(lpeg_gsub(test,   "d",   "D", 9))

io.write("\n")

local p1 = P"a" * P(1 - P"g")^1 * P"g"
local p2 = lpeg.S"bdf"
local p3 = lpeg.R"dg"

print(lpeg_gsub(test, p1, "O", 9, 1))
print(lpeg_gsub(test, p2, "O", 1, 3))
print(lpeg_gsub(test, p3, "O", 22))

···8<····························································


-- 
()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments