[XeTeX] Adding some basic programming functionality (mostly ICU string operations) to XeTeX?

Michiel Kamermans pomax at nihongoresources.com
Wed Dec 23 13:55:16 CET 2009


Hi all,

in addition to the things that already make XeTeX unique 
(intercharclass behaviour, system fonts), could a request for string 
operations be considered? There are two things I'm missing in TeX: one 
is the ability to perform normal arithmetic, the other is string 
operations. I expect both to be properly tackled by the LuaTeX 
project, except that's roadmapped to be "done" in 2012...

How hard would it be to implement at least string operations in XeTeX 
with a few additional special XeTeX macros? I'm thinking of commands 
along the lines of:

\XeTeXissubstring{fragment}{string} - true if fragment in string, false 
if not ('needle in haystack' argument ordering)
\XeTeXcountsubstrings{fragment}{string} - 0 if fragment not in string, n 
if fragment in string n times. (This is different functionality from 
issubstring: issubstring should not rely on countsubstrings > 0 =)
\XeTeXsubstring{string}{pos}{length} - generates a substring starting at 
(Unicode) character position 'pos' (starting at 0?) of length 'length'.
\XeTeXstringlength{string} - gives the string length (in terms of 
Unicode characters, not bytes)
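
To pin down the intended semantics of the four commands above, here is a rough sketch in Python rather than TeX. The function names are made up for illustration and only mirror the proposed \XeTeX... names. It assumes 0-based positions, lengths counted in Unicode code points, and non-overlapping counting, none of which the proposal fixes.

```python
# Hypothetical Python stand-ins for the proposed commands; not XeTeX or ICU code.
# Python strings are indexed by Unicode code point, which is the assumed unit here.

def is_substring(fragment, string):
    """\\XeTeXissubstring: needle first, haystack second."""
    return fragment in string

def count_substrings(fragment, string):
    """\\XeTeXcountsubstrings: number of non-overlapping occurrences (assumption)."""
    if not fragment:
        return 0  # assumption: an empty fragment counts as "not found"
    return string.count(fragment)

def substring(string, pos, length):
    """\\XeTeXsubstring: 'length' characters starting at 0-based 'pos' (assumption)."""
    return string[pos:pos + length]

def string_length(string):
    """\\XeTeXstringlength: length in code points, not bytes."""
    return len(string)

print(is_substring("本", "日本語"))
print(count_substrings("ab", "ababab"))
print(substring("日本語", 1, 2))
print(string_length("日本語"))
```

Whether overlapping matches should be counted ("aa" in "aaaa" gives 2 here, not 3) is exactly the kind of detail a real specification would have to settle.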

I know that in plain LaTeX the issubstring and countsubstrings ideas are 
available through the "substr" package, but this package does not offer 
constructive functionality, only evaluative functionality. Having to 
rely on a package that doesn't even come close to proper string 
operations, when the ICU library that XeTeX relies on already offers all 
the Unicode-correct functions that one might want to work with, strikes 
me as rather odd from an end-user point of view =)

ICU also implements regular expression processing, so ideally, in 
addition to simple string operations, there could also be two special 
XeTeX commands for working with regular expressions, such as:

\XeTeXrematch{fragment}{string}{modifiers} - true if matched, false if not
\XeTeXrereplace{search}{replace}{string}{modifiers} - replaces all 
instances of the search pattern with the replace pattern in the given 
string.
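
For concreteness, the behaviour of these two commands could look roughly like the Python sketch below. It uses Python's own `re` module, not ICU, so pattern and replacement syntax differ in places (this sketch writes group references as `\1`). The single-letter modifiers `i`, `m` and `s` are an assumption, since the proposal does not say what the modifiers argument would contain.

```python
import re

# Assumed modifier letters; the proposal leaves the modifier syntax open.
_FLAGS = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}

def _compile(pattern, modifiers=""):
    """Compile a pattern, turning each modifier letter into a regex flag."""
    flags = 0
    for letter in modifiers:
        flags |= _FLAGS[letter]
    return re.compile(pattern, flags)

def re_match(pattern, string, modifiers=""):
    """Stand-in for \\XeTeXrematch: true if the pattern matches anywhere in string."""
    return _compile(pattern, modifiers).search(string) is not None

def re_replace(search, replace, string, modifiers=""):
    """Stand-in for \\XeTeXrereplace: replace every match of 'search' in string."""
    return _compile(search, modifiers).sub(replace, string)

print(re_match("xe.ex", "XeTeX", "i"))
print(re_replace(r"(\w+)@(\w+)", r"\2 at \1", "pomax@example"))
```

Here "matched" means "found somewhere in the string"; if a full-string match is wanted instead, the pattern can be anchored with `^` and `$`.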

(As for arithmetic, a simple set of floating point \XeTeXadd{1}{2}, 
\XeTeXsubtract{1}{2}, \XeTeXmultiply{1}{2}, \XeTeXdivide{1}{2} and 
perhaps \XeTeXpower{1}{2} and \XeTeXroot{1}{2} with arguments 1 and 2 
being pure numerals would already be incredibly useful. For these kinds 
of operations, there is no point in relying on counters, when an 
immediate evaluation can be performed. A little care might be required 
for dealing with nested arithmetic operations, but as long as the 
arguments are purely numerical, inner-most first evaluation should make 
this a non-problem.)
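
The inner-most first idea can be sketched in a few lines of Python, treating a nested expression as plain text and repeatedly replacing every call whose two arguments are already plain numbers. This only illustrates the evaluation order, not how XeTeX would implement it. The argument order for \XeTeXpower (base, exponent) and \XeTeXroot (radicand, degree) is an assumption.

```python
import math
import re

# Assumed meaning of each proposed command; the argument order is a guess.
OPS = {
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
    "multiply": lambda a, b: a * b,
    "divide": lambda a, b: a / b,
    "power": lambda a, b: math.pow(a, b),      # base, exponent
    "root": lambda a, b: math.pow(a, 1.0 / b), # radicand, degree
}

# An innermost call: both arguments contain no braces and no backslashes.
INNERMOST = re.compile(
    r"\\XeTeX(add|subtract|multiply|divide|power|root)"
    r"\{([^{}\\]+)\}\{([^{}\\]+)\}"
)

def evaluate(expr):
    """Evaluate nested \\XeTeX<op>{a}{b} calls, innermost calls first."""
    def step(match):
        result = OPS[match.group(1)](float(match.group(2)), float(match.group(3)))
        return repr(result)  # put the numeric result back as text

    while True:
        expr, replaced = INNERMOST.subn(step, expr)
        if replaced == 0:
            break
    return float(expr)  # raises ValueError if anything non-numeric is left over

print(evaluate(r"\XeTeXadd{1}{\XeTeXmultiply{2}{3}}"))
```

Each pass removes one level of nesting, so the loop ends once only a single number remains.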

Would this be difficult?

- Mike "Pomax" Kamermans
nihongoresources.com

