[XeTeX] string manipulation macros

Ross Moore ross at ics.mq.edu.au
Sun Oct 19 23:42:09 CEST 2008

Hi Michiel,

On 20/10/2008, at 6:54 AM, Michiel Kamermans wrote:

> Ross,
>> It sounds like TeX's pattern-matching, and \ifx conditional,
>> and other primitives such as \futurelet, are just what you want.
>> Several LaTeX packages rely on per-character parsing, to decide
>> what to do with specific kinds of data. Examine the internal
>> coding of these packages to see how they work.
>> Off the top of my head I can think of:
>>    hyperref.sty   converting the input stream to PDF strings
>>       e.g. for Bookmarks and other PDF elements;
>>    Xy-pic  is full of parsers that interpret character strings
>>            as programming code to include graphic elements.
>>    musictex.tex
>>    chess.sty
>>       and similar packages, need this kind of processing.
> Good recommendations, although my feature suggestion still stands...

Have you seen  LuaTeX :   http://www.luatex.org/  ?

There have

> from a programming point of view, the standard TeX programming language
> is quite a pain (even Knuth admitted he'd have done it differently if he
> had the programming languages available that we do these days =). If
> there were a few simple wrapper macros available it would certainly make
> writing new packages that rely on string manipulation a remarkable lot
> less painful.

Another approach is to use TeX's \openout  and \write  primitives to
export data strings into a text file, then use the \write18  mechanism
to run an external (filter) command on that file's contents.
Have the filter write its result into a new file, then read this
back in using the \openin  and \read  primitives.
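
As a rough illustration of that round trip, here is a minimal plain TeX
sketch. It is only one way this might be set up, under several
assumptions: a Unix-like system, \write18 enabled (for example by running
with the -shell-escape option), and the standard `tr` utility standing in
for a real filter. The macro names \filterstring and \filtered, the
stream names \strout and \strin, and the file names strdata.tmp and
strdata.out are all hypothetical, chosen just for this example.

```tex
% Plain TeX sketch. Assumes a Unix-like system and that \write18 is
% enabled, e.g.  xetex -shell-escape demo.tex
% All macro, stream and file names below are hypothetical.
\newwrite\strout
\newread\strin

\def\filterstring#1{%
  % 1. export the string to a text file
  \immediate\openout\strout=strdata.tmp
  \immediate\write\strout{#1}% note: \write expands macros in #1
  \immediate\closeout\strout
  % 2. run an external filter; `tr` here merely upper-cases the text
  \immediate\write18{tr a-z A-Z < strdata.tmp > strdata.out}%
  % 3. read the first line of the result back into \filtered
  \openin\strin=strdata.out
  \ifeof\strin
    \gdef\filtered{}% no output file: the filter did not run
  \else
    \begingroup
      \endlinechar=-1 % avoid a trailing space from the end of line
      \global\read\strin to \filtered
    \endgroup
  \fi
  \closein\strin
}

\filterstring{hello world}
The filter returned: \filtered
\bye
```

Only the first line of the filter's output is read here; a filter that
returns several lines would need a loop around \read that tests \ifeof
each time. The \ifeof test after \openin also gives a simple fallback
when shell escape is disabled and no output file exists.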

This gives you access to whatever string manipulation programs exist
already on your computer system, without the need for any of them
to become an integral part of the TeX program itself.

This is probably the most flexible way for you to do what you want,
since you can write the filtering programs as shell scripts, or as
Perl/Python/Ruby/Lua/awk/sed/etc. programs, completely independent of TeX.
Do all programming and debugging without the compatibility restrictions
of development within TeX's macro language.
Then you just need to program within TeX sufficient logic to decide
when you need external support, and which of the filters to run
on a particular portion of your input.

One problem with doing it this way, though, is that your coding
is not easily portable to another platform or OS, where a different
set of tools might be available, using different syntax.

I've not actually used XeTeX's  \XeTeXinterchartoks  and  \XeTeXcharclass
primitives. Can these be used to put control sequences between characters,
which then initiate further programming with the subsequent tokens
as data?  Such as initiating a particular filter, as discussed above?

I suspect this is not the case --- but if it were, then this might
help in specifying which filters to run at appropriate places
within your input stream. Worth a try, to see if it works.
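
For anyone who wants to try it, a small test along the following lines
might settle the question. This is an untested sketch for plain XeTeX,
not a known-good recipe: the class number 4 and the macro names
\startdigits and \stopdigits are arbitrary choices made for the
example, and the two macros merely typeset brackets as stand-ins for
"start a filter here" and "stop it here".

```tex
% Plain XeTeX sketch. The class number 4 and the macro names
% \startdigits / \stopdigits are arbitrary choices for this test.
\XeTeXinterchartokenstate=1   % switch the mechanism on

% put the digits 0-9 into character class 4
\count255=`\0
\loop
  \XeTeXcharclass\count255=4
  \ifnum\count255<`\9
  \advance\count255 by 1
\repeat

\def\startdigits{[}% stand-ins for "initiate a filter here"
\def\stopdigits{]}

% tokens to insert at the transitions class 0 -> 4 and class 4 -> 0
\XeTeXinterchartoks 0 4 = {\startdigits}
\XeTeXinterchartoks 4 0 = {\stopdigits}

abc123def
\bye
```

If the brackets appear around the run of digits, then control sequences
can indeed be placed between characters this way. The test input
deliberately contains no spaces, since a word boundary is treated as a
class of its own and would need its own \XeTeXinterchartoks settings.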

> - Mike

Hope this helps,


Ross Moore                                       ross at maths.mq.edu.au
Mathematics Department                           office: E7A-419
Macquarie University                             tel: +61 (0)2 9850 8955
Sydney, Australia  2109                          fax: +61 (0)2 9850 8114
