[XeTeX] string manipulation macros

Michiel Kamermans pomax at nihongoresources.com
Sun Oct 19 12:35:07 CEST 2008

Hi all,

The feature I'm about to suggest would probably require extending the
XeTeX engine itself a little bit, but given what we want to use it for
(as with all TeX, conditional typesetting) it might be a nice idea:

Would there be interest in having access to actual string manipulation
functions, wrapped by macros, in the XeTeX engine? I am thinking
primarily of common things like string lengths, substring selection, a
macro call per character of a string, and glyph string to bytecode
string conversion and vice versa (for glyph comparison and generation).

I mainly ask because right now the fontwrap package I wrote is
implemented in Perl and relies on PerlTeX, which is a wonderfully
"clever" way to mix technologies, but isn't without its own problems
(like... having to rely on Perl). The only reason I did this was because
I needed to compare individual glyphs in sections of text to Unicode
block start and end markers, so that I could insert TeX macros between
characters of different Unicode blocks (if the font definitions for
those blocks were different). Since TeX itself does not offer
by-character processing (the lack of a way to call some macro for every
individual glyph in a string is inconvenient at best), adding built-in
support for substring selection alone should already be enough to
rewrite fontwrap as a pure XeTeX package... and in doing so, it would
probably immediately offer a more flexible form of "character classes"
than the current under-the-surface character classes concept used by
XeTeX (which only has 4 hardcoded classes).
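
To make the by-character comparison above concrete, here is a minimal
sketch in Python. It is not the actual fontwrap code: the block table,
the macro names (\latinfont, \kanafont, \cjkfont) and the function names
are all made up for illustration. Only the code point ranges are real
Unicode block boundaries.

```python
# Hypothetical block table: (first code point, last code point, font macro).
# The ranges are real Unicode blocks; the macro names are invented.
BLOCKS = [
    (0x0000, 0x007F, r"\latinfont"),  # Basic Latin
    (0x3040, 0x309F, r"\kanafont"),   # Hiragana
    (0x30A0, 0x30FF, r"\kanafont"),   # Katakana (same font as Hiragana)
    (0x4E00, 0x9FFF, r"\cjkfont"),    # CJK Unified Ideographs
]


def font_for(ch):
    """Return the font macro for the block containing ch, or None."""
    cp = ord(ch)
    for first, last, macro in BLOCKS:
        if first <= cp <= last:
            return macro
    return None  # no font defined for this character's block


def wrap_fonts(text):
    """Insert a font macro wherever the font for the next character differs."""
    out = []
    current = None
    for ch in text:
        macro = font_for(ch)
        # Only switch when a font is defined and it differs from the active
        # one, so adjacent blocks sharing a font get no extra macro.
        if macro is not None and macro != current:
            out.append(macro + " ")
            current = macro
        out.append(ch)
    return "".join(out)
```

Characters from blocks without a font definition simply keep whatever
font is active at that point, which is one possible choice among several.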

Since all it would require is an internal call from some macro
\substring[3]{input}{start}{length} to the Unicode version of the
built-in substring functions in C/C++, it shouldn't even be a very
time-consuming job (of course, the function would have to be wrapped so
that illegal substring calls fail gracefully, but that's only a minute's
work, if
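
For illustration only, the "fail gracefully" wrapping could behave like
the following Python sketch. The 0-based start index and the choice to
return an empty or shortened string on bad input are assumptions, not
something the proposal fixes; Python strings are indexed by code point,
which stands in here for the "Unicode version" of substring.

```python
def substring(text, start, length):
    """Return up to `length` characters of `text` starting at index `start`.

    Assumes 0-based indexing. Illegal requests (negative start, start past
    the end, non-positive length) yield an empty string instead of an error,
    and a request that runs past the end is silently shortened.
    """
    if start < 0 or length <= 0 or start >= len(text):
        return ""
    return text[start:start + length]


def per_character(text):
    """Split text into single characters using only substring()."""
    return [substring(text, i, 1) for i in range(len(text))]
```

With something like this available, the per-character macro call falls
out of substring selection with a length of 1.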


- Mike "Pomax" Kamermans
