[XeTeX] XeTeX maintenance
Douglas McKenna
doug at mathemaesthetics.com
Mon Apr 27 02:05:45 CEST 2015
Joseph Wright wrote:
> \def\"{0}\expandafter\def\csname^^^^^00022\endcsname{1}
> \ifnum\"=0 \message{tex82}\else\message{newstuff}\fi
I've been working on JSBox, a TeX-language interpreter that is all 21-bit Unicode internally, all the time, but can be configured at run time to accept 8-bit input only. When I implemented a Unicode escape sequence extension for it using double-caret notation, I was unaware of what XeTeX had implemented, so I just used
^^uxxxx (for 16-bit, BMP codes)
^^Uxxxxxx (for all 21-bit Unicode code points)
Seemed straightforward enough.
In the first case, if any one of the four 'x's is not a lowercase hex digit, interpretation reverts to the standard TeX escape sequence ^^u (ASCII '5'), and the four characters that follow, at least one of which is not a hex digit, are read as ordinary input. Similarly for the six-digit case: if at least one of the six characters following ^^U is not a hex digit, ^^U reverts to whatever character it converts to under the standard rule (character code 21).
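The fallback rule described above could be sketched roughly as follows. This is only an illustration in Python, not JSBox's actual code: the names read_caret_escape and classic_caret are hypothetical, the sketch assumes lowercase hex digits are required in the six-digit case as well, and it leaves out both the standard two-hex-digit ^^xx form and any check that the value fits in 21 bits.

```python
LOWER_HEX = set("0123456789abcdef")


def classic_caret(ch):
    """Standard TeX ^^c rule: offset the character code by 64."""
    code = ord(ch)
    return code - 64 if code >= 64 else code + 64


def read_caret_escape(s, i):
    """Decode the escape starting at s[i], where s[i:i+2] == '^^'.

    Returns (character_code, index_of_next_unread_character).
    Hypothetical helper; assumes at least one character follows '^^'.
    """
    marker = s[i + 2]
    width = {"u": 4, "U": 6}.get(marker)
    if width is not None:
        digits = s[i + 3 : i + 3 + width]
        # Extended form only if ALL following characters are lowercase hex.
        if len(digits) == width and all(d in LOWER_HEX for d in digits):
            return int(digits, 16), i + 3 + width
    # Otherwise revert to the standard rule; the characters after the
    # marker are left unread and get tokenized normally.
    return classic_caret(marker), i + 3


if __name__ == "__main__":
    print(read_caret_escape("^^u00e9", 0))
    print(read_caret_escape("^^u00E9", 0))
```

In the second call the uppercase 'E' fails the lowercase-hex test, so only ^^u is consumed and the remaining four characters stay in the input.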
Given that the number of TeX input files using ^^u is likely minuscule, and the number of those that follow the ^^u or ^^U with four or six hex digits is even smaller, it seemed like a worthwhile trade-off of benefit vs. cost, compatibility-wise. Maybe there's something I've not thought through well.
This discussion I just found is both pertinent and frightening, I suppose:
http://stackroulette.com/tex/62725/the-notation-in-various-engines
Doug McKenna