[luatex] Superscript notation for characters.

Paul Isambert zappathustra at free.fr
Mon Sep 26 10:30:34 CEST 2011


Hello all,

Here's another strange behavior I've encountered (or normal behavior 
I've misunderstood):
Both XeTeX and LuaTeX extend the ^^ mechanism so that:

^^^^abcd accesses characters 0-65535;
^^^^^abcde accesses characters 0-1048575;
^^^^^^abcdef (where abcdef is at most 10ffff) accesses characters 0-1114111.

As an illustration:

%%%%
\def\shownumber#1{%
   \count1=`#1
   \immediate\write16{\the\count1}
   }
\shownumber{^^^^ffff} % 65535
\shownumber{^^^^^fffff} % 1048575
\shownumber{^^^^^^10ffff} % 1114111
%%%%
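
As a cross-check that doesn't go through the ^^ mechanism at all, the 
same upper bounds can be read off TeX's ordinary hexadecimal constants. 
This is only a sketch: \showhex is a name I made up for it, and it 
assumes plain TeX catcodes. Note that "-constants want uppercase 
digits, whereas the ^^ notation wants lowercase ones.

%%%%
% \showhex is a hypothetical helper, not part of any format.
\def\showhex#1{%
   \count1="#1
   \immediate\write16{\the\count1}
   }
\showhex{FFFF} % 65535
\showhex{FFFFF} % 1048575
\showhex{10FFFF} % 1114111
%%%%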

Fine. Now I'm trying to understand how those rows of superscripts are 
parsed. All engines (i.e. TeX and PDFTeX too) parse:

\shownumber{^^^^^ }

as "(^^^)(^^ )": first <char 30> (^^^, i.e. ^ - 64), so "30" is sent to 
the terminal, then <char 96> (^^<space>, i.e. <space> + 64), i.e. `, 
which is typeset. But if you remove the braces (and add digits so the 
space isn't replaced with an end-of-line, and for another reason 
explained presently):

\shownumber^^^^^ 123456789

then all engines behave as before, plus they typeset the digits, except 
LuaTeX, which sends 94 to the terminal and typesets only "56789". With

\shownumber^^^^^^ 123456789

the same happens except only "6789" is typeset.
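
For comparison, the two classic rules involved in the braced case can 
be checked one at a time, and as far as I can tell they give the same 
result in every engine. A minimal sketch, assuming plain TeX catcodes 
(the macro is repeated so the snippet stands alone):

%%%%
\def\shownumber#1{%
   \count1=`#1
   \immediate\write16{\the\count1}
   }
\shownumber{^^41} % 65: two lowercase hex digits give the code itself
\shownumber{^^^} % 30: ^ is 94, and 94 - 64 = 30
\shownumber{^^ } % 96: <space> is 32, and 32 + 64 = 96
%%%%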

What I suppose is: LuaTeX sees the "^^^^^" sequence, so it takes the 
next five characters (including the space); but too bad, they don't fit 
hexadecimal notation (because of the space), so it discards them (hence 
their disappearance) and for some strange reason analyses "^^^^^" 
backward, i.e. "^^(^^^)" = "^^<char 30>" = "<char 94>". This does not 
happen with braces because the argument ends and the ^^ replacement 
occurs after that. As for the 6-^ version, it gobbles the next six 
characters and then ends up with <char 94>, though I don't know how.

Well, what's going wrong, LuaTeX, or my head?

Best,
Paul

