[XeTeX] \meaning on an outside BMP character

Qing Lee sobenlee at gmail.com
Sun Jun 30 08:20:13 CEST 2013


Hi,

This is related to the thread below.

http://tug.org/pipermail/xetex/2013-January/023967.html

I want to get the character code of an implicit character token by using
the \meaning trick. However, XeTeX's \meaning does not seem to return the
expected result for a character outside the BMP (>= 0x10000).

My XeTeX version is

XeTeX 3.1415926-2.5-0.9999.3-2013060713 (TeX Live 2013/W32TeX)

Here is an MWE.

\newcount\cnta
\newcount\cntb
\newcount\cntc
\def\GetCharcode{\expandafter\GetCharcodeAux\meaning\testchar}
\let\testchar=^^^^^1d6fc %% \mitalpha
\ifdefined\directlua
\def\GetCharcodeAux#1 #2 #3{\cnta=`#3 }
\GetCharcode
\showthe\cnta %% -> 120572 0x1D6FC
\else
\def\GetCharcodeAux#1 #2 #3#4{\cntb=`#3 \cntc=`#4 }
\GetCharcode
\showthe\cntb %% -> 55349  0xD835
\showthe\cntc %% -> 57084  0xDEFC
%% convert the UTF-16 surrogate pair to a Unicode code point
\cnta=\numexpr ( \cntb - "D800 ) * "400 + ( \cntc - "DC00 ) + "10000 \relax
\showthe\cnta %% -> 120572 0x1D6FC
\fi
\bye

LuaTeX works fine with characters outside the BMP. XeTeX, however, seems
to return the two UTF-16 code units (a surrogate pair) instead of the
actual character.
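
Note that the XeTeX branch above always reads two tokens after
"the character ", so it would not work for a BMP character. Below is a
sketch of a variant that first checks whether a second code unit is
present. The names \GetCharcodeSafe, \GetCharcodeSafeAux and the
delimiter \CharcodeEnd are made up for this illustration, and the sketch
assumes that \meaning yields exactly "the character X", as in the MWE.

%% \CharcodeEnd is only a delimiter; it is never expanded
\def\GetCharcodeSafe{\expandafter\GetCharcodeSafeAux\meaning\testchar\CharcodeEnd}
\def\GetCharcodeSafeAux#1 #2 #3#4\CharcodeEnd{%
  \ifx\relax#4\relax
    %% one token only: #3 is the whole character
    \cnta=`#3 %
  \else
    %% two tokens: treat #3 and #4 as high and low surrogates
    \cnta=\numexpr (`#3 - "D800) * "400 + (`#4 - "DC00) + "10000 \relax
  \fi}

With the values from the MWE the arithmetic is
(0xD835 - 0xD800) * 0x400 = 53 * 1024 = 54272, 0xDEFC - 0xDC00 = 764,
and 54272 + 764 + 65536 = 120572 = 0x1D6FC.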

Is this intentional or an oversight?

Regards,
-- 
Qing Lee

