[XeTeX] xunicode for maths

Jonathan Kew jonathan_kew at sil.org
Wed Feb 15 12:54:34 CET 2006


Hi Will -

On 14 Feb 2006, at 12:27 am, Will Robertson wrote:

> Hello,
>
> As you know, Ross Moore's excellent package xunicode provides macro  
> names for a very large number of unicode characters. (Wait.  
> Jonathan, do I need to call them "glyphs"? Sorry, anyway:)

:)

No, they're Unicode characters. They get mapped to (font-specific)  
glyphs in order to become visible, though.

> I would like to propose a xunicode-like package to provide for  
> mapping macros to unicode maths glyphs, for two reasons:
>  - It allows us to experiment with existing methods, to see how far  
> we can stretch XeTeX's current interface to deal with unicode maths;
>  - At a later date, it provides the groundwork for a proper  
> solution, provided the implementation now is well conceived.

Good to see you tackling this!

> Please find attached a .dtx file that can be compiled with  
> XeTeX+LaTeX (you must have Code2001 installed, the only freely  
> available font I know of that contains Unicode maths characters). It  
> provides a skeleton of what a Unicode maths package might become.

One comment: please consider expressing the character codes you're  
accessing as true Unicode Scalar Value (USV) numbers, rather than as  
pairs of surrogate codes. Surrogates are an implementation detail that  
arises only because we happen to be using the UTF-16 encoding form;  
the characters themselves are best identified by their simple USVs.  
That'll make maintenance a lot easier, too, as the numbers will  
correspond directly to those seen in the Unicode charts, etc.
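
As a rough illustration of the difference (the macro names \mathbfA  
and \mathbfAsurr are made up for this sketch, and \USV is the helper  
from the usv.tex sample in this message), a table entry could be  
written either way:

% - - - - - sketch - - - - -
% U+1D400 MATHEMATICAL BOLD CAPITAL A, named by its scalar value;
% the number matches the Unicode chart directly.
\def\mathbfA{\USV{1D400}}
% The same character named by its UTF-16 surrogate pair; neither
% number appears in the charts. (Hypothetical name, for contrast.)
\def\mathbfAsurr{\char"D835 \char"DC00 }
% - - - - - - end - - - - - -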

To actually render a character from its USV in XeTeX, all you need is  
a macro that identifies supplementary-plane values and converts them  
to surrogate pairs; here's a little sample:

% - - - - - usv.tex - - - - -
%!TEX TS-program = xetex

\def\USV#1{{\uppercase{\count1="#1 }% parse hex digits (case-insensitive)
   \ifnum\count1<"10000 % Basic Multilingual Plane: a single code unit
     \ifnum\count1<"D800 \char\count1 \else
       \ifnum\count1>"DFFF \char\count1 \else
         \errmessage{Isolated surrogate code}\fi \fi
   \else
     \ifnum\count1>"10FFFF
       \errmessage{USV out of range 0000 .. 10FFFF}%
     \else
       \advance\count1 by -"10000 % offset into the supplementary planes
       \count2=\count1 \divide\count2 by 1024 % top 10 bits
       \count3=\count2 \advance\count3 by "D800 % high surrogate
       \multiply \count2 by 1024
       \advance\count1 by -\count2 % low 10 bits
       \advance\count1 by "DC00 % low surrogate
       \char\count3 \char\count1 \fi \fi}}

\font\A="LiSong Pro" at 24pt \A
% CJK extensions
\USV{20094}\USV{20068}\USV{279dd}

\font\B="Code2001" at 24pt \B
% some math alphabet samples
\USV{1d400} \USV{1d41a} \USV{1d434} \USV{1d44e}
\USV{1d468} \USV{1d482} \USV{1d49c} \USV{1d4b6}
\USV{1d4d0} \USV{1d4ea} \USV{1d504} \USV{1d51e}
\USV{1d538} \USV{1d552} \USV{1d56c} \USV{1d586}
\USV{1d5a0} \USV{1d5ba} \USV{1d5d4} \USV{1d5ee}
\USV{1d608} \USV{1d622} \USV{1d63c} \USV{1d656}
\USV{1d670} \USV{1d68a}

% How about some Linear B, a little Cypriot, and Aegean numbers... :)
\USV{10000}\USV{10001}\USV{10002}\USV{10024}\USV{10083}
\USV{10800}\USV{10801}\USV{10802}
\USV{10119}\USV{10111}\USV{10109}

% Old Italic, Gothic, Ugaritic, Deseret, Shavian, Osmanya...
\USV{10300}\USV{10301}\USV{10302}\USV{10303}
\USV{10330}\USV{10331}\USV{10332}\USV{10333}
\USV{10380}\USV{10381}\USV{10382}\USV{10383}
\USV{10400}\USV{10401}\USV{10402}\USV{10403}
\USV{10428}\USV{10429}\USV{1042a}\USV{1042b}
\USV{10450}\USV{10451}\USV{10452}\USV{10453}
\USV{10480}\USV{10481}\USV{10482}\USV{10483}

\end
% - - - - - - end - - - - - -
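
To make the arithmetic in \USV concrete, here is a small checking  
sketch. The macro name \showsurrogates and the file name are made up  
for illustration; it assumes its argument is a valid  
supplementary-plane value (10000 .. 10FFFF) and does no range  
checking. It repeats the same steps but writes the two surrogate  
codes to the log, in decimal, instead of typesetting them:

% - - - - - surrogates.tex - - - - -
%!TEX TS-program = xetex

% Hypothetical helper, not part of usv.tex
\def\showsurrogates#1{{\uppercase{\count1="#1 }%
   \advance\count1 by -"10000
   \count2=\count1 \divide\count2 by 1024
   \count3=\count2 \multiply\count3 by 1024
   \advance\count1 by -\count3 % low 10 bits
   \advance\count2 by "D800 % high surrogate
   \advance\count1 by "DC00 % low surrogate
   \message{#1 -> \the\count2\space\the\count1}}}

% 1D400 - 10000 = D400; D400 div 400 = 35, remainder 000,
% so the pair is D835 DC00 (55349 56320 in decimal).
\showsurrogates{1d400}

\end
% - - - - - - end - - - - - -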


