[XeTeX] Clarification on XeTeX documentation

Doug McKenna doug at mathemaesthetics.com
Thu Dec 12 17:53:00 CET 2019


Two questions:

----------------
Question #1:

In the latest document describing XeTeX extensions, dated 2019-12-09 and available, for instance, at

<https://ctan.math.illinois.edu/info/xetexref/xetex-reference.pdf>,

in section 2.3 "Maths fonts" (currently on page 14), the following sentence needs clarification:

>| In the following commands, ⟨fam.⟩ is a number (0–255) representing
>| font to use in maths. ⟨math type⟩ is the 0–7 number corresponding to
>| the type of math symbol ...

But <fam.> is not a font number (or index).  As the abbreviation <fam.> suggests, it is a font family number (or index), where each font family represents a triplet of loaded fonts, one each for text, script, and scriptscript situations.

And throughout other TeX documentation, the word "class" is used to describe the purpose of a math character, a 3-bit number between 0 and 7.

I suggest this be amended to read:

In the following commands, ⟨fam.⟩ is a number (0–255) of the math font family. ⟨math type⟩ is the 0–7 number corresponding to the class of math symbol ...
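To make the family/class terminology in Question #1 concrete, here is a minimal Python sketch that unpacks a classic 15-bit TeX \mathcode into its class, family, and character fields. It describes the classic TeX layout (3-bit class, 4-bit family, 8-bit character), not XeTeX's wider \Umathcode encoding. The names MathClass and decode_mathcode are hypothetical and invented only for this illustration.

```python
from enum import IntEnum


class MathClass(IntEnum):
    """The eight math classes of classic TeX (a 3-bit number, 0 to 7)."""
    ORDINARY = 0
    LARGE_OPERATOR = 1
    BINARY = 2
    RELATION = 3
    OPENING = 4
    CLOSING = 5
    PUNCTUATION = 6
    VARIABLE_FAMILY = 7


def decode_mathcode(code: int) -> tuple[MathClass, int, int]:
    """Split a classic \\mathcode into (class, family, character).

    Hypothetical helper: assumes the classic layout
    class * 4096 + family * 256 + character, with code <= "7FFF.
    """
    if not 0 <= code <= 0x7FFF:
        raise ValueError("classic math codes lie between 0 and 0x7FFF")
    math_class = MathClass(code >> 12)   # top 3 bits: the class
    family = (code >> 8) & 0xF           # next 4 bits: the font family
    char = code & 0xFF                   # low 8 bits: position in the font
    return math_class, family, char
```

For example, plain TeX assigns "202B to the plus sign; decode_mathcode(0x202B) gives class BINARY, family 0, and character 0x2B. The family selects a triplet of fonts rather than a single font, which is the distinction the suggested wording is meant to preserve.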

----------------
Question #2:

Later on, in various syntax declarations, e.g.,

>| \Umathcode⟨char slot⟩ [=] ⟨math type⟩ ⟨fam.⟩ ⟨glyph slot⟩

one finds the term <glyph slot>.  This is curious, because XeTeX's source code parses this value as an integer, using a procedure named scan_usv_num ("usv" stands for Unicode scalar value).  That routine rejects as illegal any value outside the Unicode range of 0 to "10FFFF.

But "glyph slot" is a term usually used to describe the innards of a font.  It is not the same as a Unicode character/code point/scalar value, which the font would internally map to a glyph slot (or index).  Also, every OpenType font is limited to no more than 2^{16} (65536) glyph slots, so if a glyph slot is really meant, it's concerning that this routine accepts a number that is outside of that range.

If a glyph slot is really what is meant, another problem is that it is then formally possible that a font contains a glyph whose internal slot number, for example, might be "D800 (a legal 16-bit value that scan_usv_num won't complain about).  But "D800 is not a legal Unicode character value; it's a high-surrogate value, used in UTF-16 together with a low-surrogate value to encode a full 21-bit Unicode character value.  "D800 is a Unicode code point, but strictly speaking it is neither a scalar value nor a character value.

So my question is:

What is a proper legal value for a <glyph slot>?  Alternatively, should <glyph slot> be changed in this documentation to something less ambiguous, such as <Unicode character value> or <Unicode code point> or <Unicode scalar value>?
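The three candidate ranges in Question #2 can be compared side by side. The following is a minimal Python sketch, using the Unicode Standard's definitions (a scalar value is any code point outside the surrogate range "D800 to "DFFF) and the 16-bit glyph index limit mentioned above. The function names are hypothetical, and in_scan_range only models the 0 to "10FFFF range check described in this message; it is not taken from XeTeX's source.

```python
MAX_CODE_POINT = 0x10FFFF    # upper bound of the range check described above
MAX_GLYPH_INDEX = 0xFFFF     # at most 2**16 glyph slots per OpenType font


def in_scan_range(n: int) -> bool:
    """Model of the 0.."10FFFF range check (an assumption, not XeTeX code)."""
    return 0 <= n <= MAX_CODE_POINT


def is_surrogate(n: int) -> bool:
    """True for UTF-16 surrogate code points, "D800 through "DFFF."""
    return 0xD800 <= n <= 0xDFFF


def is_scalar_value(n: int) -> bool:
    """A Unicode scalar value is any code point that is not a surrogate."""
    return in_scan_range(n) and not is_surrogate(n)


def fits_glyph_index(n: int) -> bool:
    """True if n could be a 16-bit glyph slot (index) inside a font."""
    return 0 <= n <= MAX_GLYPH_INDEX


def describe(n: int) -> dict[str, bool]:
    """Report how a candidate <glyph slot> value fares under each reading."""
    return {
        "in_scan_range": in_scan_range(n),
        "scalar_value": is_scalar_value(n),
        "glyph_index": fits_glyph_index(n),
    }
```

Under these assumptions, describe(0xD800) passes the range check and fits a glyph index but is not a scalar value, while describe(0x1D465) passes the range check and is a scalar value but cannot be a 16-bit glyph index. The two readings of <glyph slot> therefore disagree on real inputs, which is why the documentation's term matters.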


Doug McKenna
Mathemaesthetics, Inc.


