[XeTeX]   in XeTeX

Ulrike Fischer news3 at nililand.de
Sat Nov 12 12:58:35 CET 2011

On Fri, 11 Nov 2011 16:33:20 +0100, Zdenek Wagner wrote:

> I still do not understand the internal mechanism. I know how
> punctuation is handled in French, the category of a few characters is
> set to 13 and defined as some macros. But how can XeTeX recognize
> whether the space token with category 10 has to be converted to a
> nonbreakable space?

There was once a discussion about spaces on the xetex list starting


I don't know if the code discussed there led to a package or found
its way somehow in the format. 

I asked in this thread how spaces are handled and got this answer
from Jonathan:

>>> %% U+00A0 NO-BREAK SPACE;  
>>> %%   Unicode char for ~.
>>> \catcode`^^^^00a0=\active
>>> \def^^^^00a0{\nobreakspace}

> Are the definitions necessary? That is, how does XeTeX normally
> handle e.g. U+00A0 NO-BREAK SPACE? Can there be a line break
> before or after this input?

XeTeX has no special built-in knowledge about U+00A0 or the various  
other Unicode space-like characters; it will simply "print" them in  
the current font. Which would be fine, except that some fonts fail
to support them, in which case you'll get a .notdef glyph. :(

Defining these in a font-independent way using TeX seems like a good  
idea in general; however, care may be needed to make them work  
correctly in all contexts, particularly when they occur in text that  
ends up going to the LaTeX .aux file, etc., or into PDF bookmarks. I  
haven't really looked into this, not being a serious LaTeX user,
just wondering......
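
As a rough illustration of the font-independent approach described
above, here is a minimal sketch for XeLaTeX. It extends the two
quoted lines to one more space-like character. The mapping chosen
for U+202F and the use of \protected are assumptions made for this
example, not something taken from the original thread, and the
behaviour in the .aux file and in PDF bookmarks has not been checked.
Some formats may already define U+00A0 this way.

```latex
% Minimal sketch, meant to be compiled with xelatex.
\documentclass{article}

% U+00A0 NO-BREAK SPACE: make the character active and map it to
% LaTeX's \nobreakspace, as in the quoted snippet.
% \protected (an e-TeX primitive) is an added assumption: it keeps the
% macro from being expanded inside \edef or \write.
\catcode`^^^^00a0=\active
\protected\def^^^^00a0{\nobreakspace}

% U+202F NARROW NO-BREAK SPACE: assumed mapping to a thin space that
% does not allow a line break.
\catcode`^^^^202f=\active
\protected\def^^^^202f{\leavevmode\nobreak\thinspace}

\begin{document}
% The ^^^^xxxx notation is used so the example stays pure ASCII;
% typing the characters directly should behave the same way.
Figure^^^^00a01 shows a value of 10^^^^202f\%.
\end{document}
```

Because the characters are active, nothing is looked up in the
current font, so a missing glyph should no longer matter for them.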

Ulrike Fischer 
