[XeTeX] Space characters and whitespace

Tobias Schoel liesdiedatei at googlemail.com
Thu Mar 3 16:43:45 CET 2011

On 03.03.2011 14:38, mskala at ansuz.sooke.bc.ca wrote:
> On Thu, 3 Mar 2011, Philip Taylor (Webmaster, Ret'd) wrote:
>> if XeTeX is predicated on the use of Unicode, it should
>> "understand" the semantics of Unicode code points such
>> as u2009 and u202F and just do the right thing without
>> having to hack things through the use of active characters.
> There are really three separate questions here:
> 1.  Should XeTeX accept Unicode's many space characters in the input, as
> opposed to just using existing TeX commands for spacing?
> 2.  If the answer to #1 is "yes," what exactly should be the consequences
> of each space character?  Grouping this under the same question because
> it's related:  to what extent should the font determine the width and
> nature of each space character?
> 3.  If the answer to #1 is "yes," how should this be implemented - in
> particular, should it be in the engine or by macros?
> It sounds like you're saying "yes" to #1 and "in the engine, not in the
> macros" to #3.
> I note we already had some of this discussion just for the ordinary space
> character (U0020) in the monospace/punctuation space thread.  To typeset
> word and sentence spaces properly, XeTeX needs information that it can't
> get from the font.  It seems like similar issues may arise with all the
> other space characters.

I don't think the width of these characters should come from the font, but
Unicode does not specify it either. Among other things, Unicode says:
u2003: EM SPACE: nominally, a space equal to the type size in points
u2005: FOUR-PER-EM SPACE: = mid space
u2009: THIN SPACE: a fifth of an em (or sometimes a sixth) -> u202f
u202f: NARROW NO-BREAK SPACE: a narrow form of a no-break space, 
typically the width of a thin space or a mid space
The EM SPACE does not need to be exactly the type size. All the other
spaces are defined relative to the EM SPACE, so XeTeX could stick to
these relative values during justification. This should be easy for the 
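
To make the "relative to the EM SPACE" idea concrete, here is a minimal sketch in Python (not XeTeX code). The fractions are assumptions read off the Unicode annotations quoted above; in particular, a fifth of an em for THIN SPACE and NARROW NO-BREAK SPACE is only one of the widths Unicode allows. The names EM_FRACTION and space_width are hypothetical.

```python
from fractions import Fraction

# Assumed widths as fractions of one em. Illustrative only: Unicode gives
# nominal descriptions, not binding values.
EM_FRACTION = {
    0x2003: Fraction(1, 1),  # EM SPACE
    0x2005: Fraction(1, 4),  # FOUR-PER-EM SPACE
    0x2009: Fraction(1, 5),  # THIN SPACE (sometimes a sixth)
    0x202F: Fraction(1, 5),  # NARROW NO-BREAK SPACE (assumed thin-space width)
}


def space_width(codepoint, em_size_pt):
    """Return the width in points of a space, scaled from the em size."""
    return EM_FRACTION[codepoint] * em_size_pt


print(float(space_width(0x2009, 10)))  # 2.0
```

Whatever size the em ends up having, the ratios between the spaces stay fixed, which is the only property this sketch relies on.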

The real problem is more of a semantic nature: when should a space be
stretched during justification and when not? The decimal space in the
number 123 456 should not be stretched, but the space between the numbers
123 and 456 should be. How can XeTeX tell which is the case?
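
One conceivable convention, sketched in Python purely as an assumption and not as anything XeTeX does, is to let the code point carry the semantics: the ordinary space U+0020 may stretch, while u2009 and u202f stay rigid. The set RIGID_SPACES and the function stretchable_positions are hypothetical names.

```python
# Hypothetical convention: these spaces keep their width during justification.
RIGID_SPACES = {0x2009, 0x202F}


def stretchable_positions(text):
    """Return the indices of spaces that may stretch during justification."""
    return [
        i for i, ch in enumerate(text)
        if ch.isspace() and ord(ch) not in RIGID_SPACES
    ]


# "123\u2009456" is a single number; "123 456" is two numbers.
print(stretchable_positions("123\u2009456 and 123 456"))  # [7, 11, 15]
```

This only moves the decision to whoever types the text; it does not help when the input uses U+0020 in both cases.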
