[XeTeX] Space characters and whitespace

Tobias Schoel liesdiedatei at googlemail.com
Thu Mar 3 16:43:45 CET 2011



On 03.03.2011 14:38, mskala at ansuz.sooke.bc.ca wrote:
> On Thu, 3 Mar 2011, Philip Taylor (Webmaster, Ret'd) wrote:
>> if XeTeX is predicated on the use of Unicode, it should
>> "understand" the semantics of Unicode code points such
>> as u2009 and u202F and just do the right thing without
>> having to hack things through the use of active characters.
>
> There are really three separate questions here:
>
> 1.  Should XeTeX accept Unicode's many space characters in the input, as
> opposed to just using existing TeX commands for spacing?
>
> 2.  If the answer to #1 is "yes," what exactly should be the consequences
> of each space character?  Grouping this under the same question because
> it's related:  to what extent should the font determine the width and
> nature of each space character?
>
> 3.  If the answer to #1 is "yes," how should this be implemented - in
> particular, should it be in the engine or by macros?
>
> It sounds like you're saying "yes" to #1 and "in the engine, not in the
> macros" to #3.
>
> I note we already had some of this discussion just for the ordinary space
> character (U0020) in the monospace/punctuation space thread.  To typeset
> word and sentence spaces properly, XeTeX needs information that it can't
> get from the font.  It seems like similar issues may arise with all the
> other space characters.

I don't think the width of these characters should come from the font, but 
Unicode doesn't specify it exactly either. Unicode says, among other things:
„
u2003: EM SPACE: nominally, a space equal to the type size in points
...
u2005: FOUR-PER-EM SPACE: = mid space
...
u2009: THIN SPACE: a fifth of an em (or sometimes a sixth) -> u202f
...
u202f: NARROW NO-BREAK SPACE: a narrow form of a no-break space, 
typically the width of a thin space or a mid space
“
The EM SPACE does not need to be exactly the type size. All the other 
spaces are defined relative to the EM SPACE, so XeTeX could stick to 
these relative values during justification. This should be easy for the 
engine.
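
To make the "relative values" idea concrete, here is a minimal sketch in Python (not XeTeX code) of such a table. The fractions are only my reading of the Unicode notes quoted above: I assume the 1/5 em variant for THIN SPACE (1/6 is mentioned as well) and give NARROW NO-BREAK SPACE the thin-space width. The names SPACE_WIDTH_IN_EM and space_width are made up for this illustration.

```python
from fractions import Fraction

# Widths relative to the em, read off the Unicode notes quoted above.
# Assumptions: THIN SPACE uses the 1/5 em variant (1/6 is also mentioned),
# and NARROW NO-BREAK SPACE is given the same width as THIN SPACE.
SPACE_WIDTH_IN_EM = {
    "\u2003": Fraction(1),      # EM SPACE
    "\u2005": Fraction(1, 4),   # FOUR-PER-EM SPACE
    "\u2009": Fraction(1, 5),   # THIN SPACE
    "\u202f": Fraction(1, 5),   # NARROW NO-BREAK SPACE
}


def space_width(char, em):
    """Return the width of a space character for a given em width.

    Hypothetical helper: the engine would pick `em`, and every other
    space keeps its fixed ratio to it.
    """
    return SPACE_WIDTH_IN_EM[char] * em
```

Whatever em the engine settles on, the other spaces scale with it, so the ratios between them stay the same.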

The real problem is more of a semantic nature: when should a space be 
stretched during justification and when not? The decimal space in the 
number 123 456 should not be stretched, but the space between the numbers 
123 and 456 should be stretched. How can XeTeX tell which is the case?
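
One possible answer, sketched in Python rather than in the engine: if the input itself marks the two cases with different code points (which is an assumption, not something XeTeX can guarantee), justification could stretch only the ordinary word space and leave the narrow spaces alone. The sets and the function distribute_stretch are hypothetical names for this illustration.

```python
# Assumption: the author of the input distinguishes the two cases by code point.
FIXED_SPACES = {"\u2009", "\u202f"}   # thin space, narrow no-break space
STRETCHABLE_SPACES = {" "}            # ordinary word space (U+0020)


def distribute_stretch(line, extra):
    """Return the extra width assigned to each character of `line`.

    `extra` is shared equally among the stretchable spaces; fixed
    spaces and all other characters receive 0.
    """
    n = sum(1 for ch in line if ch in STRETCHABLE_SPACES)
    share = extra / n if n else 0.0
    return [share if ch in STRETCHABLE_SPACES else 0.0 for ch in line]
```

For the line "123\u2009456 123 456" the thin space inside the first number gets nothing, and the two word spaces share the extra width. This does not help when the input uses U+0020 everywhere, which is exactly the case where the question remains open.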

ciao

Toscho

