[XeTeX] Unicode space characters

Jonathan Kew jfkthame at googlemail.com
Wed Mar 18 10:33:00 CET 2009


On 18 Mar 2009, at 09:10, Ulrike Fischer wrote:

> On Wed, 18 Mar 2009 00:39:27 +0000 (UTC), Tomáš Janoušek wrote:
>
>> Hello Unicode TeXers,
>>
>> I created a set of definitions for a few Unicode "space" characters
>> and I think these may be interesting to the community, and could
>> possibly make it into the distribution. I'd like to hear what you
>> think about them and whether they are correct (as I am not
>> experienced in typography, and I have no idea about non-European
>> languages).
>
>
>>> %% U+00A0 NO-BREAK SPACE;  
>>> %%   Unicode char for ~.
>>> \catcode`^^^^00a0=\active
>>> \def^^^^00a0{\nobreakspace}
>
>
> Are the definitions necessary? That is, how does XeTeX normally
> handle e.g. U+00A0 NO-BREAK SPACE? Can there be a line break
> before or after this input?

XeTeX has no special built-in knowledge about U+00A0 or the various  
other Unicode space-like characters; it will simply "print" them in  
the current font. Which would be fine, except that some fonts fail to  
support them, in which case you'll get a .notdef glyph. :(
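
To see whether a given font actually covers one of these characters, a
test along the following lines might help. This is only a sketch: it
assumes XeLaTeX with the fontspec package (neither is mentioned above)
and uses the e-TeX primitive \iffontchar, which asks whether the
current font has a glyph for a code point.

```latex
% Run with: xelatex
% Assumption: fontspec is loaded so that the current font is an
% OpenType font rather than a legacy 8-bit TFM font.
\documentclass{article}
\usepackage{fontspec}
\begin{document}
% \iffontchar <font> <number> is true when the font contains a glyph
% for that character; "00A0 is U+00A0 NO-BREAK SPACE in hex notation.
\iffontchar\font"00A0
  The current font has a glyph for U+00A0.
\else
  The current font lacks U+00A0, so typing it directly would print .notdef.
\fi
\end{document}
```

Swapping in another code point (say "2009 for THIN SPACE) or another
font would show how uneven the coverage is.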

Defining these in a font-independent way using TeX seems like a good
idea in general; however, care may be needed to make them work
correctly in all contexts, particularly when they occur in text that
ends up in the LaTeX .aux file or similar, or in PDF bookmarks. I
haven't really looked into this, not being a serious LaTeX user; just
wondering...
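
One possible way to handle those two contexts is sketched below, for
illustration only. It assumes e-TeX's \protected prefix and hyperref's
\pdfstringdefDisableCommands hook; neither comes from the proposed
definitions, and only the U+00A0 case is shown.

```latex
% Run with: xelatex (twice, so that the table of contents is read back)
\documentclass{article}
\usepackage{hyperref}

% Make U+00A0 active, as in the quoted definition.
\catcode`^^^^00a0=\active

% Assumption: \protected keeps the active character from expanding
% inside \write or \edef, so it reaches the .aux and .toc files as the
% character itself and is interpreted again when those files are read.
\protected\def^^^^00a0{\nobreakspace}

% Assumption: in PDF strings (bookmarks) an ordinary space is an
% acceptable fallback. The redefinition is local to hyperref's string
% conversion and drops the \protected status there.
\pdfstringdefDisableCommands{\def^^^^00a0{ }}

\begin{document}
\tableofcontents
% ^^^^00a0 is U+00A0, so this title reads "Figure<U+00A0>1".
\section{Figure^^^^00a01}
Some text.
\end{document}
```

Since \nobreakspace is itself a robust LaTeX command, a plain \def
would probably also survive the trip through the .aux file; the
\protected version merely keeps the original character in the written
output. Other space characters would need their own definitions and
their own bookmark fallbacks.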

JK


