# [XeTeX] Unicode space characters

Jonathan Kew jfkthame at googlemail.com
Wed Mar 18 10:33:00 CET 2009

On 18 Mar 2009, at 09:10, Ulrike Fischer wrote:

> Am Wed, 18 Mar 2009 00:39:27 +0000 (UTC) schrieb Tomáš Janoušek:
>
>> Hello Unicode TeXers,
>>
>> I created a set of definitions for a few Unicode "space" characters
>> and I think these may be interesting to the community, and could
>> possibly make it into the distribution. I'd like to hear what you
>> think about them and whether they are correct (as I am not
>> experienced in typography, and I have no non-European languages).
>
>
>>> %% U+00A0 NO-BREAK SPACE; &nbsp;
>>> %%   Unicode char for ~.
>>> \catcode`^^^^00a0=\active
>>> \def^^^^00a0{\nobreakspace}
>
>
> Are the definitions necessary? That is, how does XeTeX normally
> handle, e.g., U+00A0 NO-BREAK SPACE? Can there be a line break
> before or after this input?

XeTeX has no special built-in knowledge about U+00A0 or the various
other Unicode space-like characters; it will simply "print" them in
the current font. Which would be fine, except that some fonts fail to
support them, in which case you'll get a .notdef glyph. :(
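
As a rough illustration of the font-dependence, one could test whether the current font actually has a glyph for U+00A0 before relying on it. This is only a sketch, assuming the e-TeX primitive \iffontchar (which XeTeX provides); the messages are purely illustrative:

```tex
% Sketch: check whether the current font has a glyph for U+00A0.
% \iffontchar is an e-TeX primitive; \font here denotes the current font.
\iffontchar\font"00A0
  \message{current font has a glyph for U+00A0}%
\else
  \message{current font lacks U+00A0, expect .notdef}%
\fi
```

The result depends on whichever font is current at the point of the test, so it would need repeating after each font change.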

Defining these in a font-independent way using TeX seems like a good
idea in general; however, care may be needed to make them work
correctly in all contexts, particularly when they occur in text that
ends up going to the LaTeX .aux file, etc., or into PDF bookmarks. I
haven't really looked into this, not being a serious LaTeX user; I'm
just wondering...
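
One possible way to reduce the risk in expansion-only contexts, offered as an untested sketch rather than a recommendation: make the active character a \protected macro (another e-TeX primitive), so that it is not expanded inside \edef or \write and so should reach the .aux file as the character itself. This assumes \nobreakspace is defined, as it is in LaTeX. PDF bookmarks (for example with hyperref) would probably need separate handling, which is not covered here.

```tex
% Sketch: the U+00A0 definition from the original post, but \protected,
% so TeX leaves the active character unexpanded in \edef and \write.
\catcode`^^^^00a0=\active
\protected\def^^^^00a0{\nobreakspace}
```

When the .aux file is read back, the character is still active and the same definition applies, provided the definition is in effect before the file is input.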

JK