[XeTeX] Whitespace in input

Ross Moore ross.moore at mq.edu.au
Tue Nov 15 21:43:00 CET 2011

On 16/11/2011, at 5:56 AM, Herbert Schulz wrote:

> Given that TeX (and XeTeX too) deal with a non-breakable space already (where we usually use the ~ to represent that space) it seems to me that XeTeX should treat that the same way.

No, I disagree completely.

What if you really want the U+00A0 character to be in the PDF?
That is, when you copy/paste from the PDF, you want that character
to come along for the ride.

In TeX ~ *simulates* a non-breaking space visually, but there is
no actual character inserted.
If you want the character you have to ensure that it gets there,
and what more natural way is there than to put it in explicitly?
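
For reference, a minimal sketch of what the tie amounts to. It assumes
the standard LaTeX meaning of ~ (an unbreakable penalty followed by
ordinary interword glue); a package that redefines ~ would change this.

```latex
\documentclass{article}
\begin{document}
% The tie: no line break allowed, then normal stretchable interword glue.
Figure~1

% Roughly the same thing written out by hand: \penalty10000 forbids a
% break, and "\ " (control space) inserts the interword glue.
% Neither line puts a U+00A0 character into the output.
Figure\penalty10000\ 1
\end{document}
```

Both lines should typeset identically, which is the sense in which ~
only simulates the space.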

This is how XeTeX treats it currently, according to my experiments,
using just fontspec and the "Charis SIL" font.
Anyone who has a different experience should check what other
packages and fonts are being loaded, and whether there is something
that specifically changes how that character is handled.
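
A test file along these lines can be used to repeat the experiment.
This is only a sketch: it assumes Charis SIL is installed and that
nothing else in the setup redefines U+00A0. The ^^^^00a0 notation is
XeTeX's way of writing the character in visible ASCII; typing the
character directly should be equivalent.

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage{fontspec}
\setmainfont{Charis SIL} % assumption: this font is installed

\begin{document}
% Tie: penalty plus glue, no U+00A0 character is written.
Tie: Figure~1.

% Literal U+00A0 between "Figure" and "1", written as ^^^^00a0
% (four lowercase hex digits) so that it is visible in any editor.
Literal: Figure^^^^00a01.
\end{document}
```

Copying the two lines out of the resulting PDF and inspecting the
pasted text (for instance with a hex viewer) shows whether the
character came through in each case.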

> The big puzzle will happen when someone, not using an editor capable of displaying invisibles, can't understand why they can't get XeTeX to break between the two words.

That is an editor problem, not one that XeTeX itself should be
concerned with.
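
A document author who does want tie-like behaviour can opt in on the
document side. The following is a hedged sketch, not something XeTeX
or fontspec does by default: it makes U+00A0 an active character and
defines it to expand to LaTeX's \nobreakspace.

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage{fontspec}

% Assumption: the author wants U+00A0 to behave like ~ in this document.
\catcode"00A0=\active                 % make U+00A0 an active character
\protected\def^^^^00a0{\nobreakspace} % ...that expands to the LaTeX tie

\begin{document}
% The character below now yields a penalty plus glue, not a glyph.
Figure^^^^00a01
\end{document}
```

The price is the one discussed above: with this definition the U+00A0
character itself no longer reaches the PDF.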

Now having U+00A0 between two words may change the way
hyphenation works for those words.

But surely if you want to inhibit a line-break
between words, you probably also don't want either word to
be hyphenated. So this could really be the correct thing.

> Now, have I got the ideas being discussed correct?
> Good Luck,
> Herb Schulz
> (herbs at wideopenwest dot com)

Hope this helps,


