[XeTeX] Whitespace in input

Herbert Schulz herbs at wideopenwest.com
Tue Nov 15 22:12:45 CET 2011

On Nov 15, 2011, at 2:43 PM, Ross Moore wrote:

> On 16/11/2011, at 5:56 AM, Herbert Schulz wrote:
>> Given that TeX (and XeTeX too) already deals with a non-breakable space (where we usually use ~ to represent that space), it seems to me that XeTeX should treat that character the same way.
> No, I disagree completely.
> What if you really want the U+00A0 character to be in the PDF?
> That is, when you copy/paste from the PDF, you want that character
> to come along for the ride.
> In TeX ~ *simulates* a non-breaking space visually, but there is
> no actual character inserted.
> If you want the character you have to ensure that it gets there,
> and what more natural way is there than to put it in explicitly.
> This is how XeTeX treats it currently, according to my experiments,
> using just fontspec and the "Charis SIL" font.
> Anyone who has a different experience should check what other
> packages and fonts are being loaded, and whether there is something
> that specifically changes how that character is handled.


But isn't that also true about a regular space character? Doesn't (Xe)TeX insert some glue rather than a Space Character?
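
To make the comparison concrete, here is a minimal XeLaTeX sketch of the three cases being discussed. It is only an illustration under assumptions: the sample words are made up, the font is the one Ross mentions (swap in any installed font that has a U+00A0 glyph), and the comments describe the behaviour as reported in this thread rather than anything I have verified for every setup. The U+00A0 is written in ^^^^ notation so that it stays visible in the source.

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage{fontspec}
% Assumption: Charis SIL is installed; otherwise remove this line or
% substitute another font that contains a glyph for U+00A0.
\setmainfont{Charis SIL}
\begin{document}

% Case 1: ordinary space. TeX inserts interword glue, a legal breakpoint.
two words

% Case 2: tie. In LaTeX, ~ gives a penalty followed by interword glue,
% so no break is allowed there; no U+00A0 character is typeset.
two~words

% Case 3: explicit U+00A0. Per the experiments reported in this thread,
% XeTeX typesets it as a character from the font, so it should survive
% copy/paste from the PDF.
two^^^^00a0words

\end{document}
```

Copying the three lines out of the resulting PDF and inspecting them in an editor that shows invisibles is one way to check what each case actually puts in the file.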

>> The big puzzle will happen when someone, not using an editor capable of displaying invisibles, can't understand why they can't get XeTeX to break between the two words.
> That is an editor problem, not one that XeTeX itself should be
> concerned with.

Agreed. But I'll bet you end up with lots of questions on ctt/texhax/etc. about line breaking, assuming that the non-breaking space actually does its ``job.''
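
For anyone who does want U+00A0 in the source to act like ~, a document-level workaround can be sketched as follows. This is an assumption on my part, not something XeTeX does by default and not something proposed as the default in this thread: it makes U+00A0 an active character that expands to LaTeX's \nobreakspace. The price is exactly the one Ross points out, namely that the character itself no longer reaches the PDF.

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage{fontspec}
% Assumption: we choose to give up the literal U+00A0 in the output.
% Make U+00A0 active, then define it as the usual non-breaking space.
\catcode"00A0=\active
\protected\def^^^^00a0{\nobreakspace}
\begin{document}
% The U+00A0 below now behaves like ~ (penalty plus glue), not like a glyph.
two^^^^00a0words
\end{document}
```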

> Now having U+00A0 between two words may change the way
> hyphenation works for those words.
> But surely if you are wanting to inhibit a line-break
> between words, you probably also don't want either word to
> be hyphenated. So this could really be the correct thing.

or not. :-)

Good Luck,

Herb Schulz
(herbs at wideopenwest dot com)
