[XeTeX] Whitespace in input

Ross Moore ross.moore at mq.edu.au
Wed Nov 16 00:53:49 CET 2011

On 16/11/2011, at 9:45 AM, Zdenek Wagner wrote:

> 2011/11/15 Ross Moore <ross.moore at mq.edu.au>:

>>>> What if you really want the Ux00A0 character to be in the PDF?
>>>> That is, when you copy/paste from the PDF, you want that character
>>>> to come along for the ride.
>>> From the typographical point of view it is the worst of all possible
>>> methods. If you really wish it,

Maybe you misunderstood what I meant here.

I'm not saying that you might want Ux00A0 for *every* place
where there is a word-breaking space.
Just that there may be individual instance(s) where you have
a reason to want it.

Just like any other Unicode character, if you want it then
you should be able to put it in there.
That's what XeTeX currently does (with the TeX-wise familiar 
ASCII exceptions) for any code-point supported by the
chosen font.
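
To make the copy/paste point concrete, here is a minimal sketch in Python
(not TeX, and not anything XeTeX itself does) of how one might check that
U+00A0 survived in text pasted out of a PDF. The helper name
`nbsp_positions` and the sample string are assumptions made up for
illustration only.

```python
import unicodedata

NBSP = "\u00a0"  # NO-BREAK SPACE, the character under discussion


def nbsp_positions(text):
    """Return the indices at which U+00A0 occurs in text."""
    return [i for i, ch in enumerate(text) if ch == NBSP]


# Hypothetical text, standing in for something pasted from a PDF viewer.
pasted = "v" + NBSP + "Praze a okolí"

print(unicodedata.name(NBSP))   # official Unicode name of the character
print(nbsp_positions(pasted))   # where the character sits in the pasted text
```

If the list comes back empty, the viewer or the PDF has replaced the
character with an ordinary space (or dropped it) somewhere along the way.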

>> The *really wish it* is the choice of the author, not the
>> software.
>>> then do not use TeX but M$ Word or
>>> OpenOffice. M$ Word automatically inserts nonbreakable spaces at some
>>> points in the text written in Czech. As far as grammar is concerned,
>>> it is correct. However, U+00a0 is fixed width. If you look at the
>>> output, the nonbreakable spaces are too wide on some lines and too
>>> thin on other lines. I cannot imagine anything uglier.
>> I do not disagree with you that this could be ugly.
>> But that is not the point.
>> If you want superior aesthetic typesetting, with nice choices
>> for hyphenation, then don't use Ux00A0. Of course!
>> Whatever the reason for wanting to use this character, there
>> should be a straightforward way to do it.
>> Using the character itself:
>>  a.  is the most understandable
>>  b.  currently works
>>  c.  requires no special explanation.
> These are reasons why people might wish it in the source files, not in PDF.

Yes. In the source, to have the occasional such character included
within the PDF, for whatever reason appropriate to the material
being typeset -- whether verbatim, or not.

> If you wish to take a [part of] PDF and include it in another PDF as
> is, you can take the PDF directly without the need of grabbing the
> text. If you are interested in the text that will be retypeset, you
> have to verify a lot of other things.

How is any of this relevant to the current discussion?

> If the text contained hyphenated
> words, you have to join the parts manually. You will have a lot of
> other work and the time saved by U+00a0 will be negligible. There are
> tools that may help you to insert nonbreakable spaces. I even have my
> own special tools written in Perl to handle one class of input files
> that are really plain texts and the result is (almost) correctly
> marked LaTeX source.

All well and good. 
But how is that relevant to anything I said?

>>> --
>>> Zdeněk Wagner
>>> http://hroch486.icpf.cas.cz/wagner/
>>> http://icebearsoft.euweb.cz



Ross Moore                                       ross.moore at mq.edu.au 
Mathematics Department                           office: E7A-419      
Macquarie University                             tel: +61 (0)2 9850 8955
Sydney, Australia  2109                          fax: +61 (0)2 9850 8114
