[XeTeX] Whitespace in input

Zdenek Wagner zdenek.wagner at gmail.com
Tue Nov 15 23:45:57 CET 2011


2011/11/15 Ross Moore <ross.moore at mq.edu.au>:
> Hi Zdenek,
>
> On 16/11/2011, at 8:58 AM, Zdenek Wagner wrote:
>
>> 2011/11/15 Ross Moore <ross.moore at mq.edu.au>:
>>>
>>> On 16/11/2011, at 5:56 AM, Herbert Schulz wrote:
>>>
>>>> Given that TeX (and XeTeX too) deals with a non-breakable space already (where we usually use the ~ to represent that space) it seems to me that XeTeX should treat that the same way.
>>>
>>> No, I disagree completely.
>>>
>>> What if you really want the U+00A0 character to be in the PDF?
>>> That is, when you copy/paste from the PDF, you want that character
>>> to come along for the ride.
>>>
>> From the typographical point of view it is the worst of all possible
>> methods. If you really wish it,
>
> The *really wish it* is the choice of the author, not the
> software.
>
>> then do not use TeX but M$ Word or
>> OpenOffice. M$ Word automatically inserts nonbreakable spaces at some
>> points in the text written in Czech. As far as grammar is concerned,
>> it is correct. However, U+00a0 is fixed width. If you look at the
>> output, the nonbreakable spaces are too wide on some lines and too
>> thin on other lines. I cannot imagine anything uglier.
>
> I do not disagree with you that this could be ugly.
> But that is not the point.
>
> If you want superior aesthetic typesetting, with nice choices
> for hyphenation, then don't use U+00A0. Of course!
>
>
> Whatever the reason for wanting to use this character, there
> should be a straightforward way to do it.
> Using the character itself:
>  a.  is the most understandable,
>  b.  currently works,
>  c.  requires no special explanation.
>
These are reasons why people might wish it in the source files, not in the PDF.

If you wish to take a [part of a] PDF and include it in another PDF as
is, you can take the PDF directly without grabbing the
text. If you are interested in text that will be retypeset, you
have to verify a lot of other things. If the text contained hyphenated
words, you have to join the parts manually. You will have a lot of
other work, and the time saved by U+00a0 will be negligible. There are
tools that may help you insert nonbreakable spaces. I even have my
own special tools written in Perl to handle one class of input files
that are really plain text, and the result is (almost) correctly
marked-up LaTeX source.
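
For illustration only, here is a minimal sketch of what a tool of that
kind could look like. It is written in Python rather than Perl and is
not one of the tools mentioned above. It assumes the common Czech rule
that a one-letter preposition or conjunction must not be left at the
end of a line; the function name tie_short_words and the word list are
hypothetical. A U+00A0 already present after such a word is turned
into a TeX tie as well, so the source ends up with ~ in both cases.

```python
import re

# Assumption: one-letter Czech prepositions and conjunctions that should
# not end a line (an illustrative list, not an exhaustive one).
SHORT_WORDS = "aikosuvzAIKOSUVZ"

# A one-letter word standing alone (no word character before it),
# followed by spaces, tabs or U+00A0, and then the start of another word.
# Newlines are deliberately left untouched.
_PATTERN = re.compile(r"(?<!\w)([" + SHORT_WORDS + r"])[ \t\u00a0]+(?=\w)")


def tie_short_words(text: str) -> str:
    """Replace the space after a one-letter word with a TeX tie (~)."""
    return _PATTERN.sub(r"\1~", text)


if __name__ == "__main__":
    print(tie_short_words("Byl v lese a u vody."))
    # prints: Byl v~lese a~u~vody.
```

Because the result contains ~ rather than a literal U+00A0, TeX is
still free to stretch or shrink the space together with the other
spaces on the line. A real tool would need more rules (abbreviations,
numbers with units, titles before names) and would have to skip
verbatim and math material.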
>
>>
>>
>> --
>> Zdeněk Wagner
>> http://hroch486.icpf.cas.cz/wagner/
>> http://icebearsoft.euweb.cz
>
> Cheers,
>
>        Ross
>
> ------------------------------------------------------------------------
> Ross Moore                                       ross.moore at mq.edu.au
> Mathematics Department                           office: E7A-419
> Macquarie University                             tel: +61 (0)2 9850 8955
> Sydney, Australia  2109                          fax: +61 (0)2 9850 8114
> ------------------------------------------------------------------------



-- 
Zdeněk Wagner
http://hroch486.icpf.cas.cz/wagner/
http://icebearsoft.euweb.cz