[XeTeX] line spacing

Jonathan Kew jonathan_kew at sil.org
Wed Mar 9 12:09:44 CET 2005


On 9 Mar 2005, at 2:30 am, Ross Moore wrote:

> On 08/03/2005, at 10:06 AM, Jonathan Kew wrote:
>
>> On 7 Mar 2005, at 9:59 pm, Ross Moore wrote:
>>
>>> That should result in a fully backwards-compatible
>>> definition of ^^^^00a0 that will do the right thing
>>> for auxiliary files, ToCs, indexes, etc.
>>> (It also will ensure that ~ has this latter property too!)
>>
>> I don't see any compelling reason to map either ~ or ^^^^00a0 to  
>> \char32; it would be simpler, and just as effective, to let both of  
>> these be active characters that expand to {\penalty10000\ } or  
>> equivalent.
>
> The original poster wanted to specifically insert \char32 into the PDF
> when ^^^^00A0 is encountered.

As I understood it, the OP's concern was that the Unicode character  
U+00A0 in a source text should function as a non-breaking space, rather  
than specifically that \char32 should end up in the PDF. The use of  
\char32 as the expansion of an \active ^^^^00a0 was just one of the  
ways I suggested (see messages of Sept 4th) that this could be done.

> You are right that TeX normally does not include space characters.
> Perhaps the intention is to be able to make some phrases
> findable in the PDF using the "Find" command.
> This may be achievable with the right definitions of a non-breaking  
> space,
> using only font characters.

I think you'll find that XeTeX-generated PDFs using Unicode fonts are  
generally pretty good as far as searchability is concerned--probably  
less likely to be troublesome than other TeX output.

>> That's the TeX way to do a non-breaking space.
>>
>> So (noting that ^^a0 is a perfectly good short form for ^^^^00a0),  
>> how about simply:
>> 	\catcode`\^^a0=\active
>> 	\let^^a0=~
>> (Maybe that's not enough to fit in properly with The LaTeX Way, but  
>> you get the idea.)
>
> Sure; this is enough to make the two characters work equivalently,
> unless packages alter the definition of ~ in certain contexts.
> For example, in URLs the ~ becomes \textasciitilde .

But that wouldn't change the behavior of ^^a0 once it's been defined  
this way, would it? Which is as it should be.
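
A minimal sketch of that behavior, assuming plain XeTeX; the \def of ~  
below is only a stand-in for whatever a package might do to ~ later, not  
what \url actually does:

	\catcode`\^^a0=\active
	\let^^a0=~          % ^^a0 takes the meaning ~ has right now
	\def~{\string~}     % stand-in for a later redefinition of ~
	A^^a0B and A~B      % ^^a0 still gives the no-break space; ~ no longer does
	\bye

Because \let copies the current meaning rather than creating a reference  
to ~, the later \def leaves ^^a0 untouched.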

> This expansion is independent of font-encoding.
> That's what I think should be changed.
> When XeTeX uses a 'U'-encoded font, then the \char32
> can be inserted instead, or perhaps even the ^^a0
> character itself.

I still think this sounds like unnecessary extra work, but....

> I'm not sure how this would affect the TeX processing.
> Is the \nobreak (or \penalty10000) still needed
> to inhibit breaking and/or hyphenation at this place?

....if you do end up inserting \char32, and if the user has activated  
\XeTeXlinebreaklocale, this could probably become a line-break  
position. \char160 (or ^^a0) wouldn't be, but OTOH there's the risk  
that a font might not support it (so you'd get a .notdef glyph  
instead)--which was why the issue arose in the first place. If the  
fonts all supported U+00A0, it could simply be left with \catcode 12,  
and typeset like any other punctuation.
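
For illustration only, here is one hedged sketch of how a definition  
could cope with fonts that lack U+00A0. It assumes the e-TeX primitive  
\iffontchar (available in XeTeX); it is not something proposed earlier in  
this thread:

	\catcode`\^^a0=\active
	\def^^a0{\leavevmode
	  \iffontchar\font"00A0 % does the current font have U+00A0?
	    \char"00A0 %          yes: typeset the glyph itself
	  \else
	    \nobreak\ %           no: fall back to penalty plus glue, as ~ does
	  \fi}

Note that the glyph branch gives a fixed-width space that will not  
stretch or shrink with the rest of the line, whereas the fallback branch  
gives ordinary interword glue.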

> Ultimately TeX uses {\ } to get the width of a space.
> But that's a primitive command which doesn't actually
> insert a space at all, right?

Right, it inserts glue. TeX doesn't use a "space character" at all.
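
One way to see this for yourself (a sketch assuming plain TeX or plain  
XeTeX; \showbox will pause the run to display the box):

	\setbox0=\hbox{\ }          % a box holding nothing but a control space
	\showboxbreadth=10 \showboxdepth=10
	\tracingonline=1
	\showbox0                   % contents list a \glue item, not a character
	\bye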

> I don't want to hack at '\ ' as this might break a lot
> of things.

Definitely.

>  It *is* legitimate to hack at \nobreakspace
> to give different expansions according to font-encoding.

Legitimate but superfluous, IMO.

> But maybe that's not advisable either; in which case
> ^^a0 needs a separate expansion, which defaults to be
> the same as ~ when the font-encoding is not 'U'.

Or, as I'd propose, ^^a0 is simply a synonym for ~ (but doesn't get  
redefined when something like \url changes ~).

>> (Yes, there are some circumstances where the end result would be  
>> different; but they're obscure enough that I think you could ignore  
>> them. People who really care about them had better know what they're  
>> doing, and can make their own definitions!)
>
> I'm not convinced that it is so obscure.
> I've heard complaints that PDFs from LaTeX are not properly
> searchable (due to the prevalence of \kern s and glue ?).

As mentioned above, I'd expect PDFs from XeLaTeX, at least with AAT  
fonts, to work pretty well in this regard (would be interested to hear  
people's experiences, although there's probably little I can do to  
affect it). Documents using OpenType fonts may be a bit more  
troublesome.

-- JK

> Such a mechanism using non-breaking spaces could help,
> at least for particular important word groups.
>
>
>
> I'd appreciate further comments on this issue, before
> trying to implement something that may be useless or
> even destructive.
>
>
> Cheers,
>
> 	Ross
>
>
>
>
>>
>> Just my thoughts on it .... JK
>>
>> _______________________________________________
>> XeTeX mailing list
>> postmaster at tug.org
>> http://tug.org/mailman/listinfo/xetex
>>
> ------------------------------------------------------------------------
> Ross Moore                                         ross at maths.mq.edu.au
> Mathematics Department                             office: E7A-419
> Macquarie University                               tel: +61 +2 9850 8955
> Sydney, Australia                                  fax: +61 +2 9850 8114
> ------------------------------------------------------------------------
>
