[XeTeX]   in XeTeX

Tobias Schoel liesdiedatei at googlemail.com
Sun Nov 13 21:09:16 CET 2011


Now that the practicality is settled, let's come back to the 
philosophical part:

Should the no-break space (&nbsp; = U+00A0) be active and treated as ~ by 
default? And likewise, should U+202F and U+2009 be active and treated as 
\, and \,\hspace{0pt} respectively?
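
To make the question concrete, here is a minimal, untested sketch of what 
such a default could look like in a XeLaTeX preamble. It assumes XeTeX's 
native Unicode input and its ^^ and ^^^^ hex notation for the characters; 
the use of \protected and \nobreakspace is my own choice for illustration, 
not something any existing package is known to do this way.

```latex
% Sketch only: make three Unicode space characters active in XeLaTeX.
% Assumption: this sits in the preamble, where ^ still has catcode 7.
\catcode"00A0=\active  % U+00A0 NO-BREAK SPACE
\catcode"202F=\active  % U+202F NARROW NO-BREAK SPACE
\catcode"2009=\active  % U+2009 THIN SPACE

% U+00A0 behaves like ~ (\nobreakspace is what ~ expands to in LaTeX).
\protected\def^^a0{\nobreakspace}
% U+202F becomes a thin space with no break allowed.
\protected\def^^^^202f{\,}
% U+2009 becomes a thin space followed by a legal breakpoint.
\protected\def^^^^2009{\,\hspace{0pt}}
```

Whether these lines belong in the engine, the format, a package, or a 
private preamble is exactly the open question.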

Where should such a default be set?
- XeTeX engine
- XeLaTeX format
- some package (xunicode, fontspec, some new package)
- my own package/preamble template

As was discussed in the thread "Space characters and whitespace", using 
these characters without any special treatment conflicts with TeX's 
spacing algorithms. So it seems there are two options: either avoid these 
characters and blame Unicode, or treat them specially.

bye

Toscho

On 13.11.2011 21:36, Mike Maxwell wrote:
> On 11/13/2011 11:09 AM, Tobias Schoel wrote:
>> How much text flow control should be done by non-ASCII
>> characters? Unicode has different codepoints for signs with the same
>> meaning but different text flow control (space vs. non-break space). So
>> text flow could be controlled via Unicode codepoints. But should it? Or
>> should text flow be controlled via commands and active characters?
>>
>> One opinion says that using (La)TeX is programming. Consequently, each
>> character used should be visually well distinguishable. This is not the
>> case with all the Unicode white space characters.
>>
>> Another opinion says that using (La)TeX is transforming plain text (like
>> .txt) into well formatted text. Consequently, the plain text may contain
>> as much (meta-)information as possible, and this information should be
>> used when transforming it into well formatted text. So Unicode white space
>> characters are allowed and should be valued by their specific meaning.
>
> And on the third hand, XeTeX could allow both.
>
>  > How would you visually differentiate between all
>  > the white space characters (space vs. non-break space, thin space
>  > (u2009) vs. narrow no-break space (u202f), … ) such that the text
>  > remains readable?
>
> Of course, there's precedent for this kind of problem: tab characters.
> For that matter, many text editors display Unicode combining diacritics
> over or under the base character that they go with, which is already
> getting away from a straightforward display of the underlying characters.
>
> At any rate, there are lots of ways non-ASCII space characters could be
> distinguished; Philip Taylor mentions color coding, which is certainly
> possible. Another would be to display some kind of code for non-ASCII
> spaces. There's one font which displays all characters as nothing but
> their Unicode code points (in hex) inside some kind of box. A tex(t)
> editor could certainly be programmed to display control characters
> (which these space characters essentially are) differently from the
> "regular" characters (which would continue to be displayed with an
> ordinary font).
>
> The editor I use, jEdit, provides yet another option: a command
> (bindable to a keystroke) that tells me the Unicode code point of any
> character, on the editor's status line.

