[XeTeX]   in XeTeX

Mike Maxwell maxwell at umiacs.umd.edu
Sun Nov 13 20:36:59 CET 2011

On 11/13/2011 11:09 AM, Tobias Schoel wrote:
> How much text flow control mechanism should be done by non-ASCII
> characters? Unicode has different codepoints for signs with the same
> meaning but different text flow control (space vs. non-break space). So
> text flow could be controlled via Unicode codepoints. But should it? Or
> should text flow be controlled via commands and active characters?
> One opinion says that using (La)TeX is programming. Consequently, each
> character used should be visually well distinguishable. This is not the
> case with all the Unicode white space characters.
> One opinion says that using (La)TeX is transforming plain text (like
> .txt) into well formatted text. Consequently, the plain text may contain
> as much (meta)-information as possible and this information should be
> used when transforming it to well formatted text. So Unicode white space
> characters are allowed and should be valued by their specific meaning.

And on the third hand, XeTeX could allow both.

> How would you visually differentiate between all
> the white space characters (space vs. non-break space, thin space
> (u2009) vs. narrow no-break space (u202f), … ) such that the text
> remains readable?

Of course, there's precedent for this kind of problem: tab characters. 
For that matter, many text editors display Unicode combining diacritics 
over or under the base character that they go with, which is already 
getting away from a straightforward display of the underlying characters.

At any rate, there are lots of ways non-ASCII space characters could be 
distinguished; Philip Taylor mentions color coding, which is certainly 
possible.  Another would be to display some kind of code for non-ASCII 
spaces.  There's one font which displays all characters as nothing but 
their Unicode code points (in hex) inside some kind of box.  A tex(t) 
editor could certainly be programmed to display control characters 
(which these space characters essentially are) differently from the 
"regular" characters (which would continue to be displayed with an 
ordinary font).
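
To make the "display some kind of code" idea concrete, here is a minimal
sketch in Python of a filter that leaves ordinary text alone and replaces
each non-ASCII space with a visible tag. The function name reveal_spaces
and the [U+XXXX] tag format are my own assumptions for illustration; no
existing editor is claimed to work this way.

```python
import unicodedata


def reveal_spaces(text: str) -> str:
    """Replace each non-ASCII space separator with a visible [U+XXXX] tag.

    Hypothetical helper for illustration only.
    """
    out = []
    for ch in text:
        # General category "Zs" covers space separators such as
        # U+00A0 (no-break space), U+2009 (thin space) and
        # U+202F (narrow no-break space). The plain ASCII space is
        # also "Zs", so it is excluded explicitly.
        if ch != " " and unicodedata.category(ch) == "Zs":
            out.append(f"[U+{ord(ch):04X}]")
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    sample = "a\u00a0b\u2009c\u202fd e"
    print(reveal_spaces(sample))
```

An editor would more likely do this at the rendering level (a different
glyph or color) rather than by rewriting the text, but the classification
step would look much the same.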

The editor I use, jEdit, provides yet another option: a command 
(bindable to a keystroke) that tells me the Unicode code point of any 
character, on the editor's status line.
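
As a rough sketch of what such a "what is this character" command has to
compute (this is not jEdit's code; describe_char is a hypothetical name),
Python's standard unicodedata module is enough:

```python
import unicodedata


def describe_char(ch: str) -> str:
    """Return the code point and Unicode name of a single character.

    Hypothetical helper; control characters such as TAB have no
    Unicode name, so a placeholder is shown for them instead.
    """
    name = unicodedata.name(ch, "<unnamed>")
    return f"U+{ord(ch):04X} {name}"


if __name__ == "__main__":
    for ch in ("\u00a0", "\u2009", "\u202f", " "):
        print(describe_char(ch))
```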
	Mike Maxwell
	maxwell at umiacs.umd.edu
	"My definition of an interesting universe is
	one that has the capacity to study itself."
         --Stephen Eastmond
