[XeTeX] default char classes

Jonathan Kew jonathan_kew at sil.org
Wed Mar 12 19:10:25 CET 2008


On 12 Mar 2008, at 1:31 pm, Barry MacKichan wrote:

> Jonathan, you have convinced me that language markup is needed.

:-)

There are, of course, simple cases where it's possible to get away  
without it, and cases where "magic" font-switching would be handy for  
specific purposes. But it's very hard to design a universal, robust  
system.

> I am curious about Will's question. Are there efficiency concerns in
> defining lots of large token classes?

The main concern I'd have is that, in most cases, users of character  
classes and inter-char tokens will really only be interested in a  
couple of scripts, and certain classes of characters within those  
scripts (e.g., opening and closing punctuation). So it's simplest for  
them if they define the specific classes that matter for their  
application, and leave everything else in a default "other" class.
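
As a rough sketch of that approach in plain XeTeX (the class numbers 4  
and 5 are arbitrary picks, and \beforeopen and \afterclose are  
hypothetical macros the user would have to define before typesetting):

```tex
\XeTeXinterchartokenstate=1   % turn the inter-char token mechanism on

% Only the characters of interest get a class; the numbers are arbitrary.
\XeTeXcharclass `\( = 4       % opening punctuation
\XeTeXcharclass `\[ = 4
\XeTeXcharclass `\) = 5       % closing punctuation
\XeTeXcharclass `\] = 5

% Tokens inserted at the two transitions this application cares about:
% default class (0) -> opening, and closing -> default class (0).
% \beforeopen and \afterclose are hypothetical user macros.
\XeTeXinterchartoks 0 4 = {\beforeopen}
\XeTeXinterchartoks 5 0 = {\afterclose}
```

Every character not mentioned stays in class 0, so only these two  
class-pairs need any thought.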

If we pre-assign all the Unicode characters to several dozen (at  
least) classes, based on script and on other character categories --  
in fact, we might easily hit 100 classes or more -- then packages  
like zhspacing that care about a certain script, and consider  
everything else "other", will have a lot of extra class-pairs to  
consider, for no obvious benefit. That seems like an extra burden on  
users/macro writers.
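
To illustrate the burden, here is a sketch under an assumed scenario:  
suppose the format had pre-assigned classes 1 to 100, and a package  
cared about a single class of its own (\myclass, with \enterscript and  
\leavescript as hypothetical macros). It would then have to cover  
every other class explicitly:

```tex
% Assumption: classes 1..100 are pre-assigned by the format, and 101 is free.
\chardef\myclass=101
\newcount\otherclass
\otherclass=0
\loop
  % one assignment for entering the script, one for leaving it
  \XeTeXinterchartoks \otherclass \myclass = {\enterscript}
  \XeTeXinterchartoks \myclass \otherclass = {\leavescript}
\ifnum\otherclass<100
  \advance\otherclass by 1
\repeat
```

That is 202 class-pair assignments (classes 0 to 100, in both  
directions), where two would do if everything else were left in  
class 0.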

What we probably should do, as part of the xetex and xelatex formats,  
is create a \newcharclass allocator (like plain TeX's \newcount,  
etc.), to help people manage class numbers without conflict.
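
A minimal sketch of what such an allocator might look like in plain  
XeTeX follows. It is not an existing definition: the counter name  
\charclasscount, the starting value, and the upper limit of 254  
(assuming 256 classes with the top one reserved) are all assumptions  
made for illustration.

```tex
% Hypothetical allocator; names and limits are illustrative only.
\newcount\charclasscount
\charclasscount=3   % assumption: start above any classes the format uses

\def\newcharclass#1{%
  \ifnum\charclasscount<254
    \global\advance\charclasscount by 1
    \global\chardef#1=\charclasscount   % #1 now names the new class number
  \else
    \errmessage{No room for a new character class}%
  \fi
}

% Usage: packages refer to classes by name, never by number.
\newcharclass\openpunctclass
\XeTeXcharclass `\( = \openpunctclass
```

Because each package asks the allocator for its classes, two packages  
loaded together cannot pick the same number by accident.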

If someone does want to try and implement comprehensive multi-script  
automatic font switching (despite my reservations!), there's nothing  
to stop them assigning all the Unicode chars to classes based on  
script, and even precompiling this into a format file. (The  
unicode-letters.tex file, and the Perl script that generates it --  
found in the xetex source tree -- could give some ideas how to go  
about this.)
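
For anyone who does try it, the assignments might be sketched like  
this in plain XeTeX. The helper \classrange and the class numbers are  
hypothetical, and Unicode blocks are used here only as a crude  
stand-in for real script data:

```tex
% Hypothetical helper: put every code point from #1 to #2 into class #3.
\newcount\charcode
\def\classrange#1#2#3{%
  \charcode=#1\relax
  \loop
    \XeTeXcharclass\charcode=#3\relax
  \ifnum\charcode<#2\relax
    \advance\charcode by 1
  \repeat
}

% Arbitrary class numbers; blocks only approximate scripts.
\classrange{"0370}{"03FF}{4}   % Greek and Coptic block
\classrange{"0400}{"04FF}{5}   % Cyrillic block
```

A generated file of such lines, read while the format is being built,  
would avoid repeating the loops on every run.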

JK


