[XeTeX] Add Tibetan typesetting ability

Jonathan Kew jonathan_kew at sil.org
Mon Nov 26 13:53:43 CET 2007


On 26 Nov 2007, at 6:59 am, <sonamm at sohu.com> wrote:

> Hi, there:
>
> I am trying to use XeTeX to typeset Tibetan scripts. I've written
> a preprocessor to insert glue into Tibetan TeX scripts, which use
> an encoding like the GB/GBK HZ characters (that is what is used most
> in mainland China for Tibetan typesetting; they have all the vowels
> and letters stacked in advance, while in Unicode all are separated
> and stacked on the fly).

So I guess all the combinations of stacked consonants, vowels, etc.,  
are represented using Private Use codes, or something like that? Yes,  
this should work, though it's not the standard Unicode representation  
of Tibetan, so the data will not be interoperable with Unicode-based  
systems.

For inserting glue, you might be able to use TeX macro programming or  
the new (XeTeX 0.997) inter-character token insertion feature, in the  
same way as the jspacing and zhspacing packages for Japanese and  
Chinese. This could remove the need for a preprocessor, which would  
simplify your workflow.
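
As a rough illustration of the inter-character token approach, here is a minimal sketch. It assumes Unicode Tibetan text, so for pre-stacked codes the character slots would be different; class number 4 (assumed to be unused) and the glue amount are arbitrary choices for illustration, not recommended values.

```tex
% Sketch only: class number 4 and the glue amount are illustrative assumptions.
\XeTeXinterchartokenstate=1   % enable inter-character token insertion
\XeTeXcharclass "0F0B = 4     % put U+0F0B (Tibetan tsheg) into class 4
% Between a tsheg (class 4) and a following ordinary character (class 0),
% insert stretchable glue, which also gives TeX a legal breakpoint.
\XeTeXinterchartoks 4 0 = {\hskip 0pt plus 1pt\relax}
```

With something like this in the preamble, the glue would be inserted while the text is being read, rather than by a separate preprocessing pass.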

> The preprocessor+XeTeX can typeset Tibetan scripts very well
> except for one Tibetan punctuation mark, which should be changed to
> a different punctuation mark if it comes after the first word in a
> line.

I'm curious about this; I wasn't aware of this feature of Tibetan.  
Could you give details of the character involved, and perhaps even  
examples of the proper appearance in different contexts?

> My questions are:
>
> 1. Should I add a locale and LayoutEngine to ICU, or should I change
> texk/web2c/xetexdir/XeTeXLayoutEngine.* to add this ability to XeTeX?

I'm not sure what the best approach might be, at least until I have a  
clearer understanding of the problem. If there is a need to treat the  
first word in a line specially, I don't think this can be done just  
at the layout-engine level, because the layout engine deals with each  
word in isolation, and is not aware of its position on the line.

> 2. How good is XeTeX's support for the Unicode way of typesetting
> complex scripts, in which all the relevant parts of a character are
> stacked on the fly?

This should work using either AAT fonts (on Mac OS X only) or  
OpenType fonts that support the Tibetan script, OpenType tag =  
'tibt'. (Or it would be possible to use the Graphite rendering  
technology, but I am not aware of any available Tibetan fonts using  
Graphite.)

In a quick test using an OpenType font, it seemed to be necessary to  
specify some additional features, otherwise vowels didn't stack  
properly:

     \font\tibfont="Tibetan Machine Uni:script=tibt;+abvs;+blws" at 11pt
     \tibfont བོད་ཀྱི་རྒྱལ་ཁམས་ ན།   ...etc...

This may mean that the Tibetan shaping engine currently in ICU  
doesn't fully support the Tibetan OpenType specification. Or it might  
mean that the font doesn't fully conform. But I have not looked into  
the issue in detail.
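
To narrow this down, a side-by-side test along the following lines might help. This is only a sketch: it assumes the same "Tibetan Machine Uni" font is installed, and the names \tibplain and \tibfeat are just made up for the test.

```tex
% Sketch of a side-by-side test; assumes "Tibetan Machine Uni" is installed.
% \tibplain and \tibfeat are hypothetical names chosen for this test.
\font\tibplain="Tibetan Machine Uni:script=tibt" at 11pt
\font\tibfeat="Tibetan Machine Uni:script=tibt;+abvs;+blws" at 11pt
% Default features only:
\noindent{\tibplain བོད་ཀྱི་རྒྱལ་ཁམས་}\par
% abvs and blws requested explicitly:
\noindent{\tibfeat བོད་ཀྱི་རྒྱལ་ཁམས་}\par
\bye
```

If the two lines come out differently, the extra features are making a difference for that font, although this alone does not show whether the cause lies in the shaping engine or in the font.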

JK




