[XeTeX] Add Tibetan typesetting ability
Jonathan Kew
jonathan_kew at sil.org
Mon Nov 26 13:53:43 CET 2007
On 26 Nov 2007, at 6:59 am, <sonamm at sohu.com> wrote:
> Hi, there:
>
> I am trying to use XeTeX to typeset Tibetan script. I've written
> a preprocessor to insert glue into Tibetan TeX sources, which use
> an encoding like GB/GBK HZ characters (that is what is used most for
> Tibetan typesetting in mainland China; it has all the vowels and
> letters stacked in advance, while in Unicode they are all separate and
> stacked on the fly).
So I guess all the combinations of stacked consonants, vowels, etc.,
are represented using Private Use codes, or something like that? Yes,
this should work, though it's not the standard Unicode representation
of Tibetan, so the data will not be interoperable with Unicode-based
systems.
For inserting glue, you might be able to use TeX macro programming or
the new (XeTeX 0.997) inter-character token insertion feature, in the
same way as the jspacing and zhspacing packages for Japanese and
Chinese. This could remove the need for a preprocessor, which would
simplify your workflow.
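As a rough illustration of the inter-character token approach, here is a
minimal sketch. It assumes Unicode-encoded text in which the break
opportunity follows the tsheg (U+0F0B); for the pre-stacked encoding, the
corresponding code point would have to be substituted. The class number 10
and the name \tshegclass are arbitrary choices for this example, and the
glue amount is only a placeholder.

```tex
% Sketch only: allow a line break after each tsheg without a preprocessor.
% Class number 10 is an arbitrary, hypothetical choice; pick one that does
% not clash with classes already used by the format or other packages.
\chardef\tshegclass=10
\XeTeXcharclass "0F0B = \tshegclass
% Between a tsheg and any ordinary (class 0) character, insert a small
% stretchable glue, which also acts as a permissible break point.
\XeTeXinterchartoks \tshegclass 0 = {\hskip 0pt plus 0.5pt\relax}
% Turn on inter-character token insertion.
\XeTeXinterchartokenstate = 1
```

Any character left in the default class 0 counts as "ordinary" here, so
characters that should not be preceded by a break would need their own
class and no token list for that transition.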
> The preprocessor+XeTeX can typeset Tibetan script very well
> except for one Tibetan punctuation mark, which should be changed to a
> different punctuation mark if it comes after the first word in a line.
I'm curious about this; I wasn't aware of this feature of Tibetan.
Could you give details of the character involved, and perhaps even
examples of the proper appearance in different contexts?
> My question is:
>
> 1. should I add a locale and LayoutEngine to ICU or should I change
> texk/web2c/xetexdir/XeTeXLayoutEngine.* to add this ability to XeTeX?
I'm not sure what the best approach might be, at least until I have a
clearer understanding of the problem. If there is a need to treat the
first word in a line specially, I don't think this can be done just
at the layout-engine level, because the layout engine deals with each
word in isolation, and is not aware of its position on the line.
> 2. how good is XeTeX's support for the Unicode way of typesetting
> complex scripts, in which all the relevant parts of a character are
> stacked on the fly?
This should work using either AAT fonts (on Mac OS X only) or
OpenType fonts that support the Tibetan script, OpenType tag =
'tibt'. (Or it would be possible to use the Graphite rendering
technology, but I am not aware of any available Tibetan fonts using
Graphite.)
In a quick test using an OpenType font, it seemed to be necessary to
specify some additional features; otherwise vowels didn't stack
properly:
\font\tibfont="Tibetan Machine Uni:script=tibt;+abvs;+blws" at 11pt
\tibfont བོད་ཀྱི་རྒྱལ་ཁམས་
ན། ...etc...
This may mean that the Tibetan shaping engine currently in ICU
doesn't fully support the Tibetan OpenType specification. Or it might
mean that the font doesn't fully conform. But I have not looked into
the issue in detail.
JK