[XeTeX] xunicode.sty -- pinyin and TIPA shortcuts
Robert Spence
spence at saar.de
Thu Apr 6 23:03:25 CEST 2006
Dear Ross,
On 05 Apr 2006, at 01:36 , Ross Moore wrote:
> It's quite possible that xunicode.sty has some errors, where
> I didn't make the best choice. It's up to the experts in appropriate
> fields to inform me (or Jonathan) of any such mistakes.
There might be a couple of mistakes in the implementation of the TIPA
commands, but I haven't checked them all systematically yet, only the
ones I need for doing basic transcriptions for English. I haven't
used TIPA a great deal since I stopped teaching phonetics classes
regularly, but if I can make some time I'll go through everything in
the tipa.sty package and try to get a better overview of what's
currently implemented in xunicode.sty, and what might perhaps have to
be done in an additional separate package, if it can be done easily
at all. The user base is probably still a bit small, but I assume
that it won't be too much longer before someone does something
similar to (and hopefully compatible with) XeTeX on another platform.
> (Jonathan Kew:)
>> > It might make good sense to have a little "unicode-pinyin" package
>> > that gives you more convenient ways to access characters used in
>> > Pinyin transcription, such as using v as a shorthand for
>> > u-dieresis.
>> > It just doesn't belong in xunicode.sty, IMO.
>>
>> Point taken. I was thinking of writing something like that, a
>> kind of patch to run after loading xunicode.sty---just hadn't got
>> around to it, and still don't understand all the implications of
>> what's in xunicode.sty anyway.
>
> Yes. Any variation that reflects a particular usage that
> need not be 'Universal' should be done by loading a package
> *after* xunicode.sty itself.
>
> By all means, use the same commands that xunicode.sty uses
> to declare the associations between macros and Unicode points.
>
> Examples of the main commands are:
>
> \DeclareUTFcharacter[\UTFencname]{x01CC}{\nj}
> \DeclareUTFcomposite[\UTFencname]{x01CD}{\v}{A}
> \DeclareEncodedCompositeCharacter{\UTFencname}{\u}{0306}{02D8} % Combining breve
> \DeclareEncodedCompositeAccents{\UTFencname}{\texthookcircum}{0309}{0302}
>
> Note that these commands set up associations that depend on the
> encoding, via the [\UTFencname] optional parameter.
>
>
> There are also \UndeclareUTFcharacter and \UndeclareUTFcomposite,
> which can be used to cancel declarations when the code points are
> not supported in the font being used.
>
> e.g.
> \UndeclareUTFcomposite[Pin]{x01DA}{\v}{\"u}
>
> would allow u\char"0308\char"030C to be used instead of \char"01DA
> when using encoding Pin.
>
>
> More generally,
> \UndeclareUTFcharacter[Pin]{x01CC}{\nj}
>
> would mean that (Xe)LaTeX might throw up a warning message when \nj
> is used with encoding 'Pin' , rather than inserting \char"01CC
> assuming (wrongly) that there is support in the font for it.
> Without the \Undeclare... the only way to know that there's a
> possible problem is to notice characters missing from the final PDF
> output.
>
>
> Implicit here is that, if you are going to make non-universal
> declarations, then it's a good idea to:
> 1. *change the encoding* first
> 2. load the xunicode.sty package
> 3. make your changes in the encoding
>
> e.g.
> \newcommand{\UTFencname}{Pin}
> \usepackage{xunicode}
> \usepackage{unicode-pinyin}
Ah, now I understand! When I first saw the lines in xunicode.sty
about changing the encoding I didn't have enough background knowledge
to interpret them properly.
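For concreteness, the three-step load order above could be sketched as
follows. This is only an illustration: unicode-pinyin.sty does not exist
yet, its contents here are an assumption, and the single \Undeclare line
is simply the example quoted from Ross's message.

```latex
% ---- file: unicode-pinyin.sty (hypothetical; only the name comes from this thread) ----
\NeedsTeXFormat{LaTeX2e}
\ProvidesPackage{unicode-pinyin}[2006/04/06 sketch: Pinyin adjustments for xunicode]
% Assumption: xunicode.sty is already loaded and \UTFencname expands to Pin.
% Cancel the precomposed U+01DA so that \v{\"u} is built from
% u + U+0308 + U+030C instead (declaration quoted from the message above).
\UndeclareUTFcomposite[Pin]{x01DA}{\v}{\"u}

% ---- file: document preamble ----
\newcommand{\UTFencname}{Pin}   % 1. change the encoding first
\usepackage{xunicode}           % 2. load xunicode.sty
\usepackage{unicode-pinyin}     % 3. make the encoding-specific changes
```

Keeping the Pinyin-specific lines in their own file means xunicode.sty
itself stays untouched, which is the point Jonathan and Ross both made.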
This could be a real Pandora's Box... you could end up with hundreds
of special, local encodings (like the Pin encoding for unicode-pinyin
you suggest). Although I probably already have the skills I'd need
to reimplement Werner Lemberg's macros from pinyin.sty for unicode
fonts---and it would be a lot easier with xunicode.sty syntax than
the way he had to do it---I think it would be better to get some
feedback from sinologists first about their workflows, and maybe
check with the Chinese ministry of education (or whoever is
ultimately responsible) about the official standards.
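As a rough idea of what such a reimplementation might contain, here is
a minimal sketch of a tone-number helper for u-dieresis. The macro name
\pinyinyu is invented for this example and is not taken from pinyin.sty,
and the sketch assumes that the stacked accents \={\"u}, \'{\"u} and
\`{\"u} are handled by xunicode.sty in the same way as the \v{\"u}
composite mentioned above; I have not checked that.

```latex
% Hypothetical helper, not part of pinyin.sty or xunicode.sty.
% \pinyinyu{<n>} gives u-dieresis with tone <n> (1-4); 0 means no tone mark.
\newcommand*{\pinyinyu}[1]{%
  \ifcase#1\relax
    \"u%        0: plain u-dieresis
  \or \={\"u}%  1: macron (first tone)
  \or \'{\"u}%  2: acute  (second tone)
  \or \v{\"u}%  3: caron  (third tone)
  \or \`{\"u}%  4: grave  (fourth tone)
  \fi}

% Usage in running text: l\pinyinyu{3} for "lü" in the third tone.
```

Whether sinologists would actually want to type tones as numbers is
exactly the kind of workflow question that needs asking first.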
I'm pretty sure the Chinese government prefers the upright italic
shape for lowercase a and g in pinyin, but I'm not sure how strict
this is. There's an English-to-Chinese dictionary from Singapore
printed in a font that doesn't have upright italic glyph shapes, and
the publishers were obviously so against the idea of using the normal
roman a in pinyin transcriptions that they decided to use the _real_
italic a, in the middle of text set in roman! A typographic
nightmare! On the other hand, they don't make a fuss about the g at
all, and just use the roman (pair-of-spectacles-rotated-ninety-
degrees) shape. It may not be so much of an issue any more now that
the PRC has decided to go with simplified characters rather than
abolishing them in favour of Pinyin romanization, but given the sheer
size of the speech community involved...
Thanks for your time,
-- Rob Spence