[XeTeX] xunicode.sty -- pinyin and TIPA shortcuts
Robert Spence
spence at saar.de
Tue Apr 4 05:54:26 CEST 2006
Dear XeTeXnicians,
These are two "sorcerer's apprentice" type questions; I've only been
using XeTeX for a few days (---but what a wonderful few days they've
been!---), and I do hope I'm not guilty here of "rushing in where
angels fear to tread"...
QUESTION 1) Pinyin Keyboarding Shortcuts (cf. the thread "[XeTeX]
Chinese: vertical typesetting, pinyin tones, and Japanese macrons?"
from mid-July 2005)
Might it perhaps be an idea in a future version of xunicode.sty to
change line 1131 of version 0.5 [2005/02/26] from
\DeclareUTFcomposite[\UTFencname]{x01DA}{\v}{\"u}
to
\DeclareUTFcomposite[\UTFencname]{x01DA}{\v}{v}
?
REASONING:
a) I find the easiest way of getting tone marks for pinyin is to use
old-style keyboarding shortcuts like \=a \'a \v{a} \`a with an
appropriate local font setting, e.g.
\newfontinstance\pinyin{STHeiti}% gives a nicer lowercase g than STKaiti
\pinyin
(and I guess the line
\defaultfontfeatures{Mapping=tex-text}
in the preamble is also pulling its weight here...).
b) I can't seem to access LATIN SMALL LETTER U WITH DIAERESIS AND
CARON (Unicode 01DA) via the current shortcut \v{\"u} (which puts the
caron _after_ the dieresised u), and am too lazy to type
\textdieresiscaron{u} each time, but I discovered that if I put
\DeclareUTFcomposite[\UTFencname]{x01DA}{\v}{v}
in the preamble (or even in the body) of my document then I can use
the shortcut \v{v} without any problems. (Am I missing something
with \v{\"u}? I tried just about every other variation I could think
of, but to no avail.)
c) The Chinese themselves routinely use a lowercase v to stand for a
lowercase u with dieresis, for example in internet addresses; it
makes sense, because v is the only letter of the Roman alphabet they
don't need, and they only have one sound for which they don't have a
simple Roman letter without (non-tone) diacritic available; the fact
that v comes directly after u in the Roman alphabet is an added
bonus, because the letter you're trying to typeset is conceptually
like "u, _plus_ something"; and in any case, it was only very
recently (by Chinese time-measuring standards) that u and v stopped
being just variant forms of the same letter...
d) v is used for u-with-dieresis in the macros in Werner Lemberg's
pinyin.sty
(/usr/local/teTeX/share/texmf.local/tex/latex/cjk/texinput/pinyin.sty),
allowing e.g. \nv3 to be used to get n plus
u-with-dieresis-and-caron in LaTeX --- although whether the result is
acceptable depends on the font you're using; as Werner Lemberg says
at lines 223--224 of pinyin.sty [ Version 4.6.0 (11-Aug-2005) ]:
% the previous definitions are almost trivial. The only tricky macro is the
% following one.
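
For concreteness, the fragments from a) and b) might be assembled into a
minimal test file roughly as follows. This is only a sketch: it assumes
XeLaTeX with fontspec and xunicode 0.5, assumes the STHeiti font is
installed, and has not been checked against other versions. The sample
syllables are made up for illustration.

```latex
% Minimal sketch for Question 1 (assumptions: XeLaTeX, fontspec,
% xunicode 0.5, STHeiti installed).
\documentclass{article}
\usepackage{fontspec}
\usepackage{xunicode}
\defaultfontfeatures{Mapping=tex-text}
\newfontinstance\pinyin{STHeiti}% gives a nicer lowercase g than STKaiti

% Local override proposed above: let \v{v} produce U+01DA
% (LATIN SMALL LETTER U WITH DIAERESIS AND CARON).
\DeclareUTFcomposite[\UTFencname]{x01DA}{\v}{v}

\begin{document}
% The four old-style tone shortcuts from a), then the new \v{v} from b).
{\pinyin m\=a m\'a m\v{a} m\`a \quad n\v{v}}
\end{document}
```

If the override takes effect, the last syllable should show a single
precomposed glyph rather than a dieresised u followed by a stray caron.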
QUESTION 2) TIPA Keyboarding Shortcuts (I think this relates to a
thread initiated by Ross Moore in late July 2004:
[XeTeX] New feature request for XeTeX
where there was a discussion about "active characters versus encoding
mappings", which I only vaguely understand the implications of):
Is it acceptable or advisable, as a temporary workaround to a problem
I encountered, to make the following changes to a working copy of
xunicode.sty (version 0.5 [2005/02/26]) in my "home" texmf tree, in
order to get all the TIPA shortcuts working? The following lines
appeared in Terminal when I ran the "diff" command (which I confess
to never having used before in my life until a few moments ago!) on
the original file and my altered version of it:
702,703c702,703
< \def 2{\textezh}%
< \def 3{\textvarepsilon}%
---
> \def 2{\textturnv}%
> \def 3{\textrevepsilon}%NOT VAR
1374c1374
< %\DeclareUTFcharacter[\UTFencname]{x028A}{\textscupsilon} % TIPA-U
---
> \DeclareUTFcharacter[\UTFencname]{x028A}{\textscupsilon} % TIPA-U
1391c1391
< \DeclareUTFcharacter[\UTFencname]{x0292}{\ezh} % TIPA-Z
---
> \DeclareUTFcharacter[\UTFencname]{x0292}{\textezh} % TIPA-Z
REASONING:
With these changes in place I can (I think) access all the TIPA
characters I need, in the argument of a \textipa{...} command [*but
see footnote], via the "active characters" strategy, using the
old-fashioned keyboarding habits that phoneticians are used to having to
resort to when sending emails. I was a bit worried about the fact
that the uppercase U as an active character had been commented out at
line 1374 --- I thought it might have been done to avoid a nasty
clash in some potential situation where U was already active and was
being used for some other (more important) purpose.
*footnote:
One thing I've come to appreciate over the past few days of XeTeXing
is that doing something like {\anyoldcommand ...} is often more
robust than doing the corresponding \textanyoldcommand{...}. I
discovered this while I was trying to find a way of typesetting
phonetic transcriptions in colo(u)r---for teaching purposes, as
otherwise the phonetic symbols seemed to blend in just a bit too
seamlessly with the surrounding text when using the beautiful Gentium
font. I had defined a colour called WSPRgreen (are those Will
Robertson's initials, by any chance?), then found I had to rename it
to wsprgreen if I wanted to use it _inside_ the argument of a
\textipa{...} command, because all uppercase characters were active! And
when I tried
\textcolor{WSPRgreen}{\textipa{...}}
it (of course?) caused me to lose the phonetic encoding, so I had to
learn to write
{\color{WSPRgreen} \textipa{...}}
instead. All highly educational. Just a thought: on page 13 of
tipaman.pdf, Fukui Rei defines three ways of telling LaTeX that you
want phonetics:
\textipa{...}
{\tipaencoding ...}
\begin{IPA} ... \end{IPA}
Of these, only the first is implemented in xunicode.sty; would there
be any point in trying to implement either the
"argumentless-command-within-a-group" or the "environment" solution,
instead of or in addition to the "command-with-argument" one? (Way out
of my depth here, but FWIW...)
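
The colour workaround described in this footnote could be tried in a
small file along these lines. Again this is only a sketch under stated
assumptions: XeLaTeX with fontspec, xunicode and the color package, the
Gentium font installed under that name, and an RGB value for the green
that is purely a placeholder. The sample transcription uses the
standard TIPA shortcuts @, E and I, which are assumed to be supported
by the \textipa implementation in xunicode.

```latex
% Minimal sketch for the footnote (assumptions: XeLaTeX, fontspec,
% xunicode, color, Gentium installed; RGB values are placeholders).
\documentclass{article}
\usepackage{fontspec}
\usepackage{xunicode}
\usepackage{color}
\setromanfont{Gentium}

% All-lowercase colour name, so that it would also survive inside
% \textipa{...}, where uppercase letters are active.
\definecolor{wsprgreen}{rgb}{0,0.5,0}

\begin{document}
% Group-plus-declaration form, which kept the phonetic encoding:
Ordinary text, then {\color{wsprgreen}\textipa{f@nEtIks}} in colour.

% The command-with-argument form that lost the encoding, for comparison:
% \textcolor{wsprgreen}{\textipa{f@nEtIks}}
\end{document}
```

Uncommenting the last line and comparing the two outputs side by side
would show whether the behaviour described above is reproducible with a
given xunicode version.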
I hope the kind of old-fashioned keyboarding habits underlying both
of my questions aren't too much of an annoyance to the project
developers. In the short time since I started using XeTeX I've
realized that it's better to avoid anything that even remotely
involves the fontenc, inputenc, and babel packages, and just type
into your document the Unicode characters you want to typeset,
changing the keyboard layout as necessary and using the Keyboard
Viewer to help train new keyboarding habits. So far I've found I can
do this well enough for switching between English, German, French,
Russian, Hebrew, and Greek (although the LGR shortcuts described in
9.4.2 of The LaTeX Companion, 2nd ed., were nicer), and for Chinese
characters it's fairly easy to use the ITABC input method, but there
doesn't seem to be a phonetics keyboard available with Mac OS X
10.4.5 (and in any case, the solution would probably need to be more
like one of the Chinese Input Methods, where pressing one or more
keys gets you a list of relevant characters and you select the one
you want).
My sincere thanks to everyone involved in the XeTeX project. I
really appreciate what you're doing, and will try to contribute in
whatever ways I can. Please bear with me until I have a bit more of
a grasp of it all! (BTW: the ability that fontspec.sty gives you to
play around with so many---usually fairly tasteless---combinations of
all those beautiful OSX system fonts makes me appreciate what an
excellent job GTA did in selecting the default combinations for
gtamacfonts, despite what one may or may not think about the best
weight for Gill Sans with Hoefler Text in koma-script-style section
headings... ;-)
-- Robert Spence
Applied Linguistics
Saarland University
Germany