[XeTeX] xunicode.sty -- pinyin and TIPA shortcuts

Ross Moore ross at ics.mq.edu.au
Fri Apr 7 07:51:19 CEST 2006

Hi Will,

On 07/04/2006, at 1:14 PM, Will Robertson wrote:

> You're a bit unclear whether there is such a letter as "g that  
> looks like spectacles" vs. "g like you (or I, at least) write". If  
> they are separate characters, and have specific unicode characters,  
> then you'd be best entering in the correct glyphs in the source.
> If this is just a crappy font support thing (as in, you want to use  
> so-and-so font for typesetting phonetics but its "g" looks  
> incorrect for this usage), its definitely like a problem that must  
> be solved by the font vendors. Ideally in this case there would be  
> a font feature, activated by fontspec as something like  
> "\addfontfeature{style=tipag}".

Yep. I agree with this.
However, that doesn't mean that I made the right choice for   
xunicode.sty .
Maybe there was no "right" choice to make ?

> But I re-iterate, this would not be the way it should be done if  
> the two characters are distinct.

In Unicode, there should be a code-point associated with the concept
that the symbol is meant to represent. This should not depend upon the
font used to actually place the symbol onto the "printed page".
Similarly, the macro used to request this symbol, in the LaTeX source
of a manuscript, should be the same whether used in the context of a
roman face or italic one.

Thus  xunicode.sty  should just define a single macro to request the  
code-point corresponding to the concept. If it "looks wrong" in the  
output, then I'd say that the font-designer got it wrong. Maybe he/she
will realise the error and fix it in a revision of the font. However, this
should not affect the (La)TeX source of a document where the concept  
is used.

Of course this is the "ideal" position to adopt.
Reality can dictate something else be done, to make it "look right"
on paper (i.e., within a PDF).

> ...  -- I just don't understand what the purpose of xunicode is  
> exactly. It it's just to collect together a huge number of TeX  
> macros to easy unicode typesetting, then I guess it should contain  
> the TIPA stuff.

It was written primarily to provide compatibility with the source code
of existing documents. This is how Jonathan describes it in his
TUGboat article (TUGboat Vol. 26, No. 2 (2005), pp. 115--124),
based on the talk he gave at TUG 2005.

It is also for new documents created by authors using XeTeX for snappy
fonts, but still using the older (La)TeX techniques out of familiarity,
rather than learning how to directly insert characters in UTF8.
This is a legitimate way to work, IMHO, especially in collaboration
with others who may not have access to XeTeX.

> On the other hand, where do you draw the line between accessing  
> unicode characters/accents with glyphs and lots of intricate  
> shorthands?

Does such a line even need to be drawn ?

If the definitive version of the document is the final PDF,
then it doesn't matter what techniques were used to create it.

On the other hand, if it is the (La)TeX source that is to be
regarded as definitive, then drawing such a line might be useful.
For example, this is the case for a manuscript that is passed-on
to an editor for further preparation prior to inclusion in a
Journal, Series, or other Proceedings-like publication.

> Will

There is another side to  xunicode.sty  that hasn't been fully
developed yet:

   it doesn't just simply define some LaTeX macros,
   rather it specifies how some macros should be treated
   "according to the specified font-encoding" (usually 'U').

That is, the way the "Declarations" work in  xunicode.sty  is such
that you can easily use different encodings in different portions
of your document, and the macro-expansions will adapt.

e.g.   \v{Z}  can produce the  Zcaron  in different ways, according
to whether the encoding is 'U', 'TS2' (for Cyrillics), 'T1' or 'OT1'.
All these can co-exist in the same document.
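A minimal sketch of what that co-existence looks like in source (the
font setup is omitted, and the glyph comments describe the usual
T1/OT1 behaviour rather than anything specific to  xunicode.sty ):

```latex
% The same \v{Z} in the source is resolved differently
% by each encoding's declarations.
\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
{\fontencoding{T1}\selectfont  \v{Z}}  % precomposed Zcaron glyph slot
{\fontencoding{OT1}\selectfont \v{Z}}  % caron accent stacked over Z
\end{document}
```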

Of course, a user may also define their own customised encoding,
for whatever purpose. (e.g. 'Pin' in a different thread)
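A hedged sketch of what such a customised encoding might look like,
using the standard LaTeX font-encoding commands ('Pin' is just the
name from that thread; the accent and slot numbers are illustrative
Unicode code points, not tested declarations):

```latex
% Declare the custom encoding, then tell LaTeX how the
% macron accent should behave within it:
\DeclareFontEncoding{Pin}{}{}
\DeclareTextAccent{\=}{Pin}{"0304}       % combining macron
% for pinyin, prefer the precomposed letter when one exists:
\DeclareTextComposite{\=}{Pin}{u}{"016B} % U+016B, u with macron
```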

If this is done, then it may become desirable to re-read  xunicode.sty
several times, so as to 'Declare' the desired behaviour of the macros
with the different encodings.
This is different from what normally happens with LaTeX packages,
where a package only ever needs to be read once to get its definitions.

At present this re-reading doesn't work properly.
For instance, LaTeX will barf on the few \newcommand definitions near
the beginning of the package when it is \input a second time.
Since these don't need to be re-defined, I should devise a new version
which allows smooth re-reading.
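One way to get that smooth re-reading, sketched with a hypothetical
macro name (\xuHelper is not a real  xunicode.sty  macro):
\providecommand defines the macro on the first reading and is a
silent no-op on later readings, so LaTeX never complains about
redefinition.

```latex
% First \input: defines \xuHelper.
% Later \input of the same file: silently skipped.
\providecommand{\xuHelper}{\relax}

% An equivalent explicit guard (\ifdefined is e-TeX,
% so it is available in XeTeX):
\ifdefined\xuHelper \else
  \newcommand{\xuHelper}{\relax}
\fi
```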

Alternatively, all the \Declare...  lines could be moved into separate
files, devoted to particular ranges of Unicode points.
This would be more like Babel, with its .ldf  files.

I don't know whether the above has made it any clearer for anyone
on this list. It certainly should indicate that there are many
directions in typesetting that can be explored --- both in terms
of how authors indicate what they want, and in linking to fonts
to actually provide a representation of it.



Ross Moore                                         ross at maths.mq.edu.au
Mathematics Department                             office: E7A-419
Macquarie University                               tel: +61 +2 9850 8955
Sydney, Australia  2109                            fax: +61 +2 9850 8114
