[XeTeX] A small tip

Ross Moore ross at ics.mq.edu.au
Fri Nov 18 03:32:59 CET 2005

Hi Will,

On 18/11/2005, at 11:47 AM, Will Robertson wrote:

>> These are quite old -- all from 2001.
>> What versions do you have ?
> (/usr/local/teTeX/share/texmf.tetex/tex/latex/base/inputenc.sty
> \ProvidesPackage{inputenc}
>    [2004/02/05 v1.0d Input encoding file]
> (/usr/local/teTeX/share/texmf.tetex/tex/latex/base/utf8.def
> \ProvidesFile{utf8.def}
>    [2004/02/09 v1.1b UTF-8 support for inputenc]
> And ucs.sty is not even loaded. I recall reading that utf8 encoding  
> in inputenc had deprecated it to a certain extent, but I can't  
> remember the details. (And utf8x goes even further...)
> Still, I'm very surprised that they would have changed from hex to  
> decimal like that. I wonder if that was somehow related to ucs?

Yes,  \DeclareUnicodeCharacter  is defined within  ucs.sty  on my  

Looks like I need to update  inputenc.sty
and all its support files, to bypass that older method.
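
To make the hex-versus-decimal point concrete, here is a small sketch. It assumes that inputenc's own utf8.def reads the code point as hexadecimal, and that ucs.sty reads it as an ordinary TeX number (so decimal by default); check the ucs.sty on your own system before relying on the second form. The choice of U+2260 and its definition are purely illustrative.

```latex
\documentclass{article}

% Variant A: inputenc's own utf8.def (ucs.sty is not loaded).
% The code point is written in hexadecimal: U+2260 is the not-equal sign.
\usepackage[utf8]{inputenc}
\DeclareUnicodeCharacter{2260}{\ensuremath{\neq}}

% Variant B (assumption; use INSTEAD of Variant A): ucs.sty with utf8x.
% The same code point would be given in decimal: hex 2260 = decimal 8800.
%\usepackage{ucs}
%\usepackage[utf8x]{inputenc}
%\DeclareUnicodeCharacter{8800}{\ensuremath{\neq}}

\begin{document}
$a$ ≠ $b$
\end{document}
```

The two variants should not be mixed in one preamble, since which definition of \DeclareUnicodeCharacter is active depends on which package was loaded last.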

>>> On a related note, I thought xunicode might provide an equivalent  
>>> to inputenc's \DeclareUnicodeCharacter,
>>> but it seems that xunicode's \DeclareUTFcharacter is for a  
>>> different purpose (at least, I couldn't get it to do what I expect).
>>   Xunicode's \DeclareUTFcharacter  translates *into* Unicode for  
>> the output.
>>   Inputenc's \DeclareUnicodeCharacter  translates *out of* UTF8  
>> into a TeX macro.
> That's how I ended up assuming it must work. In retrospect,  
> xunicode's is perhaps a confusing name for that command, then.  
> Would \DeclareUTFmacro be better?

No, because its purpose is not so much declaring/defining the macro.
It is meant to be a "compatibility" package for existing LaTeX  
And it is meant to change the output into Unicode code-points, rather
than relying on legacy-encoded fonts.

If your document was previously written to use macros, defined in other
packages such as the  inputenc.sty  modules, that either access
or emulate non-ASCII characters, then you no longer need those packages
with XeTeX.

The  \DeclareUTFcharacter  produces the UTF8 (Unicode) point for
a character, using the same macro-name that you were already using.
Furthermore, there is no documentation on these macro names, since
ideally *all of them* are deprecated --- ultimately the strategy
for new documents with XeTeX is to type the UTF characters directly
into the document source.

Furthermore, \DeclareUTFcharacter  does its work "nicely".
That is, you only get the UTF8 character when the current (font) encoding
is 'U'. So you need to use something like what  \setromanfont
does, to state that you have a font with more capabilities than
the old 8-bit fonts used by traditional TeX systems.
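
A minimal sketch of such a document, assuming a XeLaTeX setup in which  fontspec  provides  \setromanfont  and  xunicode  is loaded explicitly. The font name is only a placeholder; substitute any Unicode font installed on your system.

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage{fontspec}   % provides \setromanfont
\usepackage{xunicode}   % maps legacy macro names to Unicode code points
\setromanfont{Times New Roman}  % assumption: any installed Unicode font

\begin{document}
% Legacy source: macro names and accent commands.
\AE sop, caf\'e \textendash{} \oe uvre

% New source: the same text typed directly as UTF-8.
Æsop, café – œuvre
\end{document}
```

Both lines of the body are intended to produce the same output, which is why an existing document should not need its body rewritten.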

  xunicode  does not supply these fonts, nor access to such fonts,
whereas the inputenc-modules *do* supply both macros and the way
to access requisite fonts.

The point is that you want to be able to switch fonts, with
whatever implications this has for encodings, without having
to alter the body of your document.
This is possible using the (robust) macro-name strategy: the same
macro-name expands differently according to the currently required
encoding. The same strategy applies with the PD1 encoding, such as
I described in another part of my previous email ...
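
A sketch of what "different expansion according to the encoding" looks like. The first two lines have the shape of the declarations in LaTeX's standard OT1 and T1 encoding files; the third has the shape I believe is used in  xunicode.sty , with  \UTFencname  assumed to hold the encoding name 'U'. This is a preamble fragment for illustration, not something you need to add to a document, and the third line requires  xunicode  to be loaded.

```latex
% Same macro name, three encodings, three different results.
\DeclareTextSymbol{\ae}{OT1}{26}   % slot 26 in an OT1 (Computer Modern) font
\DeclareTextSymbol{\ae}{T1}{230}   % slot 230 in a T1 (Cork) font
\DeclareUTFcharacter[\UTFencname]{x00E6}{\ae}  % U+00E6 when the encoding is 'U'

% The document body never changes:
%   \ae sthetic
% Only the font selection in the preamble decides which declaration applies.
```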

>>> (I'll never remember when to do hex and when to do decimal...is  
>>> the preceding "x" supposed to remind me? Why is it there, out of  
>>> curiosity?)
>> Yes; it's just a reminder that this is hex, not decimal.
>> It plays no role at all in the processing.
> Okay...I suppose because I can never remember which to use,  
> standardising on a single number base for referring to glyph slots  
> is a good idea.
>>> Well, it wouldn't be so bad (are all the kerns and nobreaks,  
>>> etc., stripped out by hyperref?),
>> No, it doesn't work that way.

    ... which you didn't include here !

> Oh, of course! That's a really nice solution. (Stripping out TeX  
> primitives sounds very unpleasant, actually.)

>>>> ([hxetex.def]'s something that I'll try to provide sometime.)
>>> Let me know if you need help...I can't promise anything, of  
>>> course :)

I need help in finding the time to look at this!
It's exam-marking time here.

>>> Speaking of your packages, I wonder if it would be useful to  
>>> provide a "lean" version of xunicode as a package option so as to  
>>> only load the necessary portions of the unicode characters you  
>>> define.
>> Not in the near future, sorry.
>> There are things missing from XeTeX support which have a higher
>> priority than reorganising what *is* available.
> Would it be a good idea to talk about the missing things so that  
> people have an idea of what is going on? I'd be happy to help out  
> some, as I hope I've demonstrated. I suspect the amount of time you  
> put into xunicode is very much under-appreciated, not to mention  
> color and graphicx support, which "just work" without even people  
> thinking about it.

Well, on researching what's going on with  ucs.sty  I discovered
the package-support file  ucsencs.def

\ProvidesFile{ucsencs.def}[2002/02/16 Fixes to fontencodings LGR,  
LHE, T3]

This gives macro-names for a lot more characters, such as modern Greek
and Hebrew.

These *should* be added to  xunicode.sty .
But in doing so, it would indeed be useful to have a modular structure,
with options for loading just those bits which are needed.
This is what you asked for, so I may up its priority.
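
One way such a modular structure could be arranged is with ordinary LaTeX package options. The sketch below is purely hypothetical: the package name  xunimod , the option names and the  .def  file names are all invented for illustration, and none of them exist in  xunicode .

```latex
% xunimod.sty -- hypothetical package, NOT part of xunicode
\NeedsTeXFormat{LaTeX2e}
\ProvidesPackage{xunimod}[2005/11/18 v0.0 sketch of option-driven loading]

% One switch per optional block of character declarations.
\newif\ifxum@greek
\newif\ifxum@hebrew

\DeclareOption{greek}{\xum@greektrue}
\DeclareOption{hebrew}{\xum@hebrewtrue}
\ProcessOptions\relax

% Load only the blocks that were requested (file names are hypothetical).
\ifxum@greek  \input{xum-greek.def}\fi
\ifxum@hebrew \input{xum-hebrew.def}\fi
\endinput
```

A document would then request just what it needs, for example with  \usepackage[greek]{xunimod} .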



> Will

Ross Moore                                         ross at maths.mq.edu.au
Mathematics Department                             office: E7A-419
Macquarie University                               tel: +61 +2 9850 8955
Sydney, Australia  2109                            fax: +61 +2 9850 8114
