[XeTeX] Encoding of auxiliary files

Jonathan Kew jonathan_kew at sil.org
Wed Mar 12 13:41:00 CET 2008


On 12 Mar 2008, at 11:45 am, Ulrike Fischer wrote:

> On Wed, 12 Mar 2008 09:57:28 +0000, Jonathan Kew wrote:
>
>>> As far as I can see, Xe(La)TeX always writes auxiliary files such as
>>> the .aux and the .toc file in UTF-8. Is this true?
>>
>> Yes.
>>
>>> If so, I think the \XeTeXdefaultencoding command is a bit useless, as
>>> you will run into trouble if the auxiliary files contain characters
>>> outside the ASCII range (which is quite probable in the case of
>>> .toc).
>
>> Right; this is quite limited. The main reason it exists is for cases
>> where you need to read an existing file that uses a legacy encoding,
>> and you can't modify the actual input file to declare the proper
>> \XeTeXinputencoding, so you need to set the encoding before giving
>> the \input command.
>
> Yes, that makes sense. Do I understand correctly that the setting is
> global, so I would have to reset it after the input?

Yes.
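
A minimal sketch of that pattern, assuming a hypothetical file
legacy.tex saved in cp1252 (the file name is only an illustration):

   % applies to files opened from here on
   \XeTeXdefaultencoding "cp1252"\relax
   \input legacy.tex
   % restore the default before any UTF-8 file (.aux, .toc) is read
   \XeTeXdefaultencoding "auto"\relax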

> Btw: Is it correct that the following code
>
> \documentclass{scrreprt}
> \usepackage{fontspec}
> \begin{document}
> {\XeTeXdefaultencoding "cp1252"
>  \XeTeXdefaultencoding "auto"}
>
> test
> \end{document}
>
> gives the message
> ### simple group (level 1) entered at line 5 ({)
> ### bottom level
>
> ?
> (It works fine if I add a \relax after the "auto").

It is correct, though admittedly a little surprising. The issue here  
is that encoding names (which are treated like filenames as far as  
TeX's scanner is concerned) need to be terminated somehow. The quotes  
do not necessarily delimit them, because it's possible for a name to  
be constructed from several quoted fragments, as in

   \def\name{"file name"}
   \def\ext{".txt"}
   \input \name\ext

which should read "file name.txt", despite the scanner seeing "file  
name"".txt".

A space (outside the quotes) would be adequate to terminate the name,  
though \relax may be nice in that it's more visible.
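
Applied to the example above, either of these forms should let the
closing brace end the group (a sketch; only the \relax variant was
reported as working in this thread). Without a terminator the "}" is,
as far as I understand the scanner, simply taken as part of the name:

   % terminated by a space outside the quotes
   {\XeTeXdefaultencoding "cp1252"
    \XeTeXdefaultencoding "auto" }

   % terminated by \relax
   {\XeTeXdefaultencoding "cp1252"
    \XeTeXdefaultencoding "auto"\relax}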

This is really a manifestation of the same "surprise" as you get if  
you try (in either pdftex or xetex) to say

   \setbox0=\vbox{\input filename}

expecting the text from "filename" to be set in a box.
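
Here the "}" would, as I understand it, be scanned as part of the file
name, so the box is never closed. A sketch of the terminated forms,
with "filename" standing in for whatever file you want to read:

   % a space ends the name, so "}" closes the box
   \setbox0=\vbox{\input filename }

   % \relax does the same and is easier to spot
   \setbox0=\vbox{\input filename\relax}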

The lesson: always provide a space or \relax to terminate the file  
(or font or encoding) name.

JK


