[XeTeX] xetex file organization
jonathan_kew at sil.org
Wed Nov 3 16:48:35 CET 2004
On 3 Nov 2004, at 3:04 pm, Bruno Voisin wrote:
> All the rest looks fine to me, but again I'm not a specialist. That
>> (Unicode-compatible versions of hyphenation files;
>> these are designed to still work with standard TeX as well)
> Hopefully at some point in the future, when/if the capability of
> reading files in a specific encoding is added to XeTeX, this directory
> would become unnecessary (as well as the modified version of url.sty
> in texmf.gwtex).
Actually, I've now implemented support for reading files in non-Unicode
encodings (but haven't released a version including this yet). So you
know what's coming, you can say:
    \XeTeXinputencoding "encoding-name"
(where "encoding-name" is scanned like a filename by XeTeX, with
optional quotes). The "encoding-name" can be one of a set of built-in
names:
    auto   (the default setting, auto-detects utf8 or utf16 files)
    utf16  (platform-native utf16, i.e., big-endian on Mac OS X)
    bytes  (reads individual bytes directly as character codes 0..255)
or it can be an "internet encoding name" recognized by the Mac OS Text
Encoding Converter; so you can say things like:
etc., and the text will be converted from that encoding to Unicode as
the file is read.
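As a rough illustration of the forms described above (a sketch only, since this feature has not been released yet): the built-in names are the ones listed, while "iso-8859-1" is assumed here as an example of an internet encoding name that the Text Encoding Converter would accept.

```tex
% Sketch, assuming the syntax described in this message.
\XeTeXinputencoding auto          % default: auto-detect utf8 or utf16
\XeTeXinputencoding "bytes"       % quotes are optional; bytes map to codes 0..255
\XeTeXinputencoding "iso-8859-1"  % assumed example of an internet encoding name
```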
The encoding used to read a file (either \input or \openin) is
determined at the time the file is opened; it can't be changed on the
fly.
Note that it may still be necessary to adapt hyphenation files, though,
as many of them are written in terms of specific legacy encodings using
TeX-level mechanisms (active characters, ^^xx sequences, etc.). These
mechanisms won't be affected by the \XeTeXinputencoding setting.
Although such files can safely be read by XeTeX, they may not provide
the appropriate hyphenation rules for text that actually uses the
Unicode character codes for the given language.
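To illustrate with a made-up fragment (not taken from any real hyphenation file): the ^^a1 notation below is resolved by TeX itself to character code "A1, whatever encoding the file is read in. In the T1 font encoding that slot holds a-ogonek, but in Unicode U+00A1 is the inverted exclamation mark and a-ogonek is U+0105, so a pattern written this way would not match Unicode text.

```tex
% Hypothetical pattern line, for illustration only.
% ^^a1 always yields character code "A1 (161); \XeTeXinputencoding
% only affects how the bytes of the file are decoded, not this notation.
\patterns{.k^^a1t1}
```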
In a case like url.sty, yes, I'd guess that simply reading it in Latin1
ought to solve the problems people have had. How best to ensure that
this happens is another question.... I'm still thinking about that.
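One possible shape for this, purely as a sketch: it assumes the setting applies to files opened after it (as described above), that "iso-8859-1" is accepted as the name for Latin1, and that switching back to auto afterwards is enough to restore the default. None of this is a settled mechanism, and the released primitive may behave differently.

```tex
% Sketch only, in a LaTeX preamble.
\XeTeXinputencoding "iso-8859-1" % assumed name for Latin1
\usepackage{url}                 % url.sty is opened while the setting is in effect
\XeTeXinputencoding auto         % return to the default for later files
```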