[XeTeX] How to use EC font encoding in XeTeX?
Jonathan Kew
jonathan_kew at sil.org
Fri May 19 11:44:18 CEST 2006
On 19 May 2006, at 1:21 am, Mojca Miklavec wrote:
> However, here's my next question (probably the answer will be another
> "no"):
Actually, the answer is "yes". :)
> Is it possible to use any other 8-bit input encoding (such as cp1250)
> or is only utf-8 currently supported? I tried to test it on some old
> documents, but the behaviour is completely random and depends on the
> surrounding characters.
The input is interpreted as UTF-8 by default, and so the bytes in
your cp1250 text are (mis)interpreted as being part of UTF-8
sequences, leading to apparently random results -- not truly random,
of course, but certainly not useful!
However, you can say
\XeTeXinputencoding "cp1250"
and the input will then be interpreted as codepage 1250 (and mapped
to the corresponding Unicode character codes for processing within
XeTeX).
To be precise, \XeTeXinputencoding will change the interpretation of
the input bytes beginning at the *next* line of the input file.
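For example, something along these lines (just a sketch; the middle
line stands in for whatever 8-bit text you actually have):

\XeTeXinputencoding "cp1250"
% from the next line onwards, input bytes are interpreted as cp1250
... text stored in codepage 1250 ...
\XeTeXinputencoding "utf8"
% and from the line after that command, input is read as UTF-8 again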
There is also \XeTeXdefaultencoding, which is similar, but it does
not affect the reading of the *current* file; instead, it changes the
initial encoding for any files that are *subsequently* opened.
Therefore, you can use this to get XeTeX to read an existing file in
a legacy encoding, without having to edit that file itself -- just
set the default encoding from a "driver" file before doing \input.
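A driver file along these lines should do it (untested sketch;
oldfile stands for your existing cp1250 document):

\XeTeXdefaultencoding "cp1250"
\input oldfile
\XeTeXdefaultencoding "utf8"

The driver itself is still read as UTF-8; only the files opened while
the default is cp1250 get the legacy interpretation.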
One caution about setting the default encoding: regardless of the
\XeTeX...encoding settings, any files that XeTeX writes (with \write)
will *always* be written as UTF-8. So if you're writing and then
reading auxiliary files from within the job, these will be UTF-8 even
if your main input text is a legacy codepage, and you may have to
take care to switch the default encoding back to utf8 before opening
such a file.
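Concretely, extending the sketch above (mylist stands for a
hypothetical file the job wrote earlier with \write):

\XeTeXdefaultencoding "cp1250"
\input oldfile      % legacy text, read as cp1250
\XeTeXdefaultencoding "utf8"
\input mylist       % produced with \write, therefore UTF-8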
> ConTeXt creates active characters for 128-255 when working with pdfTeX
> and maps the characters to \namedglyphs, which works OK there, but
> XeTeX handles input encoding completely differently. I roughly
> understand why it doesn't work here, but is there a trick to make
> XeTeX work with those 8-bit encodings as well?
There's another possibility, too: if you use
\XeTeXinputencoding "bytes"
then XeTeX will simply read the input text as byte values 0..255,
with no attempt to map them to Unicode according to any specific
codepage. (In practice, this is probably the same as using encoding
"8859-1" or "latin1".) If you do this, then ConTeXt's active-
character scheme for handling encodings within TeX macros should
still work, as you'll be getting the same character codes as standard
TeX would see.
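In plain-TeX terms, the effect is roughly this (a sketch only; the
definition is just a stand-in for whatever ConTeXt's encoding files
actually attach to byte "F0):

\XeTeXinputencoding "bytes"
\catcode`^^f0=\active
\def^^f0{\char"F0 }  % stand-in for the macro package's mapping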
> I might have something misconfigured or too old a version installed (I
> didn't install .992 yet), so I'm just curious: what do you get if you
> type
> \catcode`ð=\active \defð{^^f0}
> ð
> My version seems to have some problems with ^^f0.
It seems to work for me -- with font ec-lmr10, it prints the ð
character, as expected. What do you get?
JK