[XeTeX] How to use EC font encoding in XeTeX?

Mojca Miklavec mojca.miklavec.lists at gmail.com
Sun May 21 22:21:44 CEST 2006


On 5/19/06, Jonathan Kew wrote:
> On 19 May 2006, at 1:21 am, Mojca Miklavec wrote:
>
> > However here's my next question (probably there will be another no as
> > an answer):
>
> Actually, the answer is "yes". :)
>
> > Is it possible to use any other 8-bit input encoding (such as cp1250)
> > or is only utf-8 currently supported? I tried to test it on some old
> > documents, but the behaviour seems completely random and depends on
> > the surrounding characters.
>
> The input is interpreted as UTF-8 by default, and so the bytes in
> your cp1250 text are (mis)interpreted as being part of UTF-8
> sequences, leading to apparently random results -- not truly random,
> of course, but certainly not useful!
>
> However, you can say
>
>         \XeTeXinputencoding "cp1250"
>
> and input will then be interpreted as codepage 1250 (and mapped
> to the corresponding Unicode character codes for processing within
> XeTeX).
>
> To be precise, \XeTeXinputencoding will change the interpretation of
> the input bytes beginning at the *next* line of the input file.
>
> There is also \XeTeXdefaultencoding, which is similar, but it does
> not affect the reading of the *current* file; instead, it changes the
> initial encoding for any files that are *subsequently* opened.
> Therefore, you can use this to get XeTeX to read an existing file in
> a legacy encoding, without having to edit that file itself -- just
> set the default encoding from a "driver" file before doing \input.
>
> One caution about setting the default encoding: regardless of the
> \XeTeX...encoding settings, any files that XeTeX writes (with \write)
> will *always* be written as UTF-8. So if you're writing and then
> reading auxiliary files from within the job, these will be UTF-8 even
> if your main input text is a legacy codepage, and you may have to
> take care to switch the default encoding back to utf8 before opening
> such a file.
>
> > ConTeXt creates active characters for 128-255 when working with pdfTeX
> > and maps the characters to \namedglyphs, which works OK there, but
> > XeTeX handles input encoding completely differently. I roughly
> > understand why it doesn't work here, but is there a trick to make
> > XeTeX work with those 8-bit encodings as well?
>
> There's another possibility, too: if you use
>
>         \XeTeXinputencoding "bytes"
>
> then XeTeX will simply read the input text as byte values 0..255,
> with no attempt to map them to Unicode according to any specific
> codepage. (In practice, this is probably the same as using encoding
> "8859-1" or "latin1".) If you do this, then ConTeXt's active-
> character scheme for handling encodings within TeX macros should
> still work, as you'll be getting the same character codes as standard
> TeX would see.

Thank you for the very precise explanation. This time I should say
"shame on me", because I discovered that these commands are already
described in the documentation. (Some time ago I glanced through the
documentation and concluded that it covered so little that hardly any
question could be answered from it. So I assumed that reading the
source or asking on the list was still the only way to find out how
exactly XeTeX works, and in the meantime I forgot about that document.
Sorry again.)
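
For the archive, the driver-file approach described above might be
sketched roughly as follows. The file names driver.tex and olddoc.tex
are only placeholders, and I'm assuming a plain TeX job in which
olddoc.tex does not end the run itself:

```tex
% driver.tex (hypothetical name): wrapper around an unedited legacy file
\XeTeXdefaultencoding "cp1250"  % files opened from here on are read as cp1250
\input olddoc                   % olddoc.tex is an assumed cp1250-encoded file
\XeTeXdefaultencoding "utf8"    % restore the default for files opened later
\bye
```

If the legacy file reads back auxiliary files while it is being
processed, restoring the default after \input would be too late; the
switch to utf8 would have to happen before such a file is opened.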


> > I might have something misconfigured or too old a version installed (I
> > didn't install .992 yet), so I'm just curious: what do you get if you
> > type
> >     \catcode`ð=\active \defð{^^f0}
> >     ð
> > My version seems to have some problems with ^^f0.
>
> It seems to work for me -- with font ec-lmr10, it prints the ð
> character, as expected. What do you get?

It just "hangs" (stops) at that point: it neither gives any error
message nor processes any further. \eth works OK though (which
means that ConTeXt already did that conversion once) and any other
"hex number" works OK as well; it's just "f0" that causes problems.
Strange. I'll have to investigate it a bit further.

Thanks,
    Mojca

