[XeTeX] Converting legacy encodings to utf-8
Peter Heslin
pj at heslin.eclipse.co.uk
Mon Jul 10 23:41:35 CEST 2006
One of the issues involved in migrating to XeTeX is the
incompatibility between old documents in various legacy encodings and
new documents in utf-8. While it's possible to compile the
former with old TeX and the latter with XeTeX, there is a recurring
problem when you want to copy text from the former to the latter.
The solution I have come up with for this is to use Emacs to convert the
legacy encodings to utf-8. You see, Emacs comes with many "input
methods" which allow you to type \`a to get à (TeX input method) or h('|
to get ᾕ (greek-ibycus4) or <'h| to get ᾕ (greek-babel). There are also
input methods for Cyrillic and various Asian scripts, which correspond
quite exactly to various legacy 7-bit and 8-bit text encodings.
I wrote some code that takes advantage of these methods to translate
text in files rather than keystrokes, and I posted it to the emacs.devel
list. I would be happy to put it up on the web with a detailed
explanation of how to use it for non-Emacs people, if there is interest.
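To illustrate the general idea outside Emacs, here is a minimal Python
sketch of table-driven conversion. It is not the Emacs code mentioned
above, and it does not use Emacs input methods: the table
TEX_TO_UNICODE and the function convert are hypothetical names, and the
table holds only a handful of TeX accent sequences as an assumption for
the example. A real converter would need a complete table for each
legacy encoding.

```python
import unicodedata

# Hypothetical, deliberately tiny table mapping legacy TeX accent
# sequences to Unicode characters. A real table would be far larger.
TEX_TO_UNICODE = {
    "\\`a": "\u00e0",   # a with grave
    "\\'e": "\u00e9",   # e with acute
    '\\"u': "\u00fc",   # u with diaeresis
    "\\^o": "\u00f4",   # o with circumflex
}


def convert(text, table=TEX_TO_UNICODE):
    """Replace every legacy sequence found in text, longest keys first."""
    keys = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            # No legacy sequence starts here: copy the character unchanged.
            out.append(text[i])
            i += 1
    # Normalize so that the result uses precomposed characters where possible.
    return unicodedata.normalize("NFC", "".join(out))
```

Trying the longest keys first matters once the table contains sequences
that are prefixes of one another, as the Greek schemes above do
(a vowel followed by several diacritic marks).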
What do other people do to solve this conversion problem?
--
Peter Heslin (http://www.dur.ac.uk/p.j.heslin)