[XeTeX] Western encoding and XeTeX
Jonathan Kew
jonathan_kew at sil.org
Tue Nov 13 21:45:41 CET 2007
On 13 Nov 2007, at 7:52 pm, Marcel Korpel wrote:
>
> When typing é and ü again, I noticed that the system expects Unicode
> characters. I ploughed through this mailing list and Google & Co., but
> couldn't find a useful solution. As a (former?) TeTeX user I use
> tex-text.tec to convert -- etc. to the correct 'ligatures'. I think
> that using TECkit it must be possible to convert é and ü etc. in the
> same way. Is there perhaps already a 'standard' solution
> (map/tec-file) to work around this problem?
You can't deal with these in the same way, as a "font mapping" within
xetex: if you have 8-bit Western Latin-1 character codes in your
file, and xetex tries to interpret the file as UTF-8, the original
character codes will be lost at input time, before any font mapping
is applied. (Really, it should probably stop with an error message in
this case, rather than simply discarding data.)
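To see why the Latin-1 codes cannot survive being read as UTF-8, here
is a minimal Python sketch (Python is used purely for illustration; it
is not part of xetex). It shows that the Latin-1 byte for é is not
valid UTF-8 on its own. Note that Python's strict decoder raises an
error at this point, whereas xetex, as described above, discards the
data.

```python
# U+00E9 is "é"; in Latin-1 it is stored as the single byte 0xE9.
latin1_bytes = "\u00e9".encode("latin-1")

# In UTF-8, 0xE9 is a lead byte that announces a three-byte sequence,
# so a lone 0xE9 is malformed and cannot be decoded.
try:
    latin1_bytes.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# The same character encoded as UTF-8 takes two bytes: 0xC3 0xA9.
utf8_bytes = "\u00e9".encode("utf-8")
```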
My strong recommendation would be to convert the text files to
Unicode (UTF-8), which can easily be done at the command line with
iconv:
    iconv -f latin1 -t utf8 oldfilename.tex > newfilename.tex
or similar. If you want to use such files with an 8-bit TeX, you can
just load the inputenc package with the [utf8] option; xetex will read
them directly with no special package.
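If iconv is not available on your system, the same conversion can be
sketched in Python. This is an illustrative equivalent, assuming the
input really is Latin-1; the function name latin1_to_utf8 and the
sample text are made up for the example, and the file names simply
mirror the iconv command above.

```python
import tempfile
from pathlib import Path


def latin1_to_utf8(src, dst):
    """Re-encode a Latin-1 text file as UTF-8 (hypothetical helper)."""
    # Work on raw bytes so that line endings are left untouched.
    text = Path(src).read_bytes().decode("latin-1")
    Path(dst).write_bytes(text.encode("utf-8"))


# Small demonstration on a throwaway file containing é and ü.
with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / "oldfilename.tex"
    new = Path(tmp) / "newfilename.tex"
    old.write_bytes(b"caf\xe9 \xfcber\n")  # Latin-1 encoded sample text
    latin1_to_utf8(old, new)
    converted = new.read_bytes()
```

Every byte value is a valid Latin-1 character, so the decode step
never fails; that also means it cannot detect a file that was not
Latin-1 to begin with, so check the result before discarding the
original.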
If you really don't want to convert your files to Unicode -- although
that is the One True Way of the future :) -- then you'd have to use
the (rather obscure) \XeTeXinputencoding command to tell xetex to
convert Latin-1 to Unicode as it reads the file. If you have
\XeTeXinputencoding "ISO-8859-1" at the beginning of your document,
or at least before any non-ASCII characters such as accented forms
occur, it should work.
Converting to Unicode throughout is the better option in most cases,
however.
JK