[XeTeX] Follow-up on CJK (Unicode) and XeTeX (xelatex)
Jonathan Kew
jonathan_kew at sil.org
Thu Feb 24 18:34:34 CET 2005
On 24 Feb 2005, at 5:00 pm, Roger Hart wrote:
> First, XeTeX is absolutely amazing in how easy it is to set up fonts.
> ..... Using XeTeX, simply by typing in a font name, it is possible to
> change, for example, to the 60,000+ character font Simsun (Foundry
> Extended), or to any other style of Chinese characters.
So nice to know that this works, even for people using completely
different fonts and scripts than those I work with; making standard
fonts easier to use was one of the goals of XeTeX. :-)
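
For anyone who hasn't tried it, selecting a font by name looks roughly
like the sketch below. It assumes a font called "SimSun" is installed
on your system (substitute whatever name your system reports), and the
macro name \zhfont is just something made up for the example.

```latex
% Sketch only: compile with xelatex; save the source file as UTF-8.
\documentclass{article}
\begin{document}
% XeTeX extends \font so that a quoted name selects an installed system font.
% "SimSun" is an assumed font name; \zhfont is a hypothetical macro name.
\font\zhfont="SimSun" at 12pt
{\zhfont 中文字体}
\end{document}
```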
> Unfortunately, the wonderful package ps4pdf does not work under
> xelatex. I assume it would not be hard to fix or find a work-around.
Am I right in thinking that this is a package to allow pdflatex to
include PS graphics, by converting them to pdf behind the scenes? If
so, it ought to be possible for one of the LaTeX experts out there to
figure out how to configure it to work with XeLaTeX, given that XeTeX
supports pdf graphics, and supports the \write18 mechanism to run shell
commands (provided you enable it in your configuration).
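
To illustrate just the mechanism I mean (not ps4pdf itself, which may
well do more than this), here is a rough sketch. It assumes shell
escape is enabled, that the epstopdf utility is on your path, and that
a file figure.eps exists; \includeeps is a hypothetical helper name,
not something from an existing package.

```latex
% Sketch only: needs shell escape enabled (e.g. xelatex -shell-escape).
\documentclass{article}
\usepackage{graphicx}

% Hypothetical helper: convert <name>.eps to <name>.pdf with epstopdf,
% then include the resulting PDF. #1 = optional \includegraphics keys,
% #2 = file name without extension.
\newcommand{\includeeps}[2][]{%
  \immediate\write18{epstopdf #2.eps}%
  \includegraphics[#1]{#2.pdf}%
}

\begin{document}
\includeeps[width=0.5\linewidth]{figure} % assumes figure.eps exists
\end{document}
```

The conversion runs on every pass here; a real package would presumably
check whether the PDF is already up to date.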
> I may have overlooked something simple here, but is there a way to make
> xelatex in Computer Modern (no roman font specified) recognize the
> Unicode characters that seem to work under pdflatex, such as em-dashes,
> smart quotes, German and French characters?
You mean the use of these characters as literal Unicode in the input
file? For that to work under standard LaTeX, I assume you use something
like \usepackage[utf8]{inputenc}, which would map the byte codes that
represent the Unicode characters in the input file onto LaTeX commands
to access the appropriate CM characters.
That won't work as it stands under XeLaTeX, because you don't use the
inputenc package to interpret UTF-8 byte sequences; XeTeX reads the
input as Unicode, so the em-dash, accented characters, etc., are simply
individual characters, just like the ASCII ones, and the CM fonts have
no glyphs at those character codes.
What you could do is make these characters \active, and \def them to
generate the appropriate output, e.g.:
\catcode`—=\active \def—{---} % em-dash
\catcode`ß=\active \defß{\ss{}} % eszett
\catcode`¿=\active \def¿{?`} % Spanish opening question mark
% ...etc... for as many Unicode characters as you want to support in
% CM output
This then becomes dependent on the CM font encoding. I'm sure there's a
True LaTeX Way to do it, which someone like Ross would know, so that it
interacts properly with the various legacy LaTeX font setups. But in
general, I'd suggest it's simplest to use a Unicode font for your body
text!
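
As a guess at what a more LaTeX-friendly version might look like (this
is my own sketch, not necessarily the True LaTeX Way), the active
characters could expand to LaTeX's encoding-independent text commands
instead of CM-specific ligatures, so the definitions no longer assume
one particular font encoding:

```latex
% Sketch only: compile with xelatex; save the source file as UTF-8.
\documentclass{article}

% Map each literal Unicode character to a standard LaTeX text command,
% which LaTeX resolves according to the font encoding in use.
\catcode`—=\active \def—{\textemdash}        % em-dash
\catcode`ß=\active \defß{\ss{}}              % eszett (already encoding-independent)
\catcode`¿=\active \def¿{\textquestiondown}  % Spanish opening question mark

\begin{document}
¿Straße — oder Weg?
\end{document}
```

Each character still has to be listed by hand, which is why a Unicode
font remains the simpler route.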
> Finally, if I might very humbly submit a request -- and I am truly
> humbled by the work you've done on XeTeX -- could you please make
> proper wrapping of CJK a priority for the next release? I think that
> line-wrapping is a very basic capability; it is easy for someone
> installing XeTeX, on finding CJK lines don't wrap, to just assume that
> XeTeX is not compatible with CJK.
I'm taking a look at the place where that code needs to be
inserted..... stay tuned for developments. ;-)
JK