[XeTeX] Issue with CJK in pdf build

Chris Jones cjns1989 at gmail.com
Thu Nov 19 05:20:36 CET 2009


On Wed, Nov 18, 2009 at 09:25:43PM EST, Wilfred van Rooijen wrote:

> There are many "tex-engines". The oldest is TeX itself. LaTeX is a
> "macro-package", which provides extra commands etc for TeX. But there
> are more TeX-engines, each with specific extra functionality (off the
> top of my head, there may be some inaccuracies):

> - etex: able to set text Right-to-Left as well as Left-to-Right
> - Omega: more internationalization support (Lambda is latex for omega)
> - pdftex: sets text directly into PDF, not the older DVI format of TeX
> (pdflatex is latex for pdftex)
> - ptex: a Japanese variant capable of setting top-to-bottom
> right-to-left as well as left-to-right top-to-bottom. As the old
> versions of TeX only supported 127 character alphabets, ptex is a
> different branch on the tex-tree to support the JIS-set of characters
> (6100+ characters)
> - several others
> - none of these flavors of tex are able to read UTF-8 or use Unicode
> fonts. Only xetex can do that.

> The moral of the story: in the Good Old Days, you could only use, say,
> Latin characters in a TeX document, and only with extra work was one
> capable of including CJK, Hindi, Urdu, Sanskrit, Thai or whatever. One
> document with Chinese and Latin was possible, but Chinese, Korean and
> Hindi in one document was out of the question, or required some
> "non-standard" packages and style files. Therefore the warning in the
> toolchain: use only one character set, otherwise your TeX will (likely)
> explode. I guess the maintainers of the toolchain should add some
> warnings about xetex and the possibilities of completely unlimited use
> of characters.

That makes sense. 
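
For reference, a minimal sketch of the kind of mixed-script source that
xetex is described as handling. It assumes the fontspec package and is
meant to be compiled with xelatex; the font names are placeholders chosen
for illustration, not anything the asciidoc/a2x toolchain prescribes, and
would need to be swapped for fonts actually installed.

```latex
% Sketch only: compile with xelatex, not pdflatex. File must be UTF-8.
\documentclass{article}
\usepackage{fontspec}
% Font names below are assumptions; substitute locally installed fonts.
\setmainfont{DejaVu Serif}
\newfontfamily\cjkfont{Noto Serif CJK SC}
\newfontfamily\hindifont[Script=Devanagari]{Noto Serif Devanagari}
\begin{document}
Latin text, {\cjkfont 中文} and {\hindifont हिन्दी} in one UTF-8 file.
\end{document}
```

Each script is switched to a font that has its glyphs, which sidesteps
the lack of a single "catch all" font.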

> > No..!! What I expected is that the _asciidoc/a2x tool chain_ would
> > know what to do. But on second thoughts, since my locale is set to
> > en_US.UTF-8 I have a feeling that's what it's doing: choosing a
> > font that has all the glyphs that would be of interest, including
> > arrows, box drawing, etc. on top of the ASCII range.

> There are not many fonts that actually have glyphs for *all* the
> characters presently defined in the Unicode tables. I think only
> code2000 comes close, and maybe Cyberbit. So at present, there is no
> "catch all" font.

> > In terms of functionality, probably. But then, is there an
> > alternative?

> I don't know what this toolchain does :-)). But given the tremendous
> number of text processors in the world, I'd say, yes, there is an
> alternative :-))

I wouldn't know of course, only that in the OSS world the tool chains
that I have seen use some form or other of TeX to build .pdf's.

CJ

