[XeTeX] Japanese Characters in PDF do not match those in source file.

Fri Aug 20 15:00:28 CEST 2010

On Fri, 20 Aug 2010 20:39:12 +0900, Andrew A. Adams wrote:

>>> I recently upgraded my Fedora Core 10 to Fedora Core 13. I'm getting a very 
>>> strange behaviour from processing latex files including Japanese text and 
>>> processed using xelatex. I've created a minimal input source file which 
>>> demonstrates the problem, which is that the unicode characters in the input 
>>> file are not the ones that appear in the output. It's possible that somehow 
>>> I'm getting Chinese characters instead of the Japanese ones in my original 
>>> file. I create my files in xemacs, and set the buffer encoding to UTF-8. I 
>>> use a script to process the file using xelatex with my default options:
>>> 
>>> xelatex -interaction=nonstopmode -output-driver="xdvipdfmx -p a4 -V5 " $1.tex 
>>> && acroread -tempFile $1.pdf

>>> Attached are the sample tex file, the resulting output file, the log file 
>>> from manual xelatex processing and the output from manual xdvipdfmx 
>>> processing.

>> Don't use inputenc with xelatex. Never! inputenc is meant for
>> 8-bit-machines. It breaks with xelatex. If your file is utf8 or
>> utf16 there is no need to declare the encoding. 

> inputenc is not the problem. 

On reflection, I would say that inputenc can't break Japanese, as
it doesn't reach that far. But it will break characters like the
German umlauts. So don't use it with xelatex. It will not solve any
problem. 
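
As a minimal sketch of a preamble without inputenc (the font name
IPAMincho, the document class and the sample text are assumptions for
illustration, not taken from the attached file):

```latex
% Save this file as UTF-8 and compile it with xelatex.
\documentclass{article}

% No \usepackage[utf8]{inputenc} here: xelatex reads UTF-8 natively.

\usepackage{fontspec}
% Assumed font name; use whatever name your system reports for the
% installed IPA Mincho font.
\setmainfont{IPAMincho}

\begin{document}
日本語のテキスト (Japanese), äöü (German umlauts)
\end{document}
```

Whether the umlauts appear depends on the glyphs the chosen font
provides.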

> It persists even when I take out that line, and 

I have now found a version of IPAmincho here
http://lx1.avasys.jp/OpenPrintingProject/openprinting-jp-0.1.3.tar.gz
and can't reproduce your problem. The glyphs look like the glyphs
shown by emacs (and differ from the glyphs in your pdf). I'm using
XeTeX, Version 3.1415926-2.2-0.999.7 (MiKTeX 2.7) here.

Put \XeTeXtracingfonts=1 in your document and call xelatex with the
option

--output-driver="xdvipdfmx -vv"

The first will write information about the fonts used by xetex to the
log, and the second will show you the fonts used by xdvipdfmx.
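
A small test file with the tracing switched on could look like the
sketch below. The file name test.tex, the font name IPAMincho and the
sample text are assumptions; placing the setting before the font is
loaded is my own choice, so that the font lookup happens after
tracing is enabled.

```latex
% Hypothetical test.tex; compile with
%   xelatex --output-driver="xdvipdfmx -vv" test.tex
\documentclass{article}

% Ask XeTeX to record in the log which font files it actually loads.
\XeTeXtracingfonts=1

\usepackage{fontspec}
\setmainfont{IPAMincho} % assumed name of the installed font

\begin{document}
日本語
\end{document}
```

Comparing the font file named in the log with the one reported by
xdvipdfmx should show whether both programs pick up the same font.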

> when I have the file encoded in UTF-8 (sorry for the UTF-16 version I posted 
> earlier).

utf-16 is not a problem: xetex and emacs can handle it. One only
needs to know that this is the encoding of the file. 

-- 
Ulrike Fischer