[XeTeX] Basic questions about CJK (Unicode) and xelatex

Jonathan Kew jonathan_kew at sil.org
Wed Feb 23 17:36:51 CET 2005

On 23 Feb 2005, at 3:32 pm, Roger Hart wrote:

> However, the problem comes when I have a paragraph (of more than one 
> line) of Chinese. Then I get strange error messages, and the text does 
> not wrap properly but extends off the page. If it is long enough, it 
> eventually does wrap to make a second line.

Aha.... OK, I understand about that. I thought your problem was getting 
Unicode CJK characters to render. This is a different issue.

The problem is that in a long run of Chinese text, there will typically 
not be any spaces. And spaces (or hyphens, or anything else that 
produces a TeX "discretionary" or suitable "glue" or "penalty") are 
needed for line-breaking to be possible. So you're right: simply giving 
XeTeX a paragraph of Chinese is likely to produce badly overfull 
lines.
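
To make that concrete, here is a minimal plain XeTeX sketch of the 
difference a breakpoint makes. The font name "STSong" and the helper 
\cjkbreak are assumptions for illustration only; substitute any CJK 
font you have installed.

```tex
% Minimal sketch for plain XeTeX (run with: xetex sample.tex).
% ASSUMPTION: "STSong" is an installed CJK font; use any font you have.
\font\cjkfont="STSong" at 12pt
\cjkfont
\hsize=10em      % narrow measure so that wrapping is easy to see
\parindent=0pt

% Hypothetical helper: zero-width glue with a little stretch.
% Glue that directly follows a character is a legal breakpoint.
\def\cjkbreak{\hskip 0pt plus 0.1em\relax}

% No spaces, glue or penalties: TeX has nowhere to break this run,
% so it is likely to be reported as an overfull \hbox.
中文中文中文中文中文中文中文中文中文中文中文中文\par

% A breakpoint after every ideograph: the paragraph can now wrap.
% The space after \cjkbreak only terminates the control-sequence name;
% it does not produce an inter-word space.
中\cjkbreak 文\cjkbreak 中\cjkbreak 文\cjkbreak
中\cjkbreak 文\cjkbreak 中\cjkbreak 文\cjkbreak
中\cjkbreak 文\cjkbreak 中\cjkbreak 文\cjkbreak
中\cjkbreak 文\cjkbreak 中\cjkbreak 文\cjkbreak
中\cjkbreak 文\cjkbreak 中\cjkbreak 文\cjkbreak
中\cjkbreak 文\cjkbreak 中\cjkbreak 文\par
\bye
```

The small stretch (plus 0.1em) is a design choice, not a requirement: 
it gives TeX some slack when justifying lines. A bare \penalty0 would 
also create a legal breakpoint, but without any stretch the lines only 
come out cleanly when the measure is an exact multiple of the 
character width.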

I'd guess that traditional (La)TeX CJK packages, which must be doing a 
lot of interpretation of the character coding in TeX macros, probably 
insert penalties between the characters so as to permit line breaks.

One thing on my "to-do" list for XeTeX is to add an option to 
automatically allow line-breaking within runs of CJK characters, 
similar to automatic hyphenation of English (but, naturally, without 
inserting hyphens!). If/when I get this done, it'll be easy to deal 
with text like this.

Meanwhile, a possible approach would be to make the CJK ideographs into 
"active" characters that insert skip and/or penalty items around 
themselves, to permit line-breaking. This isn't really a good long-term 
solution, but it may be useful for the time being. See the attached 
extended version of the CJK example.



Attachment: CJKsample.tex (application/x-tex, 3461 bytes)
http://tug.org/pipermail/xetex/attachments/20050223/1184033b/CJKsample.tex