[XeTeX] Japanese, Chinese, Korean support for Polyglossia

Gerrit z0idberg at gmx.de
Fri Jul 23 20:02:00 CEST 2010


Hello Philip,

What I meant by simplified Chinese being easy is this:

- it is likewise written only horizontally
- no ruby characters (at least as far as I know)
- uses Arabic digits (e.g. 2010年)

OK, maybe my axioms for Korean were not really that good.
The alphabet part is actually of no importance, because for the computer 
a Korean syllable is no different from a Chinese character (the Korean 
syllables are all pre-composed in Unicode, 11,172 in all).
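
To illustrate the pre-composed syllables, here is a minimal Python 3 sketch. It is not part of the original discussion, and the constant and function names are made up for this example. It counts the Unicode Hangul Syllables block (U+AC00 to U+D7A3) and checks that a syllable is one code point, just like a Chinese character.

```python
import unicodedata

# Hangul Syllables block in Unicode: every syllable is one pre-composed code point.
HANGUL_FIRST = 0xAC00  # 가
HANGUL_LAST = 0xD7A3   # 힣


def count_precomposed_hangul() -> int:
    """Number of code points in the Hangul Syllables block."""
    return HANGUL_LAST - HANGUL_FIRST + 1


def is_precomposed_hangul(ch: str) -> bool:
    """True if ch is a single code point inside the Hangul Syllables block."""
    return len(ch) == 1 and HANGUL_FIRST <= ord(ch) <= HANGUL_LAST


if __name__ == "__main__":
    print(count_precomposed_hangul())
    # Both the Korean syllable and the Chinese character are one code point each.
    for ch in "년年":
        print(ch, f"U+{ord(ch):04X}", unicodedata.name(ch))
```

The count also equals 19 initial consonants times 21 vowels times 28 finals (including "no final"), which is how the block is laid out.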

If you compare a Korean text and a Chinese text (here is some random text 
from Wikipedia), the differences are:

무안부에 속해있던 것이 일제강점기 시대에 목포부로 개칭, 도시 부분만 떨어 
져 나와 항구 도시로 급성장했다. 1897년에 개항하여 일본 제국 본토로의 곡 
물 수탈항으로서 기능하였다. 그러나 일본 제국이 태평양 전쟁에서 패배한 후 
시일이 지나면서 국가개발계획에 상당 부분 소외되어 지속적인 성장을 누릴 
수 없었다.


哪位奈及利亞中場代表球隊在2010年世界盃對希臘一戰因惡意踢向對方托路斯迪斯 
而被紅牌趕離場?

1. Korean uses word spacing
2. Korean uses Western punctuation (, and ., these are exactly the same 
characters as in a Western text)
3. Korean uses Arabic digits

In contrast, Chinese is written without spaces. It still often 
uses Arabic digits, but it has its own punctuation (, and 。 etc., which 
are full-width).
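
The punctuation difference can be checked with Python's standard `unicodedata` module. This is only a sketch added for illustration; the helper names `width_class` and `is_wide` are hypothetical. Note that Unicode classifies U+FF0C as "F" (fullwidth) and U+3002 as "W" (wide); both occupy a full character cell, which is what "full-width" means here.

```python
import unicodedata

# Western punctuation, as used in Korean text.
WESTERN = [",", "."]
# Chinese punctuation: U+FF0C FULLWIDTH COMMA and U+3002 IDEOGRAPHIC FULL STOP.
CJK = ["\uFF0C", "\u3002"]


def width_class(ch: str) -> str:
    """Unicode East Asian Width property of a single character."""
    return unicodedata.east_asian_width(ch)


def is_wide(ch: str) -> bool:
    """True if the character takes a full-width cell (class F or W)."""
    return width_class(ch) in ("F", "W")


if __name__ == "__main__":
    for ch in WESTERN + CJK:
        print(f"U+{ord(ch):04X}", width_class(ch), is_wide(ch))
```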

The actual characters are not a problem, I think. It makes no difference 
to the computer whether it processes a 년 or a 年.

I think the biggest problem would be the missing word spacing in 
Chinese. I only know about Taiwan, where a line is simply broken 
anywhere, without _any_ rule (a line can even start with 。). I am not 
sure about finer typography, though. Also, I don't know about mainland 
China. But the word spacing in Korean simplifies everything, 
because you can use almost the same justification method as for 
Western text.
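
The two line-breaking situations described above can be sketched in Python as follows. This is a toy model, not how XeTeX or Polyglossia actually break lines; the function names and the `forbid_line_start` parameter are assumptions made up for this example. The optional parameter shows how a stricter rule (no line starting with 。) could be layered on top of "break anywhere".

```python
def break_points_spaced(text: str) -> list[int]:
    """Korean/Western style: a line may end only at a space.

    Returns the indices of the space characters.
    """
    return [i for i, ch in enumerate(text) if ch == " "]


def break_points_anywhere(text: str, forbid_line_start: str = "") -> list[int]:
    """Chinese style as described above: break between any two characters.

    Returns every index i such that a new line may start at text[i].
    Characters listed in forbid_line_start (hypothetical parameter) may
    not begin a line, which models a stricter typographic rule.
    """
    return [i for i in range(1, len(text)) if text[i] not in forbid_line_start]


if __name__ == "__main__":
    print(break_points_spaced("일본 제국"))
    print(break_points_anywhere("年世界。"))
    print(break_points_anywhere("年世界。", forbid_line_start="。"))
```

With an empty `forbid_line_start` the second function reproduces the "no rule at all" behaviour; passing "。" removes the break that would put the full stop at the start of a line.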

Actually, maybe traditional Chinese would also be no problem (except for 
the missing word spacing), because it can be written just like 
simplified Chinese, only with other characters: horizontally, with Arabic 
digits, etc. Maybe the ruby feature is so unimportant for Chinese 
that it can be ignored. All the calendar stuff is also no 
problem, because dates can be written manually (I always wonder - who uses 
the \today method? I never use that, I always write the date manually).


Gerrit

On 24.07.2010 00:35, Philip Taylor (Webmaster, Ret'd) wrote:
> Not convinced about the last part, Gerrit : to the
> best of my knowledge, there are examples of Simplified
> Chinese that violate each of your axioms for Korean
> (with the possible exception of the punctuation),
> so I'm not really sure how you reached that conclusion.


