[XeTeX] Polyglossia: Support for romanization of CJK
Gerrit
z0idberg at gmx.de
Thu Jun 16 12:38:57 CEST 2011
On 16.06.2011 01:41, mskala at ansuz.sooke.bc.ca wrote:
> I thought the original poster was talking about segments of text written
> in romanized Japanese as the only script - not phonetic guide texts
> (furigana) attached to Japanese script, nor equivalents in other
> languages. The issues you describe are interesting for knowing how to
> break furigana when words are split at the end of a line, but I don't
> think they're relevant to the original poster's question; it sounded like
> they had a pretty clear, and simple, idea of the hyphenation they wanted
> to use for romanized text, and it would be a fair bit simpler than the
> existing algorithm currently used for languages like English.
>
> I'm not sure that romanized Japanese is used enough for texts of more than
> a few words, to justify a lot of development effort going into figuring
> out how to hyphenate it beyond the original poster's immediate
> application. Do any established standards or traditions exist for such
> hyphenation at all?
Hello,
Yes, that is exactly what I had in mind: a few romanized words inside a
Western-language text. For example, in a sentence like this: “A town
where many hot springs (/onsen/) are located is Beppu in Kyūshū.”
Here, you have three Japanese words in a text: onsen, Beppu, Kyūshū. The
hyphenation rules would be quite easy: on-sen, bep-pu, kyū-shū.
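To make the rule concrete, here is a rough Python sketch of how such
break points could be computed. It is only an illustration under my own
assumptions, not an established standard and not anything XeTeX or
Polyglossia provides: it assumes Hepburn-style romanization with
precomposed macron vowels, breaks before the consonant that opens a
syllable, and keeps a syllable-final n or the first half of a doubled
consonant on the left side. The names break_points and hyphenate_romaji
are made up for this example, and an apostrophe after n (as in kin'en)
is not handled.

```python
import unicodedata

# Vowels of Hepburn romanization, including long vowels with macron.
VOWELS = set("aeiouāēīōū")


def is_consonant(ch):
    """True for any letter that is not one of the romaji vowels."""
    return ch.isalpha() and ch not in VOWELS


def break_points(word):
    """Return the indices at which a hyphen may be inserted (hypothetical helper)."""
    w = unicodedata.normalize("NFC", word).lower()
    points = []
    # Look at every consonant that directly follows a vowel and still has
    # at least one character after it.
    for i in range(1, len(w) - 1):
        if w[i - 1] not in VOWELS or not is_consonant(w[i]):
            continue
        nxt = w[i + 1]
        # Syllable-final n: followed by another consonant other than y.
        moraic_n = w[i] == "n" and is_consonant(nxt) and nxt != "y"
        # Doubled consonant (pp, tt, ss, ...) or the Hepburn spelling tch.
        geminate = nxt == w[i] or w[i:i + 3] == "tch"
        # A closing consonant stays on the left; otherwise break before it.
        points.append(i + 1 if (moraic_n or geminate) else i)
    return points


def hyphenate_romaji(word, sep="-"):
    """Join the syllables of a romanized word with sep (hypothetical helper)."""
    w = unicodedata.normalize("NFC", word)
    cuts = [0] + break_points(w) + [len(w)]
    return sep.join(w[a:b] for a, b in zip(cuts, cuts[1:]))


for example in ("onsen", "Beppu", "Kyūshū"):
    print(hyphenate_romaji(example))
```

Calling hyphenate_romaji(word, sep="\\-") would instead produce the word
with TeX's discretionary hyphen \- at each break point, which could be
pasted into a document by hand. Real hyphenation patterns for TeX would
be the cleaner solution; the sketch is only meant to show that the rule
itself is small.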
Of course, you can do all of this manually, but if one writes a text in
Japanese studies or a similar field, romanized Japanese can occur quite
frequently. Also, if you list a Japanese book in a reference section,
you may need to give the complete title in romanization, where
hyphenation may be needed as well: “Wakabayashi
Masahiro: Taiwan - Henyō shi chūcho suru aidentiti.”
This is often not much of a problem, because many Japanese words are
not that long (though there are some longer ones!), but still, the
lines will not be filled as well as they could be.
Because the hyphenation rules seem to be very easy, I think it shouldn’t
be much of a problem to create patterns for them. I am not sure whether
standards or traditions exist for this, but I can imagine that they do.
Pinyin in particular has many rules, so I would expect it to have
hyphenation rules as well. Japanese, on the other hand, has a very
regular syllable structure, so I think hyphenation is straightforward
even in the absence of specific rules.
Furigana hyphenation etc. is an entirely different matter: for that, we
would first need furigana support, which seems to be very difficult (or
at least needs some work). But hyphenation of romanized text seems easy.
By the way, I think hyphenation of romanized text is needed not only
for Japanese or Chinese, but for other languages as well (Arabic,
Greek, Russian, Thai, etc.). For Japanese and Chinese the advantage is
that there is a universal romanization, whereas for other languages the
romanization seems to depend on the target language (e.g. Arabic
romanization for German). Still, there also seem to be scientific
romanizations that could serve as the standard. After all, I guess LaTeX
(and XeLaTeX) is most often used for Western scientific texts, which
often do include such romanized foreign terms (if they deal with these
areas).
Gerrit