[tex-hyphen] ptex-specific patterns

Mojca Miklavec mojca.miklavec.lists at gmail.com
Tue Jun 1 16:30:39 CEST 2010


Dear Tak Yato,

First of all, thanks a lot for your very valuable comments and
explanations; they made me understand how the system works in the
first place.

2010/6/1 Tak Yato (ZR) wrote:
> Hello Mojca,
>
> Unfortunately, you cannot make a Japanese character active,
> which makes conversion based on active character expansion
> nearly impossible.

Not just nearly: I would call it completely impossible. The only
option I see would be to read the input byte by byte and compare
every byte with a set of possible UTF-8 sequences.
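
A rough sketch of what such byte-by-byte matching would amount to,
written in Python rather than TeX macros purely for illustration. The
table and the function name are hypothetical; only the "č" entry
(C4 8D to A3) is taken from this thread, and a real table would need
one entry per accented letter.

```python
# Hypothetical lookup table: UTF-8 byte sequence -> single T1 byte.
# Only the "č" entry (C4 8D -> A3) comes from this thread.
UTF8_TO_T1 = {
    "č".encode("utf-8"): 0xA3,
}


def utf8_to_t1(data: bytes) -> bytes:
    """Walk the input byte by byte and replace known UTF-8 sequences."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] < 0x80:
            # Plain ASCII passes through unchanged.
            out.append(data[i])
            i += 1
            continue
        # Non-ASCII: try every possible multi-byte length against the table.
        for length in (2, 3, 4):
            seq = data[i:i + length]
            if seq in UTF8_TO_T1:
                out.append(UTF8_TO_T1[seq])
                i += length
                break
        else:
            raise ValueError(f"no T1 mapping for bytes at offset {i}")
    return bytes(out)


converted = utf8_to_t1("či".encode("utf-8"))
```

This only works if the bytes reach the converter one at a time, which
is exactly what cannot be guaranteed here.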

But I've given up that idea now anyway. If a character could be made
active, there would be some hope (doing that is possible in both
LuaTeX and XeTeX), but as it stands now it doesn't make any sense to
even try.

> For example, think of converting UTF-8 byte sequence <C4 8D 69>
> “či” to T1 <A3 69>; what pTeX in sjis mode would see is an
> 8-bit character <C4> followed by a Japanese character <8D69>
> (that is U+7D5E in Unicode).

OK, I see and I admit that I didn't think of that before. But since
the procedure is impossible in the first place, it's not worth losing
time on that at all.
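
For the record, the clash is easy to reproduce outside of TeX. The
following Python sketch is not pTeX; it assumes that Python's
"shift_jis" codec reads the pair <8D 69> the same way pTeX in sjis
mode does, and the function name is made up for illustration.

```python
def split_like_sjis(raw: bytes):
    """Split three bytes into a lone 8-bit byte plus a two-byte kanji.

    Mimics the reading described above: the first byte stays a single
    8-bit character, the remaining two are taken as one Shift-JIS pair.
    """
    lead, pair = raw[:1], raw[1:]
    return lead, pair.decode("shift_jis")


raw = "či".encode("utf-8")           # the three bytes C4 8D 69
lead, kanji = split_like_sjis(raw)
codepoint = f"U+{ord(kanji):04X}"    # Unicode code point of the kanji
```

Here `lead` is the single byte C4 and `kanji` is the character that
swallowed both the second half of "č" and the following "i".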

(What still makes me wonder: assuming that I do manage to get the
(Slovenian) patterns built in successfully, how am I ever going to
type any useful document in Slovenian if I cannot even type a rather
frequent sequence such as "či"?)

In any case - I have generated the following files:
http://tug.org/svn/texhyphen/branches/ptex/hyph-utf8/tex/generic/hyph-utf8/patterns/ptex/?pathrev=427
http://tug.org/svn/texhyphen/branches/ptex/hyph-utf8/tex/generic/hyph-utf8/loadhyph/?pathrev=427
If they are of any help, they may be incorporated. Karl prefers to
wait for some time. Personally I don't mind whether or when they are
used, but you are free to test them and/or use them if you want.

Thanks a lot,
   Mojca


