[tex-hyphen] ptex-specific patterns

Mojca Miklavec mojca.miklavec.lists at gmail.com
Mon May 31 17:37:24 CEST 2010


On Mon, May 31, 2010 at 17:20, Arthur Reutenauer wrote:
>
>  2. The input is “šč” (U+0161, U+010D).  It's reencoded as 0xB2, 0xA3
> in EC, which *is* a valid EUC-JP code (corresponding to Unicode
> character U+6A2A, as it is), hence that two-character sequence is
> interpreted as a single Japanese character, and the original input is
> simply lost.

There is one big difference. If you use T1 encoding and try to input
    (^^b2^^a3)
then you will get šč. If you try to use a raw byte string, for example
    (²£)
in Latin1 encoding then you will get
    ** ERROR ** Could not find encoding file "H".

This means that you *will* get the right hyphenation patterns if you
input them with ^^ab^^cd, even if 0xABCD represents a valid Japanese
character. I have just tried it out.
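
For anyone who wants to check the byte values without a TeX run, here
is a minimal Python sketch (not part of my test; the helper name
caret_notation is made up for illustration). It assumes Python's
"euc_jp" and "latin-1" codecs match what pTeX and a Latin-1 editor
would see. It shows that the two bytes 0xB2 0xA3 read as one Japanese
character in EUC-JP, as "²£" in Latin-1, and as plain ASCII once
written in ^^ notation.

```python
# Illustrative sketch only; caret_notation is a hypothetical helper,
# not part of pTeX or of any hyphenation tool.

def caret_notation(data: bytes) -> str:
    """Spell each byte in TeX's ^^xx form (two lowercase hex digits)."""
    return "".join(f"^^{b:02x}" for b in data)

# The T1 (EC) slots quoted above for š and č.
pair = bytes([0xB2, 0xA3])

as_euc_jp = pair.decode("euc_jp")    # raw bytes read as EUC-JP
as_latin1 = pair.decode("latin-1")   # raw bytes read as Latin-1
as_carets = caret_notation(pair)     # pure ASCII, nothing to misread

# Print code points rather than the characters themselves, so the
# output does not depend on the terminal's encoding.
print("EUC-JP :", [f"U+{ord(c):04X}" for c in as_euc_jp])
print("Latin-1:", [f"U+{ord(c):04X}" for c in as_latin1])
print("TeX    :", as_carets)
```

The ^^ form contains only ASCII characters, so there is no byte pair
left in the pattern file for a Japanese-aware reader to combine.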

What I still don't understand is the usefulness of pTeX as a
typesetting engine for some random European language, but as far as
hyphenation patterns are concerned, I see no serious limitation yet.

Mojca


