[tex-hyphen] Loading patterns twice, OT1 and apostrophe
Jonathan Kew
jonathan_kew at sil.org
Sat Jun 28 11:14:05 CEST 2008
On 28 Jun 2008, at 9:02 am, Taco Hoekwater wrote:
> Arthur Reutenauer wrote:
>>> I've been thinking: Perhaps the final solution is to do away with
>>> \lccode and \uccode completely and instead base the system on
>>> unicode properties?
>> You don't say :-)
>
> Well, there is a downside also: an interface to the Unicode properties
> would have to be written too, lest we lose flexibility. TeX users
> are used to being able to modify everything, so a static database
> won't do.
Operations such as case-folding must allow "tailoring" because the
properties in the UCD are defaults, not necessarily correct for every
language. (Consider the casing behavior of i in Turkish, to take one
well-known example.)
And we mustn't forget that users may need to provide properties for
PUA codepoints they're using, even if they don't normally need to
modify standard Unicode properties.
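To make the two requirements concrete, here is a minimal sketch in Python (not TeX or LuaTeX code) of a lowercase mapping that falls back to default Unicode behaviour but accepts per-language or per-document overrides. The class name `CaseTable` and the PUA codepoints U+E000/U+E001 are hypothetical, chosen purely for illustration; the Turkish mappings are the well-known dotted/dotless i pairs.

```python
class CaseTable:
    """Lowercase mapping: Unicode defaults plus user-supplied overrides."""

    def __init__(self, overrides=None):
        # overrides maps a single character to its tailored lowercase form
        self.overrides = dict(overrides or {})

    def lower(self, ch):
        # A tailored entry wins; otherwise fall back to the default
        # mapping that ships with the Python runtime.
        if ch in self.overrides:
            return self.overrides[ch]
        return ch.lower()

    def lower_text(self, text):
        return "".join(self.lower(ch) for ch in text)


# Untailored defaults.
default = CaseTable()

# Turkish tailoring: I -> dotless i (U+0131), dotted capital I (U+0130) -> i.
turkish = CaseTable({"I": "\u0131", "\u0130": "i"})

# Hypothetical PUA assignment: treat U+E001 as the capital of U+E000.
pua = CaseTable({"\uE001": "\uE000"})

print(default.lower_text("DIYARBAKIR"))
print(turkish.lower_text("DIYARBAKIR"))
```

The point of the design is only that the database is a default layer and never the last word: the same lookup serves both the "wrong for this language" case and the "not defined by Unicode at all" case.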
>
>
>> The irony here is that LuaTeX doesn't complain about duplicate
>> patterns anymore since the hyphenation-handling code moved over to
>> libHnj last October, and part 43 of the original TeX code disappeared
>> entirely; Taco, can you comment about that?
>
> I could have added such testing code, but it seemed a bit pointless.
> Duplicate patterns are harmless after all; they just waste a few CPU
> cycles.
There are two slightly different cases, and they might merit
different handling. Truly duplicated patterns
    a1b
    a1b
could be silently ignored as harmless, or perhaps a warning logged;
on the other hand, patterns that have the same sequence of letters
but different hyphenation weights
    a1b
    a2b
should probably be reported as "conflicting" rather than "duplicate".
(TeX does not currently distinguish between these two situations; it
just gives the "duplicate" error for both.)
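As a rough illustration of the distinction, the following Python sketch separates a pattern into its letter sequence and its inter-letter weights, then reports repeats as either duplicates or conflicts. It is not the logic of TeX or of LuaTeX's hyphenation code; the function names `parse_pattern` and `check_patterns` are made up for this example, and it assumes single-digit weights with every non-digit character (including the `.` word-boundary marker) treated as a letter.

```python
def parse_pattern(pattern):
    """Split a pattern such as 'a1b' into ('ab', (0, 1, 0))."""
    letters = []
    weights = [0]  # one weight slot before each letter and one after the last
    for ch in pattern:
        if ch in "0123456789":
            weights[-1] = int(ch)
        else:
            letters.append(ch)
            weights.append(0)
    return "".join(letters), tuple(weights)


def check_patterns(patterns):
    """Return (duplicates, conflicts) found in a list of patterns."""
    seen = {}  # letter sequence -> (first pattern seen, its weights)
    duplicates = []
    conflicts = []
    for pat in patterns:
        letters, weights = parse_pattern(pat)
        if letters not in seen:
            seen[letters] = (pat, weights)
        elif seen[letters][1] == weights:
            duplicates.append(pat)  # identical: harmless, maybe warn
        else:
            conflicts.append((seen[letters][0], pat))  # same letters, different weights
    return duplicates, conflicts


dups, confs = check_patterns(["a1b", "a1b", "a2b", ".ab1"])
print("duplicates:", dups)
print("conflicts:", confs)
```

With this split, a loader could choose to ignore or merely log the first list while treating the second as an error.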
JK