Petr Sojka sojka at fi.muni.cz
Fri Mar 18 20:26:50 CET 2016

On Thu, Mar 17, 2016 at 06:42:41PM +0000, Arthur Reutenauer wrote:
>   I did moot the idea of enhancing the hyphenation algorithm with an
> equivalence table (I came across several use cases that would benefit
> from that), but someone would need to work on it, of course.
This idea, probably first articulated more than two decades ago
at EuroTeX 1995 (Arnhem), could be implemented in any e-TeX-capable
engine using e-TeX's \savinghyphcodes:
\begingroup \savinghyphcodes1
\lccode`\é=`\e % setting the equivalence table using \lccode,
               % mapping `\é to canonical character `\e
% input hyphenation patterns here
\endgroup      % \lccode is restored; the saved hyphenation codes remain
I doubt it works in XeTeX when more than 256 character
equivalence classes are needed, as for full Unicode, though.
But it could be used for any set of languages/dialects
with fewer than 256 characters/classes used in hyphenated words.
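
A slightly fuller sketch of such a setup for an 8-bit e-TeX engine,
folding several accented letters into one class. It assumes the
T1/Latin-1 slot numbers for the accented letters and is meant to be
read by iniTeX while a plain-based format is being built; the language
name \demolang and the file demo-patterns.tex are hypothetical:

```tex
% Fragment for a pattern-loading file (iniTeX, e-TeX mode, after plain.tex).
\newlanguage\demolang        % hypothetical language register
\begingroup
  \language=\demolang        % patterns read below belong to this language
  \savinghyphcodes=1         % save current \lccode values with the patterns
  \lccode"E9=`e              % e-acute      -> class of e
  \lccode"E8=`e              % e-grave      -> class of e
  \lccode"EA=`e              % e-circumflex -> class of e
  \lccode"EB=`e              % e-diaeresis  -> class of e
  \input demo-patterns.tex   % hypothetical file containing \patterns{...}
\endgroup                    % local \lccode changes are undone here
```

Here four accented letters share one class with e, so the patterns
need to mention only e. Uppercase variants would presumably need the
same treatment.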
All the best,
> > although only the ones with oxia would be needed for
> > "properly encoded" classical greek
>   Sorry, no.  Ancient Greek is no more "properly encoded" using the
> characters in the U+1F00 range than with the ones in the block starting
> at U+0370.  According to everything I've heard on the subject, encoding
> two series of characters for that diacritic, whatever we call it, was
> simply a mistake and they should really be considered completely
> equivalent (that's why they're canonically equivalent in Unicode, of
> course).
> 	Best,
> 		Arthur

