[XeTeX] XeTeX, ConTeXt, and utf-8 hyphenation patterns.
pj at heslin.eclipse.co.uk
Tue Jun 13 00:11:07 CEST 2006
A little while ago, I said that I hoped to convert Dimitrios Filippou's
ancient Greek hyphenation patterns (the elhyphen package) to utf-8, in
order to use them with XeTeX. Before starting this work, I looked to see
whether anyone else had already done it, and I came across something
interesting in ConTeXt, which is not a package I normally use.
There appears to be a whole subdirectory in the ConTeXt distribution
that is full of utf-8 hyphenation patterns, including Filippou's ancient
Greek ones, but also including German, French, etc. They are in the
file: http://www.pragma-ade.com/context/current/cont-tmf.zip, in the
Can anyone who knows about ConTeXt explain where these patterns come
from and how ConTeXt manages to use them? (I thought that non-XeTeX TeX
could only use single-byte encoded patterns.)
If a script was used to convert these from the source to utf-8, is it
available? A quick glance at the ancient Greek patterns (in the file
lang-agr.pat) shows that there is a bug in the conversion that I'd like
to report and fix.
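To illustrate the kind of conversion involved, here is a minimal sketch in
Python. It is not the script used for the ConTeXt files. The mapping table
LEGACY_TO_UNICODE and the function convert_pattern are hypothetical names,
and the table is only a small illustrative fragment; a real one would have
to be taken from the encoding the legacy patterns were written for.

```python
# Hypothetical, partial table mapping a Latin transliteration to Greek
# letters. This is an assumption for illustration only; the real table
# must come from the encoding used by the legacy pattern file.
LEGACY_TO_UNICODE = {
    "a": "α",
    "b": "β",
    "g": "γ",
    "d": "δ",
    "e": "ε",
}


def convert_pattern(pattern, table=LEGACY_TO_UNICODE):
    """Convert one bare hyphenation pattern to Unicode text.

    Characters found in the table are replaced; everything else
    (digits, the word-boundary dot, unmapped letters) passes through.
    """
    return "".join(table.get(ch, ch) for ch in pattern)


def convert_patterns(patterns, table=LEGACY_TO_UNICODE):
    """Convert a list of bare patterns, preserving their order."""
    return [convert_pattern(p, table) for p in patterns]
```

This handles only bare patterns. A real pattern file also contains TeX
control sequences and comments, which would have to be left untouched, and
accents or breathings written as multi-character sequences would need a
longest-match lookup rather than a character-by-character one.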
On a more general level, if both ConTeXt and XeTeX are engaged in
converting legacy TeX hyphenation patterns to utf-8, should the two
efforts be coordinated in order to avoid duplication?
Peter Heslin (http://www.dur.ac.uk/p.j.heslin)