[XeTeX] XeTeX hyphenation support for supplementary chars?
Jonathan Kew
jonathan_kew at sil.org
Fri May 11 22:53:03 CEST 2007
On 11 May 2007, at 8:26 pm, Kenneth Reid Beesley wrote:
> "XeTeX, the Multilingual Lion: TeX meets Unicode and smart font
> technologies", Jonathan Kew, TUGboat, Vol. 25 (2005), No. 2.
>
> "Hyphenation support: Along with other character-code-oriented parts
> of TeX, the hyphenation tables in XeTeX have been extended to support
> 16-bit Unicode characters. This means that it is possible to write
> hyphenation patterns that use any (Plane 0) Unicode letters, including
> non-Latin scripts as well as extended Latin (accented characters,
> etc.)"
>
> Taken at face value, these statements would seem to indicate that
> one cannot define \lccodes for Deseret Alphabet characters (there is
> an uppercase/lowercase distinction in this alphabet) and that one
> cannot define hyphenation tables over supplementary characters.
You are correct.
> Am I stuck? or am I missing something?
These statements are accurate as of XeTeX 0.996, the latest released
version, and so you are currently stuck.
However, this has been changed for version 0.997, currently in
development. While that has not yet been released, the extension to
full Unicode support is present in the 0.997-dev version that you get
if you build from the Subversion repository at
<http://scripts.sil.org/svn/xetex/TRUNK/>.
So you will be able to do this once 0.997 is released, or if you
build from source in the meantime. (Actually, I haven't tested
supplementary-plane hyphenation patterns yet; I'd better do that
before releasing the new version! Please let me know if you do try
this.)
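
To make the limitation concrete, here is a minimal sketch (in Python, purely as an illustration) of why a 16-bit table cannot index Deseret letters, and of the \lccode assignments one would presumably want once code points above "FFFF are accepted. The Deseret ranges come from the Unicode Standard; the function names are hypothetical, and whether a given XeTeX build accepts the generated lines is an assumption, not something tested here.

```python
# Deseret block per the Unicode Standard: capital letters U+10400..U+10427,
# small letters U+10428..U+1044F (each small letter = capital + 0x28).
DESERET_UPPER_FIRST = 0x10400
DESERET_UPPER_LAST = 0x10427
CASE_OFFSET = 0x28


def fits_in_16_bits(code_point):
    """True if the code point can index a table limited to Plane 0."""
    return code_point <= 0xFFFF


def lccode_lines():
    """Build TeX \\lccode assignments for every Deseret case pair.

    Assumption: the engine accepts character codes above "FFFF.
    Each capital maps to its small letter; each small letter maps to
    itself so that TeX treats it as a letter for hyphenation.
    """
    lines = []
    for upper in range(DESERET_UPPER_FIRST, DESERET_UPPER_LAST + 1):
        lower = upper + CASE_OFFSET
        lines.append(f'\\lccode"{upper:X}="{lower:X}')
        lines.append(f'\\lccode"{lower:X}="{lower:X}')
    return lines


if __name__ == "__main__":
    print(fits_in_16_bits(DESERET_UPPER_FIRST))
    print("\n".join(lccode_lines()))
```

The generated lines could be pasted into a preamble or a hyphenation setup file; the pattern file itself would still have to be written by hand.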
I can think of a possible workaround, if you're not ready to compile
xetex from source: create a font that encodes the Deseret alphabet in
the Plane 0 Private Use Area, and load this font with a font mapping
that converts the true Plane 1 values in your data to the PUA codes.
Then you will be able to define hyphenation patterns in terms of the
PUA codes you're using, even though your actual text remains
correctly encoded in Plane 1. It's a hack, but I believe it should
work. (Untested.)
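
The arithmetic behind this workaround can be sketched as follows (again in Python, only as an illustration). In practice the conversion would be done by the font mapping at typesetting time, not by a script; the choice of U+E000 as the PUA base and all function names are assumptions made for this example, and the font would have to place its glyphs at the same PUA codes.

```python
PLANE1_BASE = 0x10400  # first Deseret code point (Unicode Standard)
BLOCK_SIZE = 0x50      # 80 letters: U+10400..U+1044F
PUA_BASE = 0xE000      # hypothetical choice within the Plane 0 PUA


def to_pua(code_point):
    """Map a Deseret code point to its PUA stand-in; leave others alone."""
    if PLANE1_BASE <= code_point < PLANE1_BASE + BLOCK_SIZE:
        return PUA_BASE + (code_point - PLANE1_BASE)
    return code_point


def remap_text(text):
    """Apply to_pua to every character, mimicking what the mapping does."""
    return "".join(chr(to_pua(ord(ch))) for ch in text)


def mapping_table():
    """List (Plane 1 code, PUA code) pairs for the whole Deseret block."""
    return [(cp, to_pua(cp))
            for cp in range(PLANE1_BASE, PLANE1_BASE + BLOCK_SIZE)]


if __name__ == "__main__":
    for plane1, pua in mapping_table():
        print(f"U+{plane1:05X} -> U+{pua:04X}")
```

Hyphenation patterns and \lccode values would then be written against the PUA codes (U+E000..U+E04F under the assumption above), all of which fit in 16 bits.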
JK