[XeTeX] [tex-hyphen] Hyphenation of polytonic Greek (expressed in Unicode)

Khaled Hosny khaledhosny at eglug.org
Fri Sep 13 01:35:26 CEST 2013


On Thu, Sep 12, 2013 at 07:20:30PM -0400, Mike Maxwell wrote:
> In general, word breaking in scripts that don't indicate word
> boundaries is a partly unsolved research problem in computational
> linguistics--and from what I've heard, native speakers often
> disagree.  (If you think that's odd, you might consider 'doghouse'
> vs. 'dog house' in English...)  So I suppose it's not surprising if
> this doesn't work as well in XeTeX as one might hope.

As I said, this word breaking is all handled by ICU (or by Graphite, for
Graphite fonts). The documentation was not very clear the last time I
looked into it, though it is not something I fully understand anyway:
http://userguide.icu-project.org/boundaryanalysis
http://www.icu-project.org/apiref/icu4c/classicu_1_1BreakIterator.html
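
For a rough idea of what a break iterator does, here is a minimal sketch.
It is an assumption-laden stand-in: it uses the JDK's
java.text.BreakIterator, which exposes a similar interface to ICU's
BreakIterator, rather than ICU itself, and it says nothing about how XeTeX
actually calls ICU. The class name LineBreakSketch and the helper
lineBreakOffsets are made up for illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration; not XeTeX's code and not ICU's own class.
class LineBreakSketch {

    // Collect every boundary offset the line-break iterator reports,
    // including the start (0) and the end of the text.
    static List<Integer> lineBreakOffsets(String text, Locale locale) {
        BreakIterator it = BreakIterator.getLineInstance(locale);
        it.setText(text);
        List<Integer> offsets = new ArrayList<>();
        for (int pos = it.first(); pos != BreakIterator.DONE; pos = it.next()) {
            offsets.add(pos);
        }
        return offsets;
    }

    public static void main(String[] args) {
        // With a space, a break opportunity is reported after the space;
        // without one, only the start and end offsets come back.
        System.out.println(lineBreakOffsets("dog house", Locale.ENGLISH));
        System.out.println(lineBreakOffsets("doghouse", Locale.ENGLISH));
    }
}
```

For scripts written without spaces, whether an iterator finds any break
opportunities inside a run of text depends on the dictionary data behind
it, which may be where the quality problems mentioned above come from.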

Regards,
Khaled

