[XeTeX] \hyphenation{} and combining diacritics

Joshua and Amy josh.ruthamy at gmail.com
Fri Jul 8 22:50:07 CEST 2011


So, I guess I was foolish to hope that Google had figured out how to return
results for non-identical but equivalent strings?

I hope it's not too off-topic for this list, but can you point me to any
good resources on normalization? Is there a straightforward way to automate
it for someone who doesn't do scripting? And am I supposed to use decomposed
characters?

Thanks.

Josh

On Fri, Jul 8, 2011 at 3:11 PM, maxwell <maxwell at umiacs.umd.edu> wrote:

> On Fri, 8 Jul 2011 15:00:42 -0500, Joshua and Amy <josh.ruthamy at gmail.com>
> wrote:
> > I'm creating some hyphenation rules for Jarai texts that I'm
> > interlinearizing. Here's the problem: In various texts, a complex
> > character such as LATIN SMALL LETTER A WITH BREVE might be encoded as a
> > single code point (U+0103) or as a combination of code points (LATIN
> > SMALL LETTER A: U+0061 plus COMBINING BREVE: U+0306).
>
> Can't (shouldn't!) you pass your texts through a Unicode normalization
> process?  Otherwise search on them might not work either, depending on how
> smart your search tool is.
>
>   Mike Maxwell
>
>
> --------------------------------------------------
> Subscriptions, Archive, and List information, etc.:
>  http://tug.org/mailman/listinfo/xetex
>
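
For illustration, the normalization step suggested above could look roughly
like the following minimal sketch. It uses the unicodedata module from
Python's standard library. The helper names normalize_text and
normalize_file are hypothetical, and the choice of NFC (composed) rather
than NFD (decomposed) is only an assumption; the thread does not settle
which form is appropriate for these texts.

```python
import unicodedata


def normalize_text(text, form="NFC"):
    """Return text in the given Unicode normalization form.

    NFC prefers precomposed characters; NFD prefers base letter plus
    combining marks. Defaulting to NFC is an assumption for this sketch.
    """
    return unicodedata.normalize(form, text)


def normalize_file(src_path, dst_path, form="NFC"):
    """Hypothetical helper: write a normalized copy of a UTF-8 text file."""
    # newline="" keeps the original line endings untouched.
    with open(src_path, encoding="utf-8", newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(normalize_text(text, form))


# The two encodings from the original question:
precomposed = "\u0103"   # LATIN SMALL LETTER A WITH BREVE
decomposed = "a\u0306"   # LATIN SMALL LETTER A + COMBINING BREVE

# They render the same but compare unequal as raw strings.
assert precomposed != decomposed

# After normalization to a common form they compare equal.
assert normalize_text(decomposed, "NFC") == precomposed
assert normalize_text(precomposed, "NFD") == decomposed
```

Whichever form is chosen, the point is to apply the same one to both the
texts and the \hyphenation{} patterns, so that the strings being compared
use identical code point sequences.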