# [XeTeX] [tex-hyphen] Hyphenating words with apostrophes in French

Mojca Miklavec mojca.miklavec.lists at gmail.com
Mon Nov 17 16:27:41 CET 2008

On Mon, Nov 17, 2008 at 3:39 PM, Jonathan Kew wrote:
> On 17 Nov 2008, at 14:06, Manuel Pégourié-Gonnard wrote:
>
>> Hi,
>>
>> (X-posting on the XeTeX and hyphenation mailing lists)
>>
>> I'm under the impression that words following an apostrophe cannot be
>> hyphenated in French using XeLaTeX, xltxtra and babel with appropriate
>> hyphenation patterns. In the following example, hyphenation is correct
>> using pdftex, but under XeTeX the first paragraph has an overfull \hbox.
>
> You're right, this is a problem. The reason is that the Unicode
> right-single-quote (or apostrophe) character U+2019 has not been given a
> non-zero \lccode, and therefore it is considered a nonletter by the
> hyphenation routine.
>
> If you put
>
>  \lccode"2019="2019
>
> into your document, you should get the hyphenation you expect.
>
> I'm not sure whether this should be built into the xe(la)tex format,
> controlled by babel or polyglossia depending on the language in use, or
> what.... suggestions are welcome.

In some languages that character is never treated as a letter, so
hardcoding it into the format would be a bad idea in my opinion. Maybe
babel/polyglossia is the proper place to solve that.
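
A minimal sketch of what a language-scoped setting might look like in a
document preamble, using babel's \addto with the \extrasfrench and
\noextrasfrench hooks. It assumes the language is loaded under the name
"french" and that the \lccode was 0 beforehand, as Jonathan describes.
I have not tested it.

```tex
% Sketch only, XeTeX only ("2019 is out of range for 8-bit engines).
% While French is active, make U+2019 count as a letter.
\addto\extrasfrench{\lccode"2019="2019 }
% Assumption: the previous value was 0 (nonletter), so restore that.
\addto\noextrasfrench{\lccode"2019=0 }
```

This keeps the change out of the format and limits it to French text,
but it does nothing about the missing patterns.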

However, one still needs to solve the basic problem: there are
currently no hyphenation patterns defined for the "2019 apostrophe. We
started the discussion this spring/summer, but never reached a
conclusion (or rather: nobody took up the challenge and implemented it).

Duplicating patterns is doable (if you tell me that I should do it, I
will implement it), but seems like yet another ugly hack to me. One
thing that I would still consider clean is replacing every "27 with
"2019 in the patterns themselves and reading the input properly in
8-bit engines. But that's not the final answer, since people might
just as well complain that "27 doesn't hyphenate properly in XeTeX.

I would *much* prefer telling the hyphenation engine that "2019
should be treated identically to "27, if that were possible.
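
If I read the hyphenation routine correctly, it looks up each letter of
a word by its \lccode rather than by the character itself, so something
like the following might already express that. This is an untested
sketch. It assumes the loaded patterns contain "27, and it has the side
effect that \lowercase would turn U+2019 into "27.

```tex
% Sketch only (XeTeX): let U+2019 be looked up as "27 in the patterns.
\lccode"27="27    % keep the ASCII apostrophe a letter as well
\lccode"2019="27  % hyphenate U+2019 using the patterns written for "27
% Side effect: \lowercase{’} would now yield ', which may be unwanted.
```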

What is the state of the "\savinghyphcodes" suggestion?

Mojca