[XeTeX] Strange hyphenation with polyglossia in French

Philip Taylor (Webmaster, Ret'd) P.Taylor at Rhul.Ac.Uk
Wed Oct 20 09:47:32 CEST 2010

Khaled Hosny wrote:

> On Wed, Oct 20, 2010 at 12:21:12AM +0200, Mojca Miklavec wrote:

>> Arthur also reminded me that one might want to treat scedilla and
>> scommaaccent as equivalent characters for Romanian,

> Lately, I've been told that Romanians are now strongly against this
> scedilla=scommaaccent thing, regarding it as a legacy artifact, and that
> continuing to support it is causing more harm than good.

I don't think it's up to us (the TeX/XeTeX/LuaTeX/hyphenation community)
to police such things: if Unicode implements this equivalence relationship,
then so should we.  Part of the specification currently reads:

> In Turkish and Romanian, a cedilla and a comma below sometimes replace one another
> depending on the font style, as shown in example 4 in Figure 7-1. The form with the cedilla
> is preferred in Turkish, and the form with the comma below is preferred in Romanian. The
> characters with explicit commas below are provided to permit the distinction from characters
> with a cedilla. Legacy encodings for these characters contain only a single form of each
> of these characters. ISO/IEC 8859-2 maps these to the form with the cedilla, while ISO/IEC
> 8859-16 maps them to the form with the comma below. Migrating Romanian 8-bit data to
> Unicode should be done with care.

> In general, characters with cedillas or ogoneks below are subject to variable typographical
> usage, depending on the availability and quality of fonts used, the technology, and the geographic
> area. Various hooks, commas, and squiggles may be substituted for the nominal
> forms of these diacritics below, and even the directions of the hooks may be reversed.
> Implementers should become familiar with particular typographical traditions before
> assuming that characters are missing or are wrongly represented in the code charts in the
> Unicode Standard.
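
Whether Unicode treats the two forms as equivalent can be checked
mechanically. The following is a minimal Python sketch, using only the
standard `unicodedata` module, that compares s-cedilla (U+015F) with
s-comma-below (U+0219) under each normalization form. The helper names
`same_after` and `fold_romanian` and the table `ROMANIAN_FOLD` are my own
illustrations, not part of any TeX or hyphenation package. The folding
table is one assumed way a Romanian-only tool could treat the two as
equal if it chose to.

```python
import unicodedata

S_CEDILLA = "\u015F"      # LATIN SMALL LETTER S WITH CEDILLA
S_COMMA_BELOW = "\u0219"  # LATIN SMALL LETTER S WITH COMMA BELOW


def same_after(form, a, b):
    """Return True if a and b are identical after normalizing to `form`."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# Report, for each normalization form, whether the two characters coincide.
for form in ("NFC", "NFD", "NFKC", "NFKD"):
    print(form, same_after(form, S_CEDILLA, S_COMMA_BELOW))

# Hypothetical folding table (an assumption for illustration): map the
# cedilla forms of s and t onto the comma-below forms preferred in Romanian.
ROMANIAN_FOLD = str.maketrans({
    "\u015E": "\u0218",  # S with cedilla -> S with comma below
    "\u015F": "\u0219",  # s with cedilla -> s with comma below
    "\u0162": "\u021A",  # T with cedilla -> T with comma below
    "\u0163": "\u021B",  # t with cedilla -> t with comma below
})


def fold_romanian(text):
    """Replace cedilla forms with comma-below forms; leave the rest alone."""
    return text.translate(ROMANIAN_FOLD)
```

The two characters decompose to different combining marks (U+0327 versus
U+0326), so the normalization forms keep them distinct. Any equivalence
for Romanian would therefore have to be an explicit, language-specific
folding of the kind sketched in `fold_romanian`, and it should not be
applied to Turkish text, where the cedilla form is the preferred one.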

** Phil.
