[XeTeX] Strange hyphenation with polyglossia in French
cyril.niklaus at gmail.com
Sat Oct 16 15:21:08 CEST 2010
On 16 oct. 2010, at 19:12, Paul Isambert wrote:
> That's absolutely normal, that's even the reason why we use TeX :)
> TeX builds a paragraph as a whole; if you remove some words at the end of your paragraph, it might change its entire shape.
I sorta knew that at a certain point in time… but it has obviously not registered, since I was surprised by that behaviour!
>> I also noticed that including or not
>> changes things quite a bit while \frenchspacing did nothing obvious. I thought it would deal with spaces around the guillemets etc. but no. I'm wondering why I bothered including it. Is that a benefit from polyglossia?
> \frenchspacing has nothing to do with polyglossia,
I was not clear. I was wondering if somehow, although that would have been surprising, polyglossia dealt with spaces around punctuation marks.
> and it is extremely important, even though you might not notice at once. It's a macro inherited from plain TeX, whose effect is to disable extra space after strong punctuation marks (e.g. a period), which extra space is used in (some flavors of) English typography. So keep it, although indeed it doesn't deal with space around guillemets.
Thank you for that clarification.
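A minimal sketch of the effect Paul describes, assuming the standard article class with no language package loaded (so LaTeX's default \nonfrenchspacing is in force at the start). The sample sentences are made up for illustration:

```latex
\documentclass{article}
\begin{document}
% LaTeX default: extra space after sentence-ending punctuation.
\nonfrenchspacing
Premi\`ere phrase. Deuxi\`eme phrase. Troisi\`eme phrase.

% \frenchspacing: the space after a period is an ordinary interword space.
\frenchspacing
Premi\`ere phrase. Deuxi\`eme phrase. Troisi\`eme phrase.
\end{document}
```

The difference is small and easiest to see by comparing the two output lines side by side. Neither setting touches the spacing around guillemets.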
On 16 oct. 2010, at 19:44, enrico.gregorio at univr.it wrote:
> The "Mapping=tex-text" option makes available all usual TeX ligature
> conventions (?` for the inverted question mark, --- for a dash and so on).
which is why I was surprised to see the aspect of that little paragraph change so much, since it does not have any particular ligatures or dashes etc. Or so I thought until I realised it was all the straight apostrophes that were curling up.
> It's quite subtle, I believe. There are no patterns containing U+2019 (RIGHT
> SINGLE QUOTATION MARK), into which each apostrophe is changed by
> tex-text.map; so the pattern "1informat" comes into play, creating a hyphenation
> point in "l'information" just after the character U+2019.
> Indeed, also "l'alcool" gets hyphenated as "l’-al-cool", as there is the pattern "1alcool"
> on line 126 of hyph-fr.tex
> This is a problem which should be examined by the "hyphenation pattern team":
> all patterns containing the apostrophe should be duplicated with U+2019 in its place.
> It may show its effects also in Italian and all other languages where the apostrophe
> gets a nonzero \lccode for hyphenation purposes.
On 16 oct. 2010, at 20:42, Mojca Miklavec wrote:
> what you
> observe is a "known problem that needs a nice idea to solve it" (or we
> can simply create and load another bunch of patterns) and it's present
> in both XeTeX and LuaTeX (only that it's mapped to quotation mark in
> We would need to double all the hyphenation patterns to account for
> that case (including both apostrophe and quotation marks). An
> alternative would be to "explain to engine" that two characters
> hyphenate in exactly the same way. The latter is possible, but we
> never (managed to) implement it. It might be as simple as one line of
> code though ...
OK, so I understand the nature of the problem now, thanks to all of you.
As much as I would like to find that one line of code, my coding skills are unfortunately nonexistent, and I could never produce what the great minds on this list have made. If I somehow reach illumination and find a way to deal with this, I will of course let you know.
On 16 oct. 2010, at 20:57, Jonathan Kew wrote:
> Would setting
> \lccode "2019 = "27
> be any help?
I do have it in the document preamble, to no effect (with straight or curled apostrophes).
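For reference, a sketch of where that assignment sits in a XeLaTeX preamble. The font name and the sample sentence are assumptions chosen for illustration, not the actual test file:

```latex
\documentclass{article}
\usepackage{fontspec}
\usepackage{polyglossia}
\setmainlanguage{french}
% Assumed font; any installed OpenType font could be substituted.
\setmainfont[Mapping=tex-text]{Latin Modern Roman}
% Jonathan's suggestion: intended to make U+2019 be looked up
% in the hyphenation patterns as if it were U+0027.
\lccode"2019="27
\begin{document}
L'information et l'alcool.
\end{document}
```

Compiled with xelatex, this is the arrangement that produced no visible change here.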
In the meantime, the "solution" I used was to change fonts…