[XeTeX] Strange hyphenation with polyglossia in French

Mojca Miklavec mojca.miklavec.lists at gmail.com
Thu Oct 21 18:17:11 CEST 2010


On Thu, Oct 21, 2010 at 17:45, Philip Taylor (Webmaster, Ret'd) wrote:
>
> Barry MacKichan wrote:
>>
>> Weren't these called 'code pages'?
>
> Not unless you know something I don't, Barry
> (which is more than probable !).
>
> A document written in code page X could not
> be differentiated from a document written in
> code page Y as far as I know, whereas in my
> putative "Omni-code [tm]" the code page would
> be implicit in the encoding of each character.

And how would you type such text? How would you trigger language
change? How would you treat words where each character comes from a
different language (in a text that somebody else sends you)? Does my
name change when I move from Slovenia to the UK? What language tag would
you use to type my name? And what about my cousin who lives in
another country? Would her family name have a different language tag?
What language tag would you use to write "François" in English texts?
Or would you just switch to French to write the letter ç and type
everything else in English? How would you do the conversion from old
texts into your new, perfect encoding? What language tag would you use
to type Montenegrin? Would you change the plane every few years when
the country splits into smaller pieces? And then somebody would
copy-paste some parts of text and then the result would have 30% of
text in Serbo-Croatian, 40% would use Serbian, 25% would use
Montenegrin, ... And of course Google would know exactly what language
you are searching in, so it would not return any results if the
original webpage used a different language plane for its product
descriptions (for example Australian English instead of British
English).
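
To make the search problem concrete, here is a minimal sketch in Python.
It assumes a purely hypothetical encoding in which every character
carries its language plane; TaggedChar, tag and find are names invented
for this illustration, not part of any real encoding or library.

```python
from typing import List, NamedTuple


class TaggedChar(NamedTuple):
    """One character in the hypothetical language-tagged encoding."""
    lang: str  # assumed language plane, e.g. "en-GB"
    char: str


def tag(text: str, lang: str) -> List[TaggedChar]:
    """Encode text with every character placed in the same plane."""
    return [TaggedChar(lang, c) for c in text]


def find(haystack: List[TaggedChar], needle: List[TaggedChar]) -> int:
    """Naive substring search; a character matches only if plane and letter agree."""
    n = len(needle)
    for i in range(len(haystack) - n + 1):
        if haystack[i:i + n] == needle:
            return i
    return -1


# A product page typed in the "Australian English" plane.
page = tag("colour chart", "en-AU")

# The same word, searched for from two different planes.
same_plane_hit = find(page, tag("colour", "en-AU"))
other_plane_hit = find(page, tag("colour", "en-GB"))

# With one shared code point per letter the search just works.
plain_hit = "colour chart".find("colour")

print(same_plane_hit, other_plane_hit, plain_hit)
```

The letters are identical in both queries, yet the query typed in the
other plane finds nothing, which is the situation described above.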

Let's say that I type:
    \begin{document}
    \def\mojukaz#1{some weird #1}
    \def\mycommand#1{in nekaj nenavadnega #1}
    \def\myukaz{xyz}
    \section{naslov tega poglavja}
What language plane should be used for TeX commands? I will probably
be using a Slovenian keyboard to type the text, so \mojukaz should be
typed in the Slovenian plane, but wait ... \mycommand is English, so
that should be English, right? \myukaz is half English, half Slovenian,
so the first two letters are English and the rest Slovenian. But of
course I don't really mind changing the plane in the middle of commands,
so when you tried to change part of my document, you would have to
get the plane exactly right to be able to use the command. And the
braces? Oh joy ... how happy TeX would be interpreting those commands.
I can also imagine myself typing in Word. Note that I change the
keyboard layout whenever I write in TeX - I use the US keyboard to type
\{}[], ... while I use the Slovenian keyboard to write čšž. I remember
that Word was very clever about that and changed the language as I typed.
So if I switched to the US keyboard inside a doc document just because
I was more comfortable typing some characters with it, Word happily
kept changing the language to US English and underlining words as
misspelled, even though it was Slovenian text.

Mind that you don't even care to write your name properly. You write
    (Webmaster, Ret'd)
instead of using the proper
    (Webmaster, Ret’d)
with a single quotation mark. You don't care to use “proper quotation
marks” in the text you type. Don't try to tell me that you would care
enough to change the language when typing foreign names.

Can we please close this off-topic discussion and solve the problem
with \savinghyphcodes instead?

Mojca


