[XeTeX] Hyphenation again…

Jonathan Kew jonathan_kew at sil.org
Fri Jul 14 00:24:37 CEST 2006


On 13 Jul 2006, at 4:54 pm, Florian Grammel wrote:

> Dear all,
>
> I've been pulling my hair out all day, because my self-defined
> hyphenation exceptions partly stopped working after migrating a
> project started in LaTeX to XeLaTeX. Now I think I've pinned down
> the problem, but I still have no clue what to do about it… maybe
> one of you has a suggestion? Am I missing something here?
>
> When I compile the nonsense example below with LaTeX, the defined
> hyphenation rules are properly used both in the paragraphs and in
> the exceptions included by \ger{} and \fre{} respectively: the
> hyphenation neatly changes twice in each paragraph.
>
> Using XeLaTeX, only the \selectlanguage{} command works and sets
> the hyphenation for the whole paragraph; the exceptions are
> ignored.
>
> \foreignlanguage is ignored in the same way.
> This seems not to be language-specific.
> I'm using up-to-date teTeX (i-installer), babel, XeTeX and fontspec  
> on 10.4.7 (PPC).

Looks like you've found a bug in xetex. Note that the critical factor
is not actually whether you're running standard tex or xetex, but
whether you're using a TFM-based font like CM or a native system font
like those you load with fontspec. The language change fails only
with native fonts.
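
A minimal sketch of a test file for comparing the two cases might look
like the following. This is not the original poster's example: the
font name and the sample words are assumptions, and hyphenation
patterns for German and French are assumed to be loaded in the format.
Compiling it once as it stands and once with the two fontspec lines
commented out (falling back to the TFM-based CM fonts) should show
whether \foreignlanguage changes the hyphenation in each case.

```latex
% Hypothetical test file; font name and sample words are assumptions.
\documentclass{article}
\usepackage[german,french,english]{babel} % last option (english) is the main language
% Comment out the next two lines to fall back to the TFM-based CM fonts.
\usepackage{fontspec}
\setmainfont{Times New Roman}
\begin{document}
% A very narrow box makes TeX break at (nearly) every permitted
% hyphenation point; \hspace{0pt} allows the first word to be hyphenated.
\parbox{1pt}{\hspace{0pt}%
  hyphenation
  \foreignlanguage{german}{Silbentrennung}
  \foreignlanguage{french}{interminablement}}
\end{document}
```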

My apologies for your hair loss! Now it's my turn to pull some... I
hope I can solve this one before it's all gone.

>
> Another (minor) issue is that babel's Icelandic in XeLaTeX seems
> to hyphenate just fine, but always produces the error message:
>
> (/usr/local/teTeX/share/texmf.tetex/tex/generic/babel/icelandic.ldf)
> Runaway argument?
> {\sob {ҽ{.7}{0}{0}{0}} \DeclareTextCommand {\eob }{T1}{\sob {e}{1}{0 
> \ETC.
> ! File ended while scanning use of \@argdef.
> <inserted text>
>                 \par
> l.134 \ProcessOptions*

I think you'll also find that certain "shorthand" sequences for  
special letters using the double-quote character as an active  
"escape" char won't work as expected.

This is because the icelandic.ldf file includes some Latin-1 accented  
characters, which are not read correctly by xetex (which is trying to  
interpret it as UTF-8). See line 152 and following (at least that's  
where I noticed some; there might be others).

The simple fix would be to replace the literal Latin-1 characters  
with ^^xx hex sequences (which would work for both standard tex and  
xetex).
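
As a sketch of what that replacement looks like (the macro name
\example is hypothetical and not taken from icelandic.ldf), a literal
Latin-1 byte 0xE9 would be written as ^^e9. The two hex digits must be
lowercase, and the resulting file is pure ASCII, so it reads the same
whatever input encoding the engine assumes.

```latex
% Before (not ASCII-safe): the definition contains the single literal
% byte 0xE9, which is not valid UTF-8 on its own:
%   \def\example{<byte 0xE9>}
% After: the same character code written in TeX's ^^xx notation.
\def\example{^^e9}% 0xE9 is e-acute in Latin-1
```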

The use of non-ASCII literal characters in a file like this is  
generally a bad idea, IMO; I suspect the sequences in question  
wouldn't work right for an Icelandic user working with MacRoman as  
their input encoding in standard TeX, either.

JK


