[XeTeX] Puzzling hyphenation with polyglossia and xelatex

Mon Jan 10 11:58:47 CET 2011

I sent this query to the tex-hyphen list a little while ago, but there's
been no solution so far.  I wonder if anyone here can suggest what is going
on?

Thanks,
Dominik

----------

I'm sorry not to have a minimal example for this query.

I'm getting different hyphenation results depending on the order of
language/font invocation.  I find this unexpected.

Thus, if I say

\usepackage{polyglossia}

\usepackage{xltxtra}

\defaultfontfeatures{Mapping=tex-text,Numbers=OldStyle}
\setmainfont{TeX Gyre Pagella}

\setdefaultlanguage[variant=british]{english}

\setotherlanguage{sanskrit} % for transliterated Sanskrit

\newfontfamily\sanskritfont %[Script=Devanagari]
{TeX Gyre Pagella}

% Define \sansk{}, which is the same as \emph{}, except that it causes
% appropriate hyphenation for Sanskrit words.
% Use \sansk{} for Sanskrit and \emph{} for English.
\newcommand{\sansk}[1]{\emph{#1}}

then I get some very wrong hyphenations in the English text: for example,
s-moothly, and other cases with a single letter left at the end of the line.
This means \lefthyphenmin is being treated as 1 in the English text, where it
shouldn't be.  \lefthyphenmin is indeed 1 for the Sanskrit.
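
One way to check this directly is to typeset the values TeX is actually
using.  The following is only a sketch, not a tested diagnosis.  It assumes
xltxtra is loaded (so that \showhyphens works with fontspec fonts), and it
uses polyglossia's \textsanskrit to switch language, since \sansk as defined
above only calls \emph.  The test word and the wording of the output lines
are arbitrary choices.

\documentclass{article}
\usepackage{polyglossia}
\usepackage{xltxtra}
\setmainfont{TeX Gyre Pagella}
\setdefaultlanguage[variant=british]{english}
\setotherlanguage{sanskrit}
\newfontfamily\sanskritfont{TeX Gyre Pagella}
\begin{document}
% Print the values in force for the English text.
English: \the\lefthyphenmin, \the\righthyphenmin

% Print the values in force inside a Sanskrit group.
\textsanskrit{Sanskrit: \the\lefthyphenmin, \the\righthyphenmin}

% Back outside the group: if the switch is properly local, this line
% matches the first one.
English again: \the\lefthyphenmin, \the\righthyphenmin

% Write the hyphenation points of a test word to the log.
\showhyphens{smoothly}
\end{document}

Swapping the order of the language and font declarations in this file should
show whether the printed English values change with the order.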

If I set up the Sanskrit first, and say,

\usepackage{polyglossia}

\usepackage{xltxtra}

\defaultfontfeatures{Mapping=tex-text,Numbers=OldStyle}

\setotherlanguage{sanskrit} % for transliterated Sanskrit

\newfontfamily\sanskritfont %[Script=Devanagari]
{TeX Gyre Pagella}

\setmainfont{TeX Gyre Pagella}

\setdefaultlanguage[variant=british]{english}

% Define \sansk{}, which is the same as \emph{}, except that it causes
% appropriate hyphenation for Sanskrit words.
% Use \sansk{} for Sanskrit and \emph{} for English.
\newcommand{\sansk}[1]{\emph{#1}}

then things are okay.  Well, the English is okay.  The Sanskrit ends up with
a \lefthyphenmin of 2, but that suits me.

The TeXbook, p. 455, says: "Each whatsit records the current \lefthyphenmin
and \righthyphenmin."  So these settings should change with each \language
change.  They're not global.
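
On that reading, setting the two parameters explicitly at the start of the
document body, before the first paragraph begins, ought to restore sane
English hyphenation whatever the order of the preamble.  This is a workaround
sketch under assumptions, not a fix for whatever polyglossia is doing.  The
values 2 and 3 are assumed to be what British English should use, and the
sketch is untested against the failing document.

\begin{document}
% Assumed English values, set before the first paragraph starts, since
% TeX picks up the current values when a paragraph begins.
\lefthyphenmin=2 \righthyphenmin=3
Text of the document.
\end{document}

Any language switch that sets its own values inside a group would still
override these locally, which is the behaviour wanted for the Sanskrit.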

Best,
Dominik

PS I'm not using Devanagari, but Roman-script transliteration.  I'd quite
like to be able to say [Script=Latin] (or Roman), to be explicit about this,
but it's disallowed.  Anyhow, that's a different topic.

On 4 January 2011 16:25, Dominik Wujastyk <wujastyk at gmail.com> wrote:

> Yup, works perfectly with xltxtra.  Sorry for these elementary questions!
>
> Thanks,
> Dominik
>
>
> On 4 January 2011 14:19, Jonathan Kew <jfkthame at googlemail.com> wrote:
>
>> On 4 Jan 2011, at 12:44, Dominik Wujastyk wrote:
>>
>> > Minimal example, run with xelatex and TeXlive 2010:
>> >
>> > \documentclass{article}
>> > \usepackage{polyglossia}
>> > \begin{document}
>> > \showhyphens{helicopter}
>> > \end{document}
>> >
>> >
>> > Why does my log file show
>> >
>> > Underfull \hbox (badness 10000) in paragraph at lines 4--4
>> > [] \EU1/lmr/m/n/10 helicopter
>> >
>> > instead of
>> >
>> > Underfull \hbox (badness 10000) in paragraph at lines 4--4
>> > [] \OT1/cmr/m/n/10 he-li-copter
>> >
>>
>> Because the default LaTeX \showhyphens doesn't work with "native" fonts
>> (i.e. those loaded without TFMs, etc) in xetex, and polyglossia loads
>> fontspec which sets the default font to LM loaded as a "native" unicode
>> font.
>>
>> If you try actually using "helicopter" in text, you'll find that it still
>> hyphenates fine.
>>
>> To fix \showhyphens, try loading the xltxtra package.
>>
>> (There's discussion of this somewhere in the list archives, IIRC.)
>>
>> JK
>>
>>
>>
>