[XeTeX] How to make hyphenation work in XeLaTeX?

Mojca Miklavec mojca.miklavec.lists at gmail.com
Sun Jan 21 21:30:21 CET 2007


On 1/21/07, Jonathan Kew wrote:
> Hi Mojca,
>
> No need to apologize for asking questions! Much of this stuff isn't
> obvious....
>
> I'll try to answer two things here, both the font/encoding issues
> (Will, please correct me if necessary!), and the hyphenation question:
>
> > I tried the following:
> >
> > \documentclass{article}
> > \usepackage[slovene]{babel}
> > %\usepackage{fontspec} % do I need this ?
>
> Yes, you probably do; fontspec will change the default typefaces from
> CM to LM (as well as give you a simple interface to select any other
> fonts you want to use). It will also tell LaTeX you're working with
> the "EU1" (Unicode) font encoding....
>
> > \usepackage[EU1]{fontenc}
>
> ...whereas this *only* selects a font encoding, but doesn't change
> the default LaTeX typeface.

Thanks a lot for the explanation.

I was confused because of two things:
- I still got a lot of overfull boxes and no hyphens showed up (sometimes
hyphenation is borrowed from English, for example if the language isn't
set properly). The fact that \showhyphens didn't work confused me most.
- I do get OpenType LM fonts without using fontspec. pdffonts shows me:

name                                 type         emb sub uni object ID
------------------------------------ ------------ --- --- --- ---------
LTKUPO+LMRoman10-Regular-Identity-H  CID Type 0C  yes yes yes      5  0

and I also get all the necessary accents (which wouldn't happen if I
were using Type1).

But mysteriously, the cmr/lmr mixture changes into lmr only if I include fontspec.

> >
> > \begin{document}
> >
> > % should be:
> > lo-ko-mo-ti[-]va ču-ha-pu[-]ha
> >
> > \def\a{lokomotiva čuhapuha}
> > \a
> > \showhyphens{\a}
> >
> > \end{document}
> >
> >
> > I get some hyphens, but log file (slightly longer text) seems pretty
> > weird (no hyphens shown):
> >
> > Overfull \hbox (0.82622pt too wide) in paragraph at lines 13--14
> > \EU1/lmr/m/n/10 prodajalna ne moralizirajo. Moja
> > [1]
> > Underfull \hbox (badness 10000) in paragraph at lines 17--17
> > [] \EU1/cmr/m/n/10 lokomotiva čuhapuha
> > [2] (./latex-slo.aux)
> >
> > I also don't understand why there is "cmr" in the last line. (from
> > \showhyphens).
>
> Because CMR is LaTeX's default, and nothing has overridden that.

See the first line of the log excerpt. There it's lmr.

> However, because the EU1 encoding isn't defined for CMR, you got LMR
> substituted. I expect there's a message such as
>
>         LaTeX Font Warning: Font shape `EU1/cmr/m/n' undefined
>         (Font)              using `EU1/lmr/m/n' instead on input line 8.
>
> somewhere in your log. (If this substitution hadn't happened, the
> Unicode "č" would have disappeared as it doesn't exist in CMR10;
> you'd see a "Missing character" message if \tracinglostchars is
> enabled.)

That didn't happen, since I got the proper LM font (if I shouldn't
have, I won't worry about it ;).
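
The check mentioned in the quoted text can be sketched as a small test file. This is only an illustration, assuming a XeLaTeX run with fontspec loaded; the sample words are the ones from this thread. \tracinglostchars writes a "Missing character" line to the log for every glyph the current font lacks, and \tracingonline additionally echoes it on the terminal.

```latex
\documentclass{article}
\usepackage{fontspec}     % Unicode font setup, as recommended in the quoted reply
\tracinglostchars=1       % log a "Missing character" line for each dropped glyph
\tracingonline=1          % also show that tracing output on the terminal
\begin{document}
lokomotiva čuhapuha
\end{document}
```

If no "Missing character" line appears in the log, the č was found in the selected font.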

> So you want to \usepackage{fontspec} so that LM becomes the default;
> this will eliminate the warnings about substituting LM for CM. And
> then you won't need to \usepackage[EU1]{fontenc}, as fontspec will
> take care of that for you.
>
>
> Now, as for hyphenation: it'll be working fine, I believe, with the
> \usepackage[slovene]{babel} declaration as you have it. You can see
> hyphens if you add more text:
>
> \def\a{lokomotiva čuhapuha }
> \a\a\a\a\a\a\a\a\a\a\a\a\a\a\a\a\a\a\a\a\a\a\a
>
> gives me a paragraph with hyphens on lines 1, 4, 5 and 6.
>
> However, the standard LaTeX \showhyphens{} macro does not work to
> display hyphenation points when using OpenType fonts in XeTeX. This
> is the result of the different processing model and internal
> structures involved, and \showhyphens is something of a "hack". See
> <http://www.tug.org/pipermail/xetex/2006-September/005176.html> for
> an alternative, more xetex-friendly version.

Thanks a lot. I wasn't paying enough attention to that post.
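
A different way to see the hyphenation points, which is not the method from the linked post but a common workaround, is to typeset the words in an extremely narrow box so that TeX has to break at every permitted point. The sketch below assumes XeLaTeX with fontspec and babel's slovene option, as discussed above; the 1pt width is an arbitrary choice.

```latex
\documentclass{article}
\usepackage{fontspec}          % Unicode fonts under XeLaTeX
\usepackage[slovene]{babel}    % Slovene hyphenation patterns
\begin{document}
% A 1pt-wide box forces a line break at every permitted hyphenation point.
% \hspace{0pt} puts glue before the first word; without it TeX does not
% try to hyphenate the first word of a paragraph.
\parbox[t]{1pt}{\hspace{0pt}lokomotiva čuhapuha}
\end{document}
```

The run reports overfull boxes, which is expected here; the break points show up in the typeset output rather than in the log.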

So it seems that it works after all. I was a bit suspicious (I have
often started defining hyphenation points for some words by hand, only
to figure out that the language wasn't set properly), but it seems OK
and makes sense now.
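
Putting the advice from this thread together, a minimal test file might look like the sketch below. It assumes XeLaTeX; \sample is a hypothetical macro name used here instead of \a, which LaTeX already defines for accents in the tabbing environment.

```latex
\documentclass{article}
\usepackage{fontspec}          % makes LM the default and selects the Unicode encoding
\usepackage[slovene]{babel}    % Slovene hyphenation patterns
\begin{document}
% Hypothetical helper macro; the trailing space separates the repetitions.
\newcommand\sample{lokomotiva čuhapuha }
% Enough text to fill several lines, so that some line breaks need hyphens.
\sample\sample\sample\sample\sample\sample\sample\sample
\sample\sample\sample\sample\sample\sample\sample\sample
\sample\sample\sample\sample\sample\sample\sample
\end{document}
```

With fontspec loaded, no separate \usepackage[EU1]{fontenc} line is needed.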

Thanks again,
   Mojca

