[XeTeX] How to make hyphenation work in XeLaTeX?

Jonathan Kew jonathan_kew at sil.org
Sun Jan 21 20:36:25 CET 2007


Hi Mojca,

No need to apologize for asking questions! Much of this stuff isn't  
obvious....

I'll try to answer two things here, both the font/encoding issues  
(Will, please correct me if necessary!), and the hyphenation question:

> I tried the following:
>
> \documentclass{article}
> \usepackage[slovene]{babel}
> %\usepackage{fontspec} % do I need this ?

Yes, you probably do; fontspec changes the default typefaces from
Computer Modern (CM) to Latin Modern (LM), and gives you a simple
interface for selecting any other fonts you want to use. It also tells
LaTeX that you're working with the "EU1" (Unicode) font encoding....

> \usepackage[EU1]{fontenc}

...whereas this *only* selects a font encoding, but doesn't change  
the default LaTeX typeface.

>
> \begin{document}
>
> % should be:
> lo-ko-mo-ti[-]va ču-ha-pu[-]ha
>
> \def\a{lokomotiva čuhapuha}
> \a
> \showhyphens{\a}
>
> \end{document}
>
>
> I get some hyphens, but log file (slightly longer text) seems pretty
> weird (no hyphens shown):
>
> Overfull \hbox (0.82622pt too wide) in paragraph at lines 13--14
> \EU1/lmr/m/n/10 prodajalna ne moralizirajo. Moja
> [1]
> Underfull \hbox (badness 10000) in paragraph at lines 17--17
> [] \EU1/cmr/m/n/10 lokomotiva čuhapuha
> [2] (./latex-slo.aux)
>
> I also don't understand why there is "cmr" in the last line. (from
> \showhyphens).

Because CMR is LaTeX's default, and nothing has overridden that.  
However, because the EU1 encoding isn't defined for CMR, you got LMR  
substituted. I expect there's a message such as

	LaTeX Font Warning: Font shape `EU1/cmr/m/n' undefined
	(Font)              using `EU1/lmr/m/n' instead on input line 8.

somewhere in your log. (If this substitution hadn't happened, the  
Unicode "č" would have disappeared as it doesn't exist in CMR10;  
you'd see a "Missing character" message if \tracinglostchars is  
enabled.)
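
To check for dropped characters yourself, something like the following in the preamble should be enough. This is only a sketch using the standard TeX parameters; the exact wording of the log message may differ between versions.

```latex
% Put these lines in the preamble, before \begin{document}.
\tracinglostchars=1  % record a "Missing character" line for each glyph the font lacks
\tracingonline=1     % optional: echo such diagnostics to the terminal as well as the log
```

Then search the .log file for "Missing character" after the run.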

So you want to \usepackage{fontspec} so that LM becomes the default;  
this will eliminate the warnings about substituting LM for CM. And  
then you won't need to \usepackage[EU1]{fontenc}, as fontspec will  
take care of that for you.
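
Putting that together, a minimal test file might look like the sketch below. The macro name \sample is just an illustrative choice (it avoids redefining LaTeX's own \a), and I'm assuming your format was built with the Slovene hyphenation patterns.

```latex
% Compile with xelatex; save the file as UTF-8.
\documentclass{article}
\usepackage{fontspec}        % Latin Modern as default, Unicode font encoding selected for you
\usepackage[slovene]{babel}  % switches to the Slovene hyphenation patterns
% no \usepackage[EU1]{fontenc} needed: fontspec handles the encoding

\begin{document}
\def\sample{lokomotiva čuhapuha }  % hypothetical helper macro, test text only
\sample\sample\sample\sample\sample\sample\sample\sample
\end{document}
```

With this preamble the log should no longer show the `EU1/cmr/m/n' substitution warning.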


Now, as for hyphenation: I believe it is already working fine with the
\usepackage[slovene]{babel} declaration as you have it. You can see
hyphens if you add more text:

\def\a{lokomotiva čuhapuha }
\a\a\a\a\a\a\a\a\a\a\a\a\a\a\a\a\a\a\a\a\a\a\a

gives me a paragraph with hyphens on lines 1, 4, 5 and 6.

However, the standard LaTeX \showhyphens{} macro does not display
hyphenation points when using OpenType fonts in XeTeX. This is a
result of the different processing model and internal structures
involved; \showhyphens is something of a "hack" that relies on those
internals. See
<http://www.tug.org/pipermail/xetex/2006-September/005176.html> for
an alternative, more xetex-friendly version.
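
If you only want to see the break points on the page rather than in the log, one generic workaround is to set the words in a zero-width box, so that TeX has to break at every point it is allowed to. Note that this is a common TeX trick and NOT the macro from the post linked above; treat it as a rough sketch.

```latex
% Compile with xelatex; save the file as UTF-8.
\documentclass{article}
\usepackage{fontspec}
\usepackage[slovene]{babel}

\begin{document}
% A 0pt-wide minipage forces a line break at every permitted point.
% The leading \hspace{0pt} allows the first word to be hyphenated too.
\begin{minipage}[t]{0pt}
\hspace{0pt}lokomotiva čuhapuha
\end{minipage}
\end{document}
```

Each fragment should come out on its own line, ending in a hyphen wherever TeX found a break. Expect a run of Overfull \hbox warnings in the log; they are harmless here.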


HTH ... JK



