[OS X TeX] encoding and special characters in TexShop

Bruno Voisin bvoisin at mac.com
Sat Sep 16 09:21:57 CEST 2006

On 15 Sept. 06 at 20:02, Piet van Oostrum wrote:

>> BV> - To make TeX understand your non-ASCII keystrokes, using the inputenc
>> BV> package with the appropriate option ([latin1] or [applemac] in most
>> BV> cases).
> That is just plain false. Every time you see this false statement popping
> up. The input encoding has nothing to do with hyphenation.

See below. There was no need to be so harsh in your answer. Thanks  
for the enlightenment anyway.

>> BV> - To make TeX use fonts that include these non-ASCII characters natively,
>> BV> either using the fontenc package with the [LY1] option, or the fontenc
>> BV> package with the [T1] option plus the textcomp package. With Computer
>> BV> Modern fonts, that implies, in addition, either installing the CM-Super
>> BV> fonts or using the lmodern package.
> That is right. To get hyphenation with accented characters you need a font
> that contains these characters.

See below.

>> BV> - To make TeX use the proper hyphenation patterns, most likely by using
>> BV> the babel package with the appropriate language option (here [german]).
> Right.
>> BV> As soon as an accented character is entered through control sequences, \"u
>> BV> say, you won't get proper hyphenation. Without the babel package, any word
>> BV> containing \"u won't be hyphenated at all (TeX won't consider it as a
>> BV> word). With the babel package, both portions of the word before \"u and
>> BV> after \"u will be hyphenated independently. But in order to get proper
>> BV> hyphenation, you really need all the steps above.
> False. If you use for example \usepackage[latin1]{fontenc}

??? Do you mean \usepackage[T1]{fontenc}, or
\usepackage[latin1]{inputenc}? The first, I guess.
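
To make the distinction concrete, here is a minimal sketch of the preamble I had in mind. It assumes a source file saved as ISO Latin 1 and German as the document language; the sample words are only an illustration. As far as I know, latin1 is an option of inputenc and T1 an option of fontenc, not the other way round.

```latex
\documentclass{article}
\usepackage[latin1]{inputenc} % input encoding: how the bytes of the .tex file are read
\usepackage[T1]{fontenc}      % font encoding: which characters the output font provides
\usepackage[german]{babel}    % language: selects the hyphenation patterns
\begin{document}
% Accents entered as control sequences, so this body is pure ASCII.
M\"adchen, Br\"ucke, Universit\"at
\end{document}
```
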

> (or one of the
> others with accented characters), LaTeX translates \"u to the single
> character ü in the proper font encoding. Therefore hyphenation works
> regardless of whether you use \"u or ü in the input. The statement about
> babel is only valid when you use the [OT1] font encoding.

Generally, do you mean hyphenation is performed:

- Directly on the keyboard input, here ü in ISO Latin 1 encoding?

- After conversion of this input to plain-TeX style control  
sequences, here \"u?

- After the conversion of these control sequences to characters in  
the output font, here (with \usepackage[T1]{fontenc}) ü in T1 encoding?

I thought the first answer was the right one. Your statement above
(and Morten's message a bit earlier in this thread) seems to indicate
the third answer is the right one. I must admit I'm surprised. It was
my understanding that, as soon as TeX met a control sequence
(\something) in a "word", it stopped considering this as a word
and no longer attempted to hyphenate it.

So, if I interpret your message and Morten's correctly, they both
mean that, to TeX, \"u is nothing more than a command. Without the
fontenc package (that is, in the default OT1 encoding), it is translated
into a composite of the character u plus an umlaut placed by the \accent
primitive, prohibiting hyphenation. With the fontenc package and the
[T1] or [LY1] option, it is translated into the single character ü,
allowing hyphenation. But, in any case, hyphenation isn't performed
before this translation into glyphs of the output font, right?
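
One way to check this empirically would be \showhyphens, which reports in the log file the hyphenation points TeX finds. The following is only a sketch: it assumes German patterns are loaded in the format, the test words are my own choice, and I make no claim about the exact log output.

```latex
\documentclass{article}
% Comment out the next line to fall back to the default OT1 encoding,
% then compare the two log files.
\usepackage[T1]{fontenc}
\usepackage[german]{babel}
\begin{document}
% \showhyphens puts nothing on the page; the hyphenation points appear
% in the log file as part of an "Underfull \hbox" message.
\showhyphens{Universit\"atsb\"ucherei Geb\"aude}
Universit\"atsb\"ucherei
\end{document}
```

If the third answer is the right one, the T1 run should show hyphenation points throughout each word, including after the accented letters, while the OT1 run should not.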

And finally, where can one find information about these issues,
other than in the most arduous chapters of the TeXbook, I mean?

In addition, the fact that XeTeX requires modified hyphenation
patterns, adapted to Unicode, for some languages seems difficult to
fit within this picture. My poor head...

Bruno Voisin

