[XeTeX] wrong uccode of ß

Ulrike Fischer news2 at nililand.de
Mon Apr 14 11:59:41 CEST 2008


Am Mon, 14 Apr 2008 10:28:43 +0100 schrieb Jonathan Kew:

> On 14 Apr 2008, at 10:08 am, Ulrike Fischer wrote:
> 
>> Hello,
>>
>> in the following document I get in both cases the result "SCHLIEßEN"
>> instead of the expected "SCHLIESSEN". (Uppercase of ß is SS, I don't
>> know if Unicode somehow distinguishes between an SS with lowercase ß
>> and an SS with lowercase ss).
> 
> No, I don't think so. Unicode does now include a capital ß:
> 
>    1E9E;LATIN CAPITAL LETTER SHARP S;Lu;0;L;;;;;N;;;;00DF;

How horrible ;-)

> 
> with corresponding lowercase U+00DF. However, it does not define
> U+1E9E as the uppercase of U+00DF, as this is only used in quite
> limited contexts; it's not a standard mapping used in most texts.
> 
>>
>> It doesn't matter if I load ngerman or not. As far as I can see the
>> uccode is set in unicode-letters.tex.
> 
> Yes, it's set there based on the character properties in the standard  
> UnicodeData.txt file. This provides no uppercase mapping for U+00DF,  
> as it doesn't have a single-character uppercase equivalent.
> 
> To make ß become SS within \MakeUppercase, it will have to be handled
> as a macro at some level (as is done, I expect, by standard
> LaTeX/Babel etc); the primitive \uppercase operation simply uses the
> \uccode values, and those are single character codes, so they cannot
> map a character to a string.

SS is a single char in T1 encoding (at position 223). So without
inputenc (which makes ß an active character) \uppercase and
\MakeUppercase work fine:

\documentclass[12pt]{article}
\usepackage[ngerman]{babel}
%\usepackage[ansinew]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{lmodern}
\begin{document}
ß\lowercase{ß}\uppercase{ß}

\char223 

\MakeLowercase{ß}\MakeUppercase{ß}
\end{document}

With inputenc things get a bit more complicated. Then ß is an active
character that expands to \ss, and \MakeUppercase resorts to
\@uclclist to map \ss to \SS.
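
As a rough sketch of what that means in practice (not tested with
XeTeX here, and assuming the usual kernel setup where \@uclclist
contains the pair \ss\SS and \SS falls back to the two letters "SS"),
writing the macro instead of the literal character should give the
expected SCHLIESSEN, because \uccode is never consulted for a control
sequence:

\documentclass[12pt]{article}
\begin{document}
% \ss is a control sequence, so \uppercase leaves it alone;
% \MakeUppercase swaps it for \SS via \@uclclist beforehand.
\MakeUppercase{schlie\ss en}
\end{document}

To look at the list itself, putting \makeatletter\show\@uclclist
in the preamble writes its current contents to the terminal and log.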




-- 
Ulrike Fischer 


