[XeTeX] wrong uccode of ß
jonathan_kew at sil.org
Mon Apr 14 11:28:43 CEST 2008
On 14 Apr 2008, at 10:08 am, Ulrike Fischer wrote:
> in the following document I get in both cases the result "SCHLIEßEN"
> instead of the expected "SCHLIESSEN". (Uppercase of ß is SS, I don't
> know if Unicode somehow distinguishes between an SS with lowercase ß
> and an SS with lowercase ss).
No, I don't think so. Unicode does now include a capital ß:
1E9E;LATIN CAPITAL LETTER SHARP S;Lu;0;L;;;;;N;;;;00DF;
with corresponding lowercase U+00DF. However, it does not define
U+1E9E as the uppercase of U+00DF, as this is only used in quite
limited contexts; it's not a standard mapping used in most texts.
> It doesn't matter if I load ngerman or not. As far as I can see the
> uccode is set in unicode-letters.tex.
Yes, it's set there based on the character properties in the standard
UnicodeData.txt file. This provides no uppercase mapping for U+00DF,
as it doesn't have a single-character uppercase equivalent.
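For anyone who wants to check this on their own installation, here is a minimal sketch (not the test file from the original report) that prints the \uccode assigned to U+00DF and then applies the primitive \uppercase directly. It assumes compilation with xelatex and a default font that contains ß; the exact \uccode value printed may depend on the format in use.

```latex
% Compile with xelatex. Illustrative sketch only.
\documentclass{article}
\usepackage{fontspec}
\begin{document}
% Show the \uccode value the format assigned to U+00DF (ß).
uccode of U+00DF: \the\uccode"00DF

% \uppercase maps each character through \uccode one-to-one,
% so ß cannot turn into the two letters SS here.
\uppercase{schließen}
\end{document}
```

Because the mapping is strictly one character to one character, the ß should come through \uppercase unchanged whatever value is printed.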
To make ß become SS within \MakeUppercase, it will have to be handled
as a macro at some level (as is done, I expect, by standard LaTeX/
Babel etc); the primitive \uppercase operation simply uses the
\uccode values, and those are single character codes, so they cannot
map a character to a string. This sounds like something we should
implement in one of the macro packages (xunicode? xltxtra?
polyglossia?), but I haven't really looked into it.
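As a rough sketch of what macro-level handling could look like (this is not what any of the packages named above actually do), one option is to replace each ß by SS before handing the text to \uppercase. The names \UpperSS, \UpperSSscan and \UpperSSend are hypothetical helpers invented for this illustration. The sketch assumes ß reaches TeX as an ordinary single character token, as it does with UTF-8 input in XeTeX, and that the argument is plain text.

```latex
% Compile with xelatex. Hypothetical helper macros, for illustration only.
\documentclass{article}
\usepackage{fontspec}

% Append a sentinel ß so the scanner always finds a delimiter.
\def\UpperSS#1{\UpperSSscan{}#1ß\UpperSSend}

% #1 = text processed so far, #2 = text before the next ß,
% #3 = whatever remains before the end marker.
\def\UpperSSscan#1#2ß#3\UpperSSend{%
  \ifx\relax#3\relax
    % Nothing left: the ß just matched was the sentinel, so finish.
    \uppercase{#1#2}%
  \else
    % A real ß was matched: substitute SS and keep scanning.
    \UpperSSscan{#1#2SS}#3\UpperSSend
  \fi}

\begin{document}
\UpperSS{schließen}
\end{document}
```

This is deliberately naive: it does not look inside brace groups, it is not safe in moving arguments, and it will misbehave if the text after a ß begins with \relax. A real implementation in a macro package would need to cope with those cases.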