[luatex] setting lccode "automatically"

Philipp Stephani st_philipp at yahoo.de
Thu Jun 23 12:27:36 CEST 2011


On 23.06.2011 at 11:46, Patrick Gundlach wrote:

> Another question regarding hyphenation (thanks Paul and Taco for the answer to the first one).
> 
> Hyphenation is only done when the lccode of each char is not 0. Now most languages have chars beyond a-z, such as Ä or é or Л. Now how do I set these lccodes?
> 
> Currently I do something like:
> 
> for i in string.utfvalues("äÄöÖüÜß") do
>  tex.lccode[i] = i
> end
> 
> but this has two disadvantages I can see:
> 
> 1) I have to manually pick the foreign characters and set the lccode manually
> 2) What is the lowercase of I (LATIN CAPITAL LETTER I)? Is it i or ı (LATIN SMALL LETTER DOTLESS I)?
> 
> 
> I guess that I should use a unicode data table for the characters. But that is still not 100% correct for languages like Turkish and Azeri, right? Since the lccodes are not language-local, we cannot achieve a 100% correct solution, correct?

No, we cannot. Even without those locale-dependent cases, it would still be impossible to build a correct lccode/uccode table, since lowercasing or uppercasing a single character is context-dependent and can result in more than one character: the uppercase of ß is SS. \lccode/\uccode (and, by extension, \lowercase/\uppercase) are just not usable in the Unicode world. LuaTeX might implement the casing algorithms (with tailoring) described in section 3.13 of the Unicode Standard. These include:
- Locale-dependent mappings
- Context-dependent mappings
- Length-changing mappings
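
To illustrate why a one-codepoint-to-one-codepoint table cannot cover this, here is a minimal sketch in plain Lua of string-level uppercasing with a locale tailoring and a length-changing mapping. The tables upper_full and upper_tailored and the functions upper_char and upper are hypothetical names with hand-picked entries for this example only; they are not part of LuaTeX. A real implementation would presumably read the Unicode data files instead, and context-dependent rules (such as the Greek final sigma) are not handled here.

```lua
-- Hypothetical, hand-written excerpts; NOT a LuaTeX API and far from complete.
-- Keys and values are UTF-8 encoded strings, so a value may hold
-- more than one character (length-changing mapping).
local upper_full = {
  ["ß"] = "SS",  -- one character becomes two
  ["ä"] = "Ä",
  ["ö"] = "Ö",
  ["ü"] = "Ü",
}

-- Locale tailoring: Turkish and Azeri uppercase "i" to dotted capital "İ".
local upper_tailored = {
  tr = { ["i"] = "İ" },
  az = { ["i"] = "İ" },
}

-- Uppercase one UTF-8 encoded character for the given language tag.
local function upper_char(ch, lang)
  local tailored = upper_tailored[lang]
  if tailored and tailored[ch] then
    return tailored[ch]
  end
  -- Fall back to string.upper, which handles ASCII letters in the C locale
  -- and leaves unknown multi-byte characters unchanged.
  return upper_full[ch] or ch:upper()
end

-- Uppercase a whole UTF-8 string, character by character.
local function upper(s, lang)
  -- The pattern matches one UTF-8 encoded character (NUL excluded).
  local result = s:gsub("[\1-\127\194-\244][\128-\191]*", function(ch)
    return upper_char(ch, lang)
  end)
  return result
end

print(upper("straße", "de"))
print(upper("i", "tr"))
```

The point of the sketch is that the result depends on the language tag and can be longer than the input, neither of which a single global \uccode value per character can express.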

