[luatex] setting lccode "automatically"
Philipp Stephani
st_philipp at yahoo.de
Thu Jun 23 12:27:36 CEST 2011
On 23.06.2011 at 11:46, Patrick Gundlach wrote:
> Another question regarding hyphenation (thanks Paul and Taco for the answer to the first one).
>
> Hyphenation is only done when the lccode of each char is not 0. Now most languages have chars beyond a-z, such as Ä or é or Л. Now how do I set these lccodes?
>
> Currently I do something like:
>
> for i in string.utfvalues("äÄöÖüÜß") do
>   tex.lccode[i] = i
> end
>
> but this has two disadvantages I can see:
>
> 1) I have to manually pick the foreign characters and set the lccode manually
> 2) What is the lowercase of I (LATIN CAPITAL LETTER I)? Is it i or ı (LATIN SMALL LETTER DOTLESS I)?
>
>
> I guess that I should use a unicode data table for the characters. But that is still not 100% correct for languages like Turkish and Azeri, right? Since the lccodes are not language-local, we cannot achieve a 100% correct solution, correct?
No, we cannot. Even without those locale-dependent cases, it would still be impossible to build a correct lccode/uccode table, because lowercasing or uppercasing a single character is context-dependent and can produce more than one character: the uppercase of ß is SS. \lccode/\uccode (and by extension \lowercase/\uppercase) are simply not usable in the Unicode world. LuaTeX could instead implement the casing algorithms (with tailoring) described in section 3.13 of the Unicode Standard. These cover:
- Locale-dependent mappings
- Context-dependent mappings
- Length-changing mappings
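
To make the three cases concrete, here is a minimal sketch in plain Lua. It assumes Lua 5.3 or later for the standard utf8 library and does not use any LuaTeX API. All function and table names are made up for illustration, the tables hold only a handful of entries, and the final-sigma test is a crude approximation of the rule in the standard, not a full implementation of it. The point is that none of these mappings fits into a table with one code point per slot:

```lua
-- Illustrative tables only: a few entries each, all names are hypothetical.
local upper_simple  = { [0xE4] = 0xC4, [0xF6] = 0xD6, [0xFC] = 0xDC } -- ä ö ü
local upper_special = { [0xDF] = { 0x53, 0x53 } }  -- length-changing: ß -> SS
local upper_locale  = {                            -- locale-dependent: i -> İ
  tr = { [0x69] = 0x130 },
  az = { [0x69] = 0x130 },
}

local function ascii_upper(c)
  if c >= 0x61 and c <= 0x7A then return c - 0x20 end
  return c
end

-- Uppercase a UTF-8 string; `lang` optionally selects a locale tailoring.
local function upper(s, lang)
  local tailoring = upper_locale[lang] or {}
  local out = {}
  for _, c in utf8.codes(s) do
    if tailoring[c] then
      out[#out + 1] = utf8.char(tailoring[c])
    elseif upper_special[c] then
      out[#out + 1] = utf8.char(table.unpack(upper_special[c]))
    else
      out[#out + 1] = utf8.char(upper_simple[c] or ascii_upper(c))
    end
  end
  return table.concat(out)
end

-- Crude test: basic Greek capital or small letter (nil-safe).
local function is_greek_letter(c)
  return c ~= nil
    and ((c >= 0x391 and c <= 0x3A9) or (c >= 0x3B1 and c <= 0x3C9))
end

-- Context-dependent: capital sigma becomes final sigma (U+03C2) when it
-- follows a Greek letter and is not followed by one, else U+03C3.
local function lower_greek(s)
  local cps = {}
  for _, c in utf8.codes(s) do cps[#cps + 1] = c end
  local out = {}
  for i, c in ipairs(cps) do
    if c == 0x3A3 then
      local final = is_greek_letter(cps[i - 1])
        and not is_greek_letter(cps[i + 1])
      out[i] = utf8.char(final and 0x3C2 or 0x3C3)
    elseif c >= 0x391 and c <= 0x3A9 then
      out[i] = utf8.char(c + 0x20)  -- basic Greek capitals sit 0x20 below
    else
      out[i] = utf8.char(c)
    end
  end
  return table.concat(out)
end
```

With these definitions, upper("straße") gives "STRASSE" (one character longer than the input), upper("i") gives "I" while upper("i", "tr") gives İ (U+0130), and lower_greek maps the same capital sigma to two different lowercase letters depending on where it stands in the word.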