[luatex] setting lccode "automatically"

Taco Hoekwater taco at elvenkind.com
Thu Jun 23 12:00:11 CEST 2011

On 06/23/11 11:46, Patrick Gundlach wrote:
> Another question regarding hyphenation (thanks Paul and Taco for the answer to the first one).
> Hyphenation is only done when the lccode of each char is not 0. Most languages have chars beyond a-z, such as Ä or é or Л. How do I set these lccodes?
> Currently I do something like:
> for i in string.utfvalues("äÄöÖüÜß") do
>   tex.lccode[i] = i
> end

Yes, pretty much. That is also what luatex-unicode-letters.tex does
(but it uses TeX syntax, because it was borrowed from XeTeX).
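
A minimal Lua sketch of the same idea, assuming that an uppercase letter
should get the code point of its lowercase partner as lccode, while a
lowercase letter maps to itself. The names letter_pairs, lowercase_only
and set_lccodes are made up for illustration, and the fallback for the
tex table is only there so the snippet also runs outside LuaTeX:

```lua
-- Fallback so the sketch also runs in a plain Lua interpreter;
-- inside LuaTeX the real tex.lccode table is used.
local tex = tex or { lccode = {} }

-- Hypothetical hand-written table: { uppercase, lowercase } code points.
local letter_pairs = {
  { 0x00C4, 0x00E4 }, -- Ä ä
  { 0x00D6, 0x00F6 }, -- Ö ö
  { 0x00DC, 0x00FC }, -- Ü ü
}

-- Letters that have no uppercase partner in this table (ß).
local lowercase_only = { 0x00DF }

local function set_lccodes()
  for _, pair in ipairs(letter_pairs) do
    local upper, lower = pair[1], pair[2]
    tex.lccode[upper] = lower -- uppercase maps to its lowercase form
    tex.lccode[lower] = lower -- lowercase maps to itself
  end
  for _, cp in ipairs(lowercase_only) do
    tex.lccode[cp] = cp
  end
end

set_lccodes()
```

The table still has to be written by hand here; generating it from a
Unicode data file would remove that step.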

> but this has two disadvantages I can see:
> 1) I have to pick the foreign characters and set their lccodes manually

Here luatex-unicode-letters can help.

> 2) What is the lowercase of I (LATIN CAPITAL LETTER I)? Is it i or ı (LATIN SMALL LETTER DOTLESS I)?

This problem is not fixable without knowing the current language ...

> I guess that I should use a Unicode data table for the characters. But that is still not 100% correct for languages like Turkish and Azeri, right? Since the lccodes are not language-local, we cannot achieve a 100% correct solution, correct?

... correct, you need a per-language table, which (as far as I know)
does not actually exist. That is what e-TeX's \savinghyphcodes was
for, but its implementation was unusable in LuaTeX and was
therefore removed. A new implementation of \savinghyphcodes will come
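
For illustration only, a per-language table could be sketched in Lua
roughly like this. It is not an existing LuaTeX facility: the tables
default_lc and lc_overrides, the language tags "tr" and "az", and the
function apply_lccodes are all hypothetical names, and the tex fallback
is only there so the snippet runs outside LuaTeX. Because the lccodes
are not language-local, such a switch affects everything that is
hyphenated while it is in effect.

```lua
-- Fallback so the sketch also runs in a plain Lua interpreter.
local tex = tex or { lccode = {} }

-- Assumed default mapping for the letters involved in the I problem.
local default_lc = {
  [0x0049] = 0x0069, -- I -> i
  [0x0069] = 0x0069, -- i -> i
  [0x0130] = 0x0069, -- İ -> i
  [0x0131] = 0x0131, -- ı -> ı
}

-- Hypothetical per-language overrides, keyed by a tag of our own choosing.
local lc_overrides = {
  tr = { [0x0049] = 0x0131 }, -- Turkish: I -> ı
  az = { [0x0049] = 0x0131 }, -- Azeri:   I -> ı
}

-- Reset the affected code points to the defaults, then apply the
-- overrides of the requested language (if there are any).
local function apply_lccodes(lang)
  for cp, lc in pairs(default_lc) do
    tex.lccode[cp] = lc
  end
  for cp, lc in pairs(lc_overrides[lang] or {}) do
    tex.lccode[cp] = lc
  end
end

apply_lccodes("tr")
```

Calling apply_lccodes with a tag that has no entry simply restores the
defaults.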

Best wishes,
