[tex-hyphen] How and where to generate language.dat.lua?

Stephan Hennig mailing_list at arcor.de
Tue May 4 21:49:44 CEST 2010


On 03.05.2010 03:34, Manuel Pégourié-Gonnard wrote:


> Another possibility is to handle language.dat.lua in the same way we handle
> language.{dat,def} in TL currently. It would only require new (optional)
> attributes for the AddHyphen postaction, and the code to handle it of course.
> Pro: more modular and scalable. Con: needs coding.

Another approach: what about making language.dat.lua an index of files
that contain the actual language entries, grouped by language?  That is,
split language.dat.lua into several language-specific
language-<code>.lua files containing the declarations in the format
you proposed, plus one language.dat.lua file that binds the
language-<code>.lua files together.


-- language.dat.lua
return {
   ["german"]="language-de.lua",
   ["english"]="language-en.lua",
   ["french"]="language-fr.lua",
   ...
}


-- language-de.lua
return {
	["german"]={
		loader="loadhyph-de-1901.tex",
		patterns="hyph-de-1901.pat.txt",
		hyphenation="hyph-de-1901.hyp.txt",
		lefthyphenmin=2,
		righthyphenmin=2,
		synonyms={},
	},
	["ngerman"]={
		loader="loadhyph-de-1996.tex",
		patterns="hyph-de-1996.pat.txt",
		hyphenation="hyph-de-1996.hyp.txt",
		lefthyphenmin=2,
		righthyphenmin=2,
		synonyms={},
	}
}
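
To make the two-level layout concrete, here is a minimal sketch of how a
consumer might flatten it back into one table.  The function name
load_languages and the load_file hook are assumptions made for this
illustration; nothing like them has been proposed in this thread, and a
real implementation would probably locate files through kpathsea rather
than plain dofile.

```lua
-- Hypothetical helper: merge every per-language file named in the index
-- into one flat table keyed by language name.
-- `load_file` is an assumed hook; it defaults to dofile.
local function load_languages(index_file, load_file)
  load_file = load_file or dofile
  local index = load_file(index_file)   -- e.g. { german = "language-de.lua" }
  local languages, seen = {}, {}
  for _, file in pairs(index) do
    if not seen[file] then              -- several names may share one file
      seen[file] = true
      for name, entry in pairs(load_file(file)) do
        languages[name] = entry
      end
    end
  end
  return languages
end

-- In-memory stand-ins for the two files shown above (abridged), so the
-- sketch runs without touching the file system.
local files = {
  ["language.dat.lua"] = { german = "language-de.lua" },
  ["language-de.lua"] = {
    german  = { loader = "loadhyph-de-1901.tex", lefthyphenmin = 2, righthyphenmin = 2 },
    ngerman = { loader = "loadhyph-de-1996.tex", lefthyphenmin = 2, righthyphenmin = 2 },
  },
}

local languages = load_languages("language.dat.lua", function(name)
  return assert(files[name], "no such file: " .. name)
end)

print(languages.ngerman.loader)
```

Note that "ngerman" becomes available even though only "german" is
listed in the index, because both entries live in the same file.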


pro:

   + Grouping.  All patterns concerning one language are grouped
     together.  For a pattern developer, who is typically interested in
     one language only, there would be no alien entries in the way,
     i.e., a file language-<code>.lua would be easier to maintain
     than the current language.dat(.lua) monster.

   + Local modifications for a single language would be less likely to
     cause side effects for other languages.  A package with experimental
     patterns (dehyph-exptl as a use case) could just provide a local
     file language-de.lua with some entries modified or added, and there
     is no chance of breaking other languages.  Less stress for maintainers
     and users who know nothing about languages other than their mother
     tongue.

   + This approach scales well with future non-standard hyphenation
     patterns, which can be declared in the same language-<code>.lua files
     without scaring people who do not speak that language.

   + language-<code>.lua files could as well cover several languages,
     i.e., <code> doesn't necessarily need to be a language code, but
     could be more descriptive.  This could perhaps be useful for
     Cyrillic or Arabic languages (don't really know).
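
The local-override point can be sketched as a first-match lookup over an
ordered list of trees.  Everything here is an assumption made for
illustration: the function find_first, the tree names "local" and
"dist", and the pattern file names are invented, and a real setup would
rely on the search order of the TeX distribution instead.

```lua
-- Hypothetical lookup: return the first tree that provides `name`.
-- `trees` is an ordered list of in-memory stand-ins for directories.
local function find_first(trees, name)
  for _, tree in ipairs(trees) do
    if tree.files[name] then
      return tree.dir, tree.files[name]
    end
  end
  return nil
end

-- Made-up contents: a local tree shadows only the German file.
local trees = {
  { dir = "local", files = {
      ["language-de.lua"] = { ngerman = { patterns = "experimental.pat.txt" } },
  } },
  { dir = "dist", files = {
      ["language-de.lua"] = { ngerman = { patterns = "hyph-de-1996.pat.txt" } },
      ["language-fr.lua"] = { french  = { patterns = "made-up-fr.pat.txt" } },
  } },
}

local de_dir, de_entries = find_first(trees, "language-de.lua")
local fr_dir = find_first(trees, "language-fr.lua")
print(de_dir, de_entries.ngerman.patterns, fr_dir)
```

Only the lookup for language-de.lua is affected by the local file; the
French file still resolves to the distribution tree.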


contra:

   + I haven't thought about that.

   + More files, of course.

Best regards,
Stephan Hennig

