[tex-hyphen] tex patterns as lua files

Mojca Miklavec mojca.miklavec.lists at gmail.com
Tue Apr 27 15:04:21 CEST 2010

On Tue, Apr 27, 2010 at 13:06, Manuel Pégourié-Gonnard wrote:
> Le 27/04/2010 12:26, Mojca Miklavec a écrit :
>> What I would really like to know before doing the change is:
>> 1.) Which patterns should be default for any other program
>> (javascript, perl etc.) outside of TeX?
> I guess it's usenglishmax. The Knuthian version matters mainly in the
> nearly-frozen part of the TeX world.

OK. If others agree ...

>> 2.) Do you need/want (two questions) Knuth's hyphen.tex patterns in
>> "plain" format as well?
> One could always special-case english since we're going to do it at some
> point anyway, but it would be a bit easier for us if everything is
> uniform.
> While we're at it, there are also a few other hyphenation files that are not
> in the normal form hyph-XX.tex + loadhyph-XX.tex + all the nice txt files
> you kindly prepared for us. Some are from hyphen-base, namely dumyhyph.tex
> and zerohyph.tex. Again, we can special-case them in our code, or you can
> provide .txt versions of them (with an entry in language.dat.lua), which would
> make a bit more work for you but would result in cleaner code for loading.
> It's mainly up to you to evaluate if you think those files belong to
> texhyphen or not. I don't mind doing the little additional Lua & TeX coding
> to treat them specially if needed. (Actually, I already know how I would do
> it for hyphen.cfg, and I didn't look too closely at etex.src yet but I know
> it's possible too.)

As far as dummy and zero are concerned, what do you think about the
idea of creating a separate folder with appropriate txt files for
those two languages? LuaTeX won't care about location and others that
might be willing to use the repository won't have to create special
cases for dummy/zero files in that folder.

Of course the entry for those two can be added to language.dat.lua.

As far as

> (There are also other files that end up being mentioned in TL's full
> language.dat but are coming from other sources. We (meaning Élie and I) need
> to do something about that, but I propose postponing the discussion about
> them, since we're already discussing a lot of things at the same time).

- If you mean arabic and others, it's no problem to add an entry to
that lua file.
- If you mean ibycus, you probably don't want to support it in LuaTeX.
- If you mean the Germans with their timestamped patterns, we may
postpone the discussion; in LuaTeX you would probably want to go for a
completely different route than the current approach anyway.
- There are also Javier's ideas about different subsets of patterns in
LuaTeX that we might want to consider.
- And there are some languages that have zillions of versions of
patterns (like Russians etc.).

Anything else?

>>> or it can also be done on LaTeX's side, I can
>>> modify the table accordingly. What would be the best?
>> I'll respond once I know the answer to the two questions above. The
>> table will be modified in either case and will include USenglish
>> synonym. The question is only whether we should duplicate hyphen.tex
>> in our repository and, if yes, which patterns should take precedence
>> (that is, have no -x-something extension). The lua table will be modified
>> accordingly from languages.rb database.
> IMO, for the rest of the world, usenglishmax is the canonical version for US
> english. I guess you want to reflect that in the code/filename by making it
> en-us, and Knuth patterns en-US-x-knuth-original.
> What is sure is, the logical name "english" *must* be the knuthian patterns
> (= hyphen.tex = en-US-x-knuth-original), usenglish, USenglish and american
> have to be synonyms of this one, and the logical name "usenglishmax" needs
> to be ushyphmax.tex (aka en-US in the new codes if you follow my suggestion).
> With current language.dat.lua, "english" points to en-US which is formerly
> ushyphmax, which means not Knuthian patterns, and that needs to be changed,
> regardless of what you decide for the rest.

I fully agree with that. All I wanted to know was how to change that.
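
To make the required mapping concrete, here is a rough sketch of the name resolution described above. It is written in Python purely for illustration; the real table lives in language.dat.lua, and the field names ("patterns", "synonyms") and the resolve() helper are assumptions of mine, not the actual layout of that file.

```python
# Hypothetical model of the logical-name -> pattern-code mapping.
# Field names are illustrative only, not the real language.dat.lua format.
LANGUAGES = {
    "english": {
        # Knuth's original patterns (hyphen.tex)
        "patterns": "en-US-x-knuth-original",
        "synonyms": ["usenglish", "USenglish", "american"],
    },
    "usenglishmax": {
        # formerly ushyphmax.tex
        "patterns": "en-US",
        "synonyms": [],
    },
}


def resolve(name):
    """Return the pattern code for a logical name or one of its synonyms."""
    for logical, entry in LANGUAGES.items():
        if name == logical or name in entry["synonyms"]:
            return entry["patterns"]
    raise KeyError(name)
```

With such a table, resolve("american") and resolve("english") both give the Knuthian patterns, while only resolve("usenglishmax") gives en-US.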

