[tex-hyphen] tex patterns as lua files

Manuel Pégourié-Gonnard mpg at elzevir.fr
Tue Apr 27 13:06:58 CEST 2010

On 27/04/2010 12:26, Mojca Miklavec wrote:
> On Tue, Apr 27, 2010 at 11:36, Élie Roux wrote:
>> 2010/4/27 Mojca Miklavec wrote:
>>> There is now branches/luatex branch. Feel free to modify it as heavily
>>> as you wish.
>> Thank you! I'm going to Italy for one week tomorrow so if you don't
>> hear from me next week it's normal, but I should be available
>> afterwards.
> Me, Arthur, Manuel(?)

Unfortunately not.

> and others will be at a TeX conference from
> Friday to Monday. For us that's the perfect timing to do changes, but
> if you won't be available, that's still fine, we can do it later.
It just so happens that I finally got started on this, so maybe I can try to
help sort things out while you're at BachoTeX.

> There's no reason for not putting the dtx file into that SVN repository.

> What I would really like to know before doing the change is:
> 1.) Which patterns should be default for any other program
> (javascript, perl etc.) outside of TeX?

I guess it's usenglishmax. The Knuthian version matters mainly in the 
nearly-frozen part of the TeX world.

> 2.) Do you need/want (two questions) Knuth's hyphen.tex patterns in
> "plain" format as well?
One could always special-case English since we're going to do it at some point
anyway, but it would be a bit easier for us if everything were uniform.

While we're at it, there are also a few other hyphenation files that are not in
the normal form hyph-XX.tex + loadhyph-XX.tex + all the nice txt files you
kindly prepared for us. Some are from hyphen-base, namely dumyhyph.tex and
zerohyph.tex. Again, we can special-case them in our code, or you can provide
.txt versions of them (with an entry in language.dat.lua), which would make a
bit more work for you but would result in cleaner loading code.

It's mainly up to you to evaluate if you think those files belong to texhyphen 
or not. I don't mind doing the little additional Lua & TeX coding to treat them 
specially if needed. (Actually, I already know how I would do it for hyphen.cfg, 
and I didn't look too closely at etex.src yet but I know it's possible too.)

(There are also other files that end up being mentioned in TL's full
language.dat but are coming from other sources. We (meaning Élie and I) need to
do something about that, but I propose postponing the discussion about them,
since we're already discussing a lot of things at the same time.)

>> or it can also be done on LaTeX's side, I can
>> modify the table accordingly. What would be the best?
> I'll respond once I know the answer to the two questions above. The
> table will be modified in either case and will include USenglish
> synonym. The question is only whether we should duplicate hyphen.tex
> in our repository and if yes, which patterns should take precedence
> (of having no -x-something extension). The lua table will be modified
> accordingly from languages.rb database.
IMO, for the rest of the world, usenglishmax is the canonical version for US
English. I guess you want to reflect that in the code/filename by making it
en-US, and the Knuth patterns en-US-x-knuth-original.

What is certain is that the logical name "english" *must* be the Knuthian
patterns (= hyphen.tex = en-US-x-knuth-original); usenglish, USenglish and
american have to be synonyms of this one; and the logical name "usenglishmax"
needs to be ushyphmax.tex (aka en-US in the new codes if you follow my
suggestion).

With the current language.dat.lua, "english" points to en-US, which was formerly
ushyphmax, which means not the Knuthian patterns, and that needs to be changed,
regardless of what you decide for the rest.
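
To make the naming requirements above concrete, here is a minimal sketch in
Python (not the actual Lua table). The dictionary layout and the resolve()
helper are hypothetical and only illustrate the mapping; the codes assume my
suggestion (en-US = ushyphmax, en-US-x-knuth-original = hyphen.tex) is
followed.

```python
# Hypothetical sketch of the name -> pattern mapping described above.
# The structure is illustrative only; it is NOT the real language.dat.lua.

# Pattern codes and the files they stand for (assuming the suggested codes).
PATTERNS = {
    "en-US-x-knuth-original": "hyphen.tex",
    "en-US": "ushyphmax.tex",
}

# Logical names that must keep their historical meaning.
LOGICAL_NAMES = {
    "english": "en-US-x-knuth-original",
    "usenglishmax": "en-US",
}

# Synonyms of "english" (all must load the Knuthian patterns).
SYNONYMS = {
    "usenglish": "english",
    "USenglish": "english",
    "american": "english",
}


def resolve(name):
    """Return (pattern code, pattern file) for a logical name or synonym."""
    canonical = SYNONYMS.get(name, name)
    code = LOGICAL_NAMES[canonical]
    return code, PATTERNS[code]
```

With such a table, resolve("american") and resolve("english") give the same
Knuthian entry, while resolve("usenglishmax") gives ushyphmax.tex.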

