[tex-hyphen] web interface for SVN

Mojca Miklavec mojca.miklavec.lists at gmail.com
Thu Jun 12 23:22:10 CEST 2008


On Thu, Jun 12, 2008 at 5:55 PM, Karl Berry wrote:
>
> However, all the .dat files already exist.  (When new patterns show up
> on CTAN, I create them as needed.)  Unless you have some deep desire to
> do so, I see no need for you to maintain or distribute them, or for them
> to be on CTAN.  (I don't know how MiKTeX does this stuff.)

OK, I will leave that to you, but I can generate those files if needed.

>    to be concatenated together into a single language.dat?
>
> The concatenation is done at installation (or, now, update) time, since
> that is when we know which languages have been selected.
>
>
> As we've discussed, what TL does need is the TeX source file which the
> .dat points to (the top level file that does any engine switching).  For
> example (as you know), language.de.dat now says:
>
> german          xu-dehypht.tex
> ngerman         xu-dehyphn.tex
>
> Looking at your repo, I guess that will now be loadhyph-de-1901.tex and
> loadhyph-de-1996.tex, respectively?  Fine.

Yes, unless you have a better suggestion for the file names.

> More thoughts:
>
> - I suggest using conv-utf8- instead of conv_utf8_ (dashes instead of
>  underscores), just for consistency.

OK, done. I have also removed texnansi: it turned out that it's not
really needed (since it was generated, it had been no additional work).

One more question: which is the better name - ec or t1?
Should utf8 perhaps become utf-8?

I have "invented" the encoding name "il3" (iso-8859-3 = ISO Latin 3). I
have no idea how the Esperanto patterns are being used, so I only added
the letters they need. If anyone says we need to add more, I will
extend the file.

> - tex/patterns/pat and tex/patterns/readme (at least) do not seem like
>  they should end up in the live tex tree (texmf-dist/tex), but rather
>  in /source or /doc.  Shouldn't they?

None of the mentioned files are needed (neither under doc nor anywhere
else). I will probably remove them completely.

I was (and still am) waiting for some feedback from Jonathan. I left
the files there in case some "authority" :) changes his/her mind
and we decide to use clean one-per-line-and-no-comments pat/hyp
files. For the moment I decided to use a single file, for the following
reasons:
- probably easier to talk authors into using those files than to
create & maintain three files
- less chance for confusion
- easier macros (only \input needed, no fancy trickery behind)
- Taco's wish regarding luatex: apart from two languages, luatex could
now still read the patterns with a simple algorithm:
  - ignore comments
  - whitespace separates patterns (a space, a newline, or several
of them)
  - there's only \hyphenation and \patterns
and I still consider them simple and fast enough
- we can autogenerate those files extremely easily - the dull part
was getting rid of all the other macros and conversion conventions
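
To illustrate how little a reader of such files would need, here is a
minimal sketch in Python of the simple algorithm listed above. It is not
what luatex actually does; the function name parse_pattern_file and the
sample patterns are made up for illustration. It assumes that the file
contains no macros other than \patterns and \hyphenation, that their
arguments contain no nested braces, and that "%" always starts a comment.

```python
import re


def parse_pattern_file(text):
    """Return (patterns, exceptions) found in the text of a pattern file."""
    # Ignore comments: drop everything from '%' to the end of each line.
    # (Assumption: a literal '\%' never occurs in these files.)
    stripped = "\n".join(line.split("%", 1)[0] for line in text.splitlines())

    found = {"patterns": [], "hyphenation": []}
    # Only \patterns{...} and \hyphenation{...} are recognised.
    for command, body in re.findall(
        r"\\(patterns|hyphenation)\s*\{([^{}]*)\}", stripped
    ):
        # Any run of spaces and/or newlines separates two entries.
        found[command].extend(body.split())
    return found["patterns"], found["hyphenation"]


if __name__ == "__main__":
    sample = (
        "% made-up sample, not real patterns\n"
        "\\patterns{ % comment after the brace\n"
        ".ab4i .ab3ol\n"
        "1ba  1be\n"
        "}\n"
        "\\hyphenation{\n"
        "as-so-ciate ta-ble\n"
        "}\n"
    )
    patterns, exceptions = parse_pattern_file(sample)
    print(patterns)
    print(exceptions)
```

The two languages mentioned as exceptions would need extra handling
that this sketch does not attempt.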

> - in general, now would be a good time to build the tds-layout tree as we
>  want to actually install it.  I suggest the package name "texhyphen".
>  Thus,
>
> tex/generic/texhyphen/...
> source/generic/texhyphen/...
> doc/generic/texhyphen/...
>
>  If you can come up with a first shot at it, I'd be glad to review it
>  (in fact I'd like to) before uploading to CTAN and TL.

I agree with both Hans and Arthur that some hint of Unicode in the name
might make sense. All patterns are for TeX, so we do not necessarily
need the word TeX, but utf-8 is a helpful hint.

I would suggest one of the following:
- hyph-utf-8, hyph-utf8, hyphutf8
- utf-8-hyph, utf8-hyph, utf8hyph, utf-8-hyphen, utf8-hyphen, utf8hyphen

My favorite is probably hyph-utf8 or hyph-utf-8. (I really like the
dash in utf-8, but some may argue against it.)

As soon as the name is agreed on, I will change the structure and
generating scripts.

We need the following:
- loadhyph/loadhyph-foo.tex (unless someone suggests a better name)
- conv-utf8/conv-utf8-foo.tex
- patterns/utf/hyph-foo.tex
- pattern-loader.tex

If you have any suggestions for names & structure, let me know - I will
move things and adapt whatever needs to be adapted in the scripts.

> - I see the loadhyph files say they are generated.  In that case, I
> request adding a copyright line to the generation:
>
>  % Copyright 2008 TeX Users Group.
>  % You may freely use, modify and/or distribute this file.

Added.

Actually, I also need to add a note to all the other converted files.
What should I put at the top of those files? We need to state:
- who & when converted the file
- a note saying: "please do not add any TeX macros to this file"
-

A stupid/tiny detail - we tried to get rid of TeX macros in pattern
files. Do we want \endinput at the end or not?

> (This is the minimal "license" statement acceptable to all ...)
> Such a license statement / copyright line should be in
> pattern-loader.tex and all other source files, too.

pattern-loader.tex is written manually. Waiting for Jonathan's
comments. If we decide to keep .tex pattern files instead of
one-pattern-per-line, then 90% of what's in there may be removed. Or,
even more radical, we may move the contents of pattern-loader.tex into
loadhyph.tex. pattern-loader.tex is now loaded for every language (20
times), but if we keep .tex files, it's really really simple, so
simple that it might make sense to get rid of it. Waiting for
Jonathan's opinion.

> Of course, if you want to put your names or whatever instead of TUG,
> that's fine too.  Whatever.

Maybe we need some contact email?

Mojca

