From karl at freefriends.org Sun Oct 12 20:48:53 2008
From: karl at freefriends.org (Karl Berry)
Date: Sun, 12 Oct 2008 20:48:53 +0200
Subject: [tex-hyphen] no hyphenmin for French?
Message-ID: <200810121848.m9CImrA14726@tug.org>

Arthur, Manuel, or anyone else with knowledge,

Why no \lefthyphenmin and \righthyphenmin values for French in
hyph-utf8?  Can French hyphens really happen at one lettr-
e ?

Tried to research on web, sources, etc., no luck.

Yours in puzzlement,
Karl (working on setting up hyphenation in Texinfo)

From arthur.reutenauer at normalesup.org Sun Oct 12 21:15:20 2008
From: arthur.reutenauer at normalesup.org (Arthur Reutenauer)
Date: Sun, 12 Oct 2008 21:15:20 +0200
Subject: [tex-hyphen] no hyphenmin for French?
In-Reply-To: <200810121848.m9CImrA14726@tug.org>
References: <200810121848.m9CImrA14726@tug.org>
Message-ID: <20081012191520.GR29367@phare.normalesup.org>

Hello Karl,

> Why no \lefthyphenmin and \righthyphenmin values for French in
> hyph-utf8?

We didn't put hyphenmin values in hyph-utf8 for languages that did
not already have some settings in their pattern files; it was the case
for French, amongst many others.

> Can French hyphens really happen at one lettr-
> e ?

You would of course not see that in general, although things like a-
symétrique wouldn't shock me, since a- is clearly a distinct part of the
word on both phonological and morphological accounts (and even on the
account of spelling, that would normally have "assymétrique", as
intervocalic 's' are usually pronounced /z/, which is not the case
here).  I wouldn't be surprised if people used such hyphenations in some
contexts, like newspapers in narrow columns, for example (I could sample
issues of the local paper that I happen to have here, but I'm afraid you
would find much more dubious typographic choices in the _Télégramme de
Brest_ ;-)

> Tried to research on web, sources, etc., no luck.

One of the people I would ask about that would be Daniel Flipo, who
contributed to the work on the original French patterns for TeX, twenty
years ago.  I already meant to ask him about the absence of any
hyphenation exceptions in the pattern file, but never took the time to.

Arthur

From sojka at fi.muni.cz Sun Oct 12 21:40:09 2008
From: sojka at fi.muni.cz (Petr Sojka)
Date: Sun, 12 Oct 2008 21:40:09 +0200
Subject: [tex-hyphen] no hyphenmin for French?
In-Reply-To: <200810121848.m9CImrA14726@tug.org>
References: <200810121848.m9CImrA14726@tug.org>
Message-ID: <20081012194009.GC30118@fi.muni.cz>

On Sun, Oct 12, 2008 at 08:48:53PM +0200, Karl Berry wrote:
> Arthur, Manuel, or anyone else with knowledge,
>
> Why no \lefthyphenmin and \righthyphenmin values for French in
> hyph-utf8?  Can French hyphens really happen at one lettr-
> e ?
>
> Tried to research on web, sources, etc., no luck.

There are two settings for patterns (in general, not only for French):
a) one is for pattern generation (patgen, etc.)
b) one is for pattern use (for specific use)

It has been shown that for a) it is best to use 1 for both
hyphenmins, as it improves pattern generalization capabilities and
does not restrict b).

The setting for b) depends on whether one typesets in narrow columns
or not (e.g. for Czech one may even use (1,2) in extreme cases instead
of the suggested (2,3)).  One may even set different values in one
paragraph (e.g. (2,5) for the last word in a paragraph).

Best
Petr

> Yours in puzzlement,
> Karl (working on setting up hyphenation in Texinfo)
>

From mojca.miklavec.lists at gmail.com Tue Oct 14 15:27:12 2008
From: mojca.miklavec.lists at gmail.com (Mojca Miklavec)
Date: Tue, 14 Oct 2008 15:27:12 +0200
Subject: [tex-hyphen] hyphenation patterns for URLs
Message-ID: <6faad9f00810140627q28045316q96140329dfecac53@mail.gmail.com>

Hello,

Does anyone have any idea about hyphenation rules for URLs?
(Something like: do not ever break "http://", do not break before
"/" or "."
or "_" or "-", preferably do not break right after "~",
you may consider breaking at some language-preferred breaking
points, but in general, breaking between any two letters should be
OK - at least better than extending the line to infinity ...)

Thanks,
Mojca

From wl at gnu.org Tue Oct 14 19:24:53 2008
From: wl at gnu.org (Werner LEMBERG)
Date: Tue, 14 Oct 2008 19:24:53 +0200 (CEST)
Subject: [tex-hyphen] hyphenation patterns for URLs
In-Reply-To: <6faad9f00810140627q28045316q96140329dfecac53@mail.gmail.com>
References: <6faad9f00810140627q28045316q96140329dfecac53@mail.gmail.com>
Message-ID: <20081014.192453.146169277.wl@gnu.org>

> Does anyone have any idea about hyphenation rules for URLs?
> (Something like: do not ever break "http://", do not break before
> "/" or "." or "_" or "-", preferably do not break right after "~",
> you may consider breaking at some language-preferred breaking
> points, but in general, breaking between any two letters should be
> OK - at least better than extending the line to infinity ...)

I often see URLs broken like this:

  http://
  aaa.
  bbb.
  ccc

However, at least one US-American magazine (Sky & Telescope) uses

  http://
  aaa
  .bbb
  .ccc

which I consider better -- at first sight, it looks ugly, but since in
normal sentences a full stop can never start a line, it is obvious
that the URL is continued.

BTW, does url.sty support this somehow?


Werner

From arthur.reutenauer at normalesup.org Tue Oct 14 21:16:13 2008
From: arthur.reutenauer at normalesup.org (Arthur Reutenauer)
Date: Tue, 14 Oct 2008 21:16:13 +0200
Subject: [tex-hyphen] hyphenation patterns for URLs
In-Reply-To: <20081014.192453.146169277.wl@gnu.org>
References: <6faad9f00810140627q28045316q96140329dfecac53@mail.gmail.com>
 <20081014.192453.146169277.wl@gnu.org>
Message-ID: <20081014191613.GM29367@phare.normalesup.org>

> However, at least one US-American magazine (Sky & Telescope) uses
>
>   http://
>   aaa
>   .bbb
>   .ccc

That's what I would favour too, and for hyphens as well; in the
latter case it's really important that it comes on the next line, since
otherwise it could be mistaken for an inserted hyphen.  Ironically, in
the TUGboat article about hyph-utf8, I had to insert all the hyphens in
URLs and file names by hand ;-)

By the way, what about a new language with hyphenation patterns for
URLs and file names?  We can handle that in packages, but patterns are
the most appropriate way to do this, after all.  And it's really
something of a different language :-)

In LuaTeX, we can even do this on the fly.

Arthur

From wl at gnu.org Tue Oct 14 22:07:04 2008
From: wl at gnu.org (Werner LEMBERG)
Date: Tue, 14 Oct 2008 22:07:04 +0200 (CEST)
Subject: [tex-hyphen] hyphenation patterns for URLs
In-Reply-To: <20081014191613.GM29367@phare.normalesup.org>
References: <6faad9f00810140627q28045316q96140329dfecac53@mail.gmail.com>
 <20081014.192453.146169277.wl@gnu.org>
 <20081014191613.GM29367@phare.normalesup.org>
Message-ID: <20081014.220704.162281788.wl@gnu.org>

> By the way, what about a new language with hyphenation patterns for
> URLs and file names?

We need a zero-width, invisible hyphenation character in the font, and
glyphs which look like `.' and `-' but have different character codes,
say, `|' and `_', respectively, so that they can be used within the
patterns (and both characters must be activated in the translation
file for `patgen').

Similarly, we need glyphs for `0' to `9' with different character
codes.  Then the following hyphenation patterns

  a1| ... z1| <0>1| ... <9>1| ...
  a1_ ... z1_ <0>1_ ... <9>1_ ...
  /1a ... /1z /1<0> ... /1<9> ...

(where <x> is the character code for glyph `x') together with a
virtual font for the character code mapping should do the trick.

Werner

From pragma at wxs.nl Tue Oct 14 23:16:24 2008
From: pragma at wxs.nl (Hans Hagen)
Date: Tue, 14 Oct 2008 23:16:24 +0200
Subject: [tex-hyphen] hyphenation patterns for URLs
In-Reply-To: <20081014191613.GM29367@phare.normalesup.org>
References: <6faad9f00810140627q28045316q96140329dfecac53@mail.gmail.com>
 <20081014.192453.146169277.wl@gnu.org>
 <20081014191613.GM29367@phare.normalesup.org>
Message-ID: <48F50C28.2010003@wxs.nl>

Arthur Reutenauer wrote:

> By the way, what about a new language with hyphenation patterns for
> URLs and file names?  We can handle that in packages, but patterns are
> the most appropriate way to do this, after all.  And it's really
> something of a different language :-)

i played with that a while ago but there are too many methods, so one
would end up with several languages; also, there's the problem of
letters versus others and such

> In LuaTeX, we can even do this on the fly.

in mkiv i use lua

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
| www.pragma-pod.nl
-----------------------------------------------------------------

From karl at freefriends.org Wed Oct 15 00:47:35 2008
From: karl at freefriends.org (Karl Berry)
Date: Tue, 14 Oct 2008 17:47:35 -0500
Subject: [tex-hyphen] hyphenation patterns for URLs
In-Reply-To: <20081014191613.GM29367@phare.normalesup.org>
Message-ID: <200810142247.m9EMlZW17110@f7.net>

    werner>   http://
    >   aaa
    >   .bbb
    >   .ccc

I see the point, but I find it very disconcerting.
Not that I mind if you want to do it that way for your own purposes.

    > BTW, does url.sty support this somehow?

url.sty can do a lot, but I don't see a feasible way to do that.
Perhaps ask Donald.

    arthur> Ironically, in the TUGboat article about hyph-utf8, I had to
    insert all the hyphens in URLs and file names by hand ;-)

I don't understand.  I try to rewrite everything in TUGboat so that
urls, filenames, command names, etc., are never broken at a -, whether
the - is part of the url/filename or not, to avoid the ambiguity.  Did I
miss one in your article?  I don't see it.

karl

From arthur.reutenauer at normalesup.org Thu Oct 16 20:38:57 2008
From: arthur.reutenauer at normalesup.org (Arthur Reutenauer)
Date: Thu, 16 Oct 2008 20:38:57 +0200
Subject: [tex-hyphen] hyphenation patterns for URLs
In-Reply-To: <200810142247.m9EMlZW17110@f7.net>
References: <20081014191613.GM29367@phare.normalesup.org>
 <200810142247.m9EMlZW17110@f7.net>
Message-ID: <20081016183856.GQ29367@phare.normalesup.org>

> Did I
> miss one in your article?  I don't see it.

I don't see any either in the current version, but in the course of
writing the article I saw many points where the line looked much better
by breaking a URL or file name at a hyphen.  And in those cases, I
inserted an empty discretionary (\discretionary{}{}{}) *before* the
hyphen, because it might be confusing otherwise.
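
As an illustration of the trick just described, here is a minimal plain
TeX sketch.  The macro name \bhyph, the file name and the narrow \hsize
are made up for the example; none of them comes from the TUGboat
sources.

```tex
% Plain TeX sketch: allow a line break *before* a hyphen in a file name,
% never after it.  \bhyph is a hypothetical helper name.
%   \discretionary{}{}{}  an empty break point (nothing is printed at
%                         the break, so no extra hyphen appears)
%   \hbox{-}              the literal hyphen, boxed so that no break
%                         can occur right after it
\def\bhyph{\discretionary{}{}{}\hbox{-}}

\hsize=3cm % deliberately narrow so that a break is likely to be needed
\noindent See the file {\tt hyph\bhyph utf8\bhyph example.tex} for details.
\bye
```

If the line is broken at one of these points, the hyphen starts the
next line, which signals that it belongs to the name itself.
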
In the end, I decided to simply use macros for these generic names, and
that's when I thought that patterns would really be more suited :-)

Arthur

From arthur.reutenauer at normalesup.org Thu Oct 16 20:39:30 2008
From: arthur.reutenauer at normalesup.org (Arthur Reutenauer)
Date: Thu, 16 Oct 2008 20:39:30 +0200
Subject: [tex-hyphen] hyphenation patterns for URLs
In-Reply-To: <48F50C28.2010003@wxs.nl>
References: <6faad9f00810140627q28045316q96140329dfecac53@mail.gmail.com>
 <20081014.192453.146169277.wl@gnu.org>
 <20081014191613.GM29367@phare.normalesup.org>
 <48F50C28.2010003@wxs.nl>
Message-ID: <20081016183930.GR29367@phare.normalesup.org>

>> In LuaTeX, we can even do this on the fly.
>
> in mkiv i use lua

Yes, sure ...

Arthur

From mojca.miklavec.lists at gmail.com Fri Oct 17 00:56:38 2008
From: mojca.miklavec.lists at gmail.com (Mojca Miklavec)
Date: Fri, 17 Oct 2008 00:56:38 +0200
Subject: [tex-hyphen] Lithuanian
Message-ID: <6faad9f00810161556s2367e890jd9380a3df704edaf@mail.gmail.com>

Hello,

while browsing through
    http://wiki.services.openoffice.org/wiki/Dictionaries
I have found an interesting thing:

    Language: Lithuanian (lt_LT)
    Origin: TeX hyphenation tables by Sigitas Tolusis and Vytas
    Statulevicius. The original tables can be found at
    http://www.vtex.lt/tex/download/zip/texmf.zip as lthyphen.tex.
    Author: Converted to OOo format by Albertas Agejevas
    License: LaTeX Project Public Licence

They seem to be using two different encodings (there might be more,
like cp1257, but that's probably almost equal to Latin7):

    lithuanian     latin7lt.tex  %% Latin 7
    lithuaniantex  lthyphen.tex  %% TeX LT (modified T1)

It's no problem to add those patterns to hyph-utf8, but:
- in 8-bit engines it probably makes no sense without proper support
  (all the other files)
- in XeTeX it might be useful, but one still needs at least ldf files
  or polyglossia

Another problem with 8-bit engines is the same one as with
Russian/Mongolian/Ukrainian ... multiple encodings, solved in yet
another way.
They have two languages defined (like in Mongolian), but they
are using the same patterns from two sources (duplicated and slightly
modified).

I remember some user asking for support in ConTeXt once, but he
disappeared. Apart from that I don't know anything. Maybe everyone is
using VTeX/littex (http://www.vtex.lt/tex/littex/)? Any other reason
why these packages are not on CTAN?

I'll try to contact the authors.

Anothe

From arthur.reutenauer at normalesup.org Tue Oct 21 22:37:49 2008
From: arthur.reutenauer at normalesup.org (Arthur Reutenauer)
Date: Tue, 21 Oct 2008 22:37:49 +0200
Subject: [tex-hyphen] no hyphenmin for French?
In-Reply-To: <200810121848.m9CImrA14726@tug.org>
References: <200810121848.m9CImrA14726@tug.org>
Message-ID: <20081021203749.GA14806@phare.normalesup.org>

Hello Karl,

> Why no \lefthyphenmin and \righthyphenmin values for French in
> hyph-utf8?  Can French hyphens really happen at one lettr-
> e ?

Just to settle the matter, the late Bernard Gaulle's french package
sets a rather standard 2/3 default ... don't know what you've finally
chosen in Texinfo.  In the GUTenberg private archives, I even found
15-year-old e-mails where Bernard mentioned he tried \lefthyphenmin=3
for a while, but wasn't satisfied with the result.

Arthur

From karl at freefriends.org Wed Oct 22 01:54:19 2008
From: karl at freefriends.org (Karl Berry)
Date: Tue, 21 Oct 2008 18:54:19 -0500
Subject: [tex-hyphen] no hyphenmin for French?
In-Reply-To: <20081021203749.GA14806@phare.normalesup.org>
Message-ID: <200810212354.m9LNsJa26512@f7.net>

    Just to settle the matter, the late Bernard Gaulle's french package
    sets a rather standard 2/3 default ...

Thanks.  I've made Texinfo do that (txi-fr.tex) and changed
hyphen-french.tlpsrc for TeX Live.

I suggest putting the information in the hyph-utf8 file(s), too.
0/0 isn't useful.

Thanks,
karl

From mojca.miklavec.lists at gmail.com Wed Oct 22 02:49:14 2008
From: mojca.miklavec.lists at gmail.com (Mojca Miklavec)
Date: Wed, 22 Oct 2008 02:49:14 +0200
Subject: [tex-hyphen] no hyphenmin for French?
In-Reply-To: <200810212354.m9LNsJa26512@f7.net>
References: <20081021203749.GA14806@phare.normalesup.org>
 <200810212354.m9LNsJa26512@f7.net>
Message-ID: <6faad9f00810211749r293c6816lf6dd74b919f99cc5@mail.gmail.com>

On Wed, Oct 22, 2008 at 1:54 AM, Karl Berry wrote:
>    Just to settle the matter, the late Bernard Gaulle's french package
>    sets a rather standard 2/3 default ...
>
> Thanks.  I've made Texinfo do that (txi-fr.tex) and changed
> hyphen-french.tlpsrc for TeX Live.
>
> I suggest putting the information in the hyph-utf8 file(s), too.
> 0/0 isn't useful.

I understood that 1/1 is also possible from someone's (maybe Arthur's)
earlier posts. No limits were set in the previous version, with
which we wanted to remain compatible, but I'll fix that.

Mojca

From arthur.reutenauer at normalesup.org Wed Oct 22 20:25:58 2008
From: arthur.reutenauer at normalesup.org (Arthur Reutenauer)
Date: Wed, 22 Oct 2008 20:25:58 +0200
Subject: [tex-hyphen] no hyphenmin for French?
In-Reply-To: <6faad9f00810211749r293c6816lf6dd74b919f99cc5@mail.gmail.com>
References: <20081021203749.GA14806@phare.normalesup.org>
 <200810212354.m9LNsJa26512@f7.net>
 <6faad9f00810211749r293c6816lf6dd74b919f99cc5@mail.gmail.com>
Message-ID: <20081022182557.GD29367@phare.normalesup.org>

> I understood that 1/1 is also possible from someone's (maybe Arthur's)
> earlier posts.

Yes (I wouldn't be too sure for 1/1, though, but you definitely see
2/2).  As Taco explained a lot of times, it depends on the type of
document you set, but that doesn't prevent us from recommending default
values.  We should explain how the user / system installer can override
them, though.

Arthur

From arthur.reutenauer at normalesup.org Wed Oct 22 20:30:08 2008
From: arthur.reutenauer at normalesup.org (Arthur Reutenauer)
Date: Wed, 22 Oct 2008 20:30:08 +0200
Subject: [tex-hyphen] no hyphenmin for French?
In-Reply-To: <200810212354.m9LNsJa26512@f7.net>
References: <20081021203749.GA14806@phare.normalesup.org>
 <200810212354.m9LNsJa26512@f7.net>
Message-ID: <20081022183008.GE29367@phare.normalesup.org>

> Thanks.  I've made Texinfo do that (txi-fr.tex) and changed
> hyphen-french.tlpsrc for TeX Live.

Note that the french package (!= Babel's frenchb) has given rise to a
lot of controversy, but in that case I don't think there can be too
much discussion.

Arthur
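
On the point raised earlier in the thread, that users should be told
how to override the default hyphenmin values, here is a minimal plain
TeX sketch.  It assumes a TeX 3 or later engine with French patterns
already selected as the current language.  The 2/3 values are the ones
mentioned above; the 2/2 narrow-column setting, the \hsize and the
sample text are only illustrative.

```tex
% Plain TeX sketch: \lefthyphenmin and \righthyphenmin are ordinary
% integer parameters, so a document can override the format's defaults.
\lefthyphenmin=2   % at least 2 letters before the first break point
\righthyphenmin=3  % at least 3 letters after the last break point

A paragraph typeset with the 2/3 setting ... \par

% Looser (assumed) setting for a narrow column.  The values are set
% before the paragraph starts, and the paragraph ends (\par) inside the
% group, so the earlier values are restored afterwards.
{\hsize=4cm \lefthyphenmin=2 \righthyphenmin=2
 \noindent A paragraph typeset in a narrow column ... \par}
\bye
```

Macro packages that switch languages may reset both parameters at each
switch, in which case the override has to go through that package's
own mechanism instead.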