From elie.roux at telecom-bretagne.eu Fri Jan 20 09:24:55 2017
From: elie.roux at telecom-bretagne.eu (Élie Roux)
Date: Fri, 20 Jan 2017 09:24:55 +0100
Subject: [tex-hyphen] Latin patterns update
Message-ID:

Dear All,

We've made some significant update in the Latin liturgical patterns last
year and things start to become quite stable. Can you please merge

https://github.com/gregorio-project/hyphen-la/blob/master/patterns/hyph.la.liturgical.txt

into tex-hyphen?

I can make a pull request on hyphenation/tex-hyphen if it's easier for you.

Thank you,
--
Elie

From arthur.reutenauer at normalesup.org Fri Jan 20 09:34:23 2017
From: arthur.reutenauer at normalesup.org (Arthur Reutenauer)
Date: Fri, 20 Jan 2017 09:34:23 +0100
Subject: [tex-hyphen] Latin patterns update
In-Reply-To:
References:
Message-ID: <20170120083423.GO147993@phare.normalesup.org>

> I can make a pull request on hyphenation/tex-hyphen if it's easier for you.

Yes, please.

Best,

Arthur

From arthur.reutenauer at normalesup.org Thu Jan 26 13:51:56 2017
From: arthur.reutenauer at normalesup.org (Arthur Reutenauer)
Date: Thu, 26 Jan 2017 13:51:56 +0100
Subject: [tex-hyphen] [tex-live] hyph-ru.tex is faulty.
In-Reply-To: <19pyf3hziritz$.dlg@nililand.de>
References: <1vasy7ulsftd1.dlg@nililand.de> <20170125221325.GB147993@phare.normalesup.org> <22665.12697.445240.778135@zaphod.ms25.net> <20170126105700.GG147993@phare.normalesup.org> <19pyf3hziritz$.dlg@nililand.de>
Message-ID: <20170126125156.GH147993@phare.normalesup.org>

(Moving discussion from the TeX Live list to TeX-hyphen, please reply
there.)

> Did you add patterns for all combining accents as I mentioned in one
> of the comments?

Not yet, but it's on our list: https://github.com/hyphenation/tex-hyphen/issues/5
I need to figure out the best way to do it: should we input the full
list of all combining characters for every language, or only those
diacritic signs that are relevant for each language?
The former option may seem like less work but we need to make sure that
the accents don't interact with the existing patterns (for all the
languages), and ensure that it stays so in the future. If for example
someone comes up with a pattern set for Russian that does take the
combining acute accent into account, having a default list of patterns
with accents may be self-defeating.

Best,

Arthur

From d.p.carlisle at gmail.com Thu Jan 26 14:16:43 2017
From: d.p.carlisle at gmail.com (David Carlisle)
Date: Thu, 26 Jan 2017 13:16:43 +0000
Subject: [tex-hyphen] [tex-live] hyph-ru.tex is faulty.
In-Reply-To: <20170126125156.GH147993@phare.normalesup.org>
References: <1vasy7ulsftd1.dlg@nililand.de> <20170125221325.GB147993@phare.normalesup.org> <22665.12697.445240.778135@zaphod.ms25.net> <20170126105700.GG147993@phare.normalesup.org> <19pyf3hziritz$.dlg@nililand.de> <20170126125156.GH147993@phare.normalesup.org>
Message-ID:

A related issue came up for latex as we set up the formats for
2017/01/01 release defaulting to Unicode encoding for the first time,
should we default to NFC normalisation (using the xetex primitive and
some lua callback) which would go some way to
avoiding the need to deal with combining accents in the patterns?

we didn't do that this time for fear of clashing with existing code
but if this issue is going to keep coming up it might be good to look
at this again...

David

On 26 January 2017 at 12:51, Arthur Reutenauer wrote:
> (Moving discussion from the TeX Live list to TeX-hyphen, please reply
> there.)
>
>> Did you add patterns for all combining accents as I mentioned in one
>> of the comments?
>
> Not yet, but it's on our list: https://github.com/hyphenation/tex-hyphen/issues/5
> I need to figure out the best way to do it: should we input the full
> list of all combining characters for every language, or only those
> diacritic signs that are relevant for each language?
> The former option may seem like less work but we need to make sure that
> the accents don't interact with the existing patterns (for all the
> languages), and ensure that it stays so in the future. If for example
> someone comes up with a pattern set for Russian that does take the
> combining acute accent into account, having a default list of patterns
> with accents may be self-defeating.
>
> Best,
>
> Arthur

From arthur.reutenauer at normalesup.org Thu Jan 26 15:08:47 2017
From: arthur.reutenauer at normalesup.org (Arthur Reutenauer)
Date: Thu, 26 Jan 2017 15:08:47 +0100
Subject: [tex-hyphen] [tex-live] hyph-ru.tex is faulty.
In-Reply-To:
References: <1vasy7ulsftd1.dlg@nililand.de> <20170125221325.GB147993@phare.normalesup.org> <22665.12697.445240.778135@zaphod.ms25.net> <20170126105700.GG147993@phare.normalesup.org> <19pyf3hziritz$.dlg@nililand.de> <20170126125156.GH147993@phare.normalesup.org>
Message-ID: <20170126140847.GI147993@phare.normalesup.org>

> A related issue came up for latex as we set up the formats for
> 2017/01/01 release defaulting to Unicode encoding for the first time,
> should we default to NFC normalisation (using the xetex primitive and
> some lua callback) which would go some way to
> avoiding the need to deal with combining accents in the patterns?

This won't help in this case :-) There are no precomposed characters
in Unicode for Cyrillic letters with acute accent, which is what we'd
need in this instance (the acute accent is only used in very specific
contexts in Russian to mark stress, for example in some dictionaries,
and texts for teaching Russian as a second language, which was the use
case discussed in November).

> we didn't do that this time for fear of clashing with existing code
> but if this issue is going to keep coming up it might be good to look
> at this again...

I'd really advocate against imposing NFC in the formats.
Actually, on the long run I think NFD makes more sense, but that's
probably several more years down the line :-)

Best,

Arthur

From d.p.carlisle at gmail.com Thu Jan 26 15:18:02 2017
From: d.p.carlisle at gmail.com (David Carlisle)
Date: Thu, 26 Jan 2017 14:18:02 +0000
Subject: [tex-hyphen] [tex-live] hyph-ru.tex is faulty.
In-Reply-To: <20170126140847.GI147993@phare.normalesup.org>
References: <1vasy7ulsftd1.dlg@nililand.de> <20170125221325.GB147993@phare.normalesup.org> <22665.12697.445240.778135@zaphod.ms25.net> <20170126105700.GG147993@phare.normalesup.org> <19pyf3hziritz$.dlg@nililand.de> <20170126125156.GH147993@phare.normalesup.org> <20170126140847.GI147993@phare.normalesup.org>
Message-ID:

On 26 January 2017 at 14:08, Arthur Reutenauer wrote:
>> A related issue came up for latex as we set up the formats for
>> 2017/01/01 release defaulting to Unicode encoding for the first time,
>> should we default to NFC normalisation (using the xetex primitive and
>> some lua callback) which would go some way to
>> avoiding the need to deal with combining accents in the patterns?
>
> This won't help in this case :-) There are no precomposed characters
> in Unicode for Cyrillic letters with acute accent,

Which I knew at some point since Christmas as that was one argument
I used against adding normalisation, that it only avoided the problem
for some subset of languages..
But missed that just now, sorry:-)

> which is what we'd
> need in this instance (the acute accent is only used in very specific
> contexts in Russian to mark stress, for example in some dictionaries,
> and texts for teaching Russian as a second language, which was the use
> case discussed in November).
>
>> we didn't do that this time for fear of clashing with existing code
>> but if this issue is going to keep coming up it might be good to look
>> at this again...
>
> I'd really advocate against imposing NFC in the formats.
> Actually, on the long run I think NFD makes more sense, but that's
> probably several more years down the line :-)

Yes NFD is in a way more consistent I'd agree. Anyway thanks for
the confirmation that we are best not touching normalisation at this point

>
> Best,
>
> Arthur

David

From arthur.reutenauer at normalesup.org Thu Jan 26 15:52:26 2017
From: arthur.reutenauer at normalesup.org (Arthur Reutenauer)
Date: Thu, 26 Jan 2017 15:52:26 +0100
Subject: [tex-hyphen] [tex-live] hyph-ru.tex is faulty.
In-Reply-To:
References: <1vasy7ulsftd1.dlg@nililand.de> <20170125221325.GB147993@phare.normalesup.org> <22665.12697.445240.778135@zaphod.ms25.net> <20170126105700.GG147993@phare.normalesup.org> <19pyf3hziritz$.dlg@nililand.de> <20170126125156.GH147993@phare.normalesup.org> <20170126140847.GI147993@phare.normalesup.org>
Message-ID: <20170126145226.GK147993@phare.normalesup.org>

> Which I knew at some point since Christmas as that was one argument
> I used against adding normalisation, that it only avoided the problem
> for some subset of languages..
> But missed that just now, sorry:-)

That's funny.

> Yes NFD is in a way more consistent I'd agree. Anyway thanks for
> the confirmation that we are best not touching normalisation at this point

Yes please, don't go down that route.

Best,

Arthur
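
The point made in this thread, that NFC normalisation would not remove
the combining acute accent from stressed Russian vowels, can be checked
with a small sketch. It is not from the original messages: it uses
Python's standard unicodedata module rather than the xetex primitive or
a lua callback mentioned above, and the helper name nfc_length is
hypothetical. It compares Latin "e" with Cyrillic "а" (U+0430), each
followed by the combining acute accent U+0301.

```python
import unicodedata

COMBINING_ACUTE = "\u0301"  # COMBINING ACUTE ACCENT


def nfc_length(base: str) -> int:
    """Hypothetical helper: number of code points left after NFC
    normalisation of `base` followed by a combining acute accent."""
    return len(unicodedata.normalize("NFC", base + COMBINING_ACUTE))


# Latin e + acute has a precomposed form (U+00E9), so NFC merges the pair.
latin_len = nfc_length("e")

# Cyrillic a (U+0430) + acute has no precomposed form, so NFC leaves the
# combining accent in place and the patterns would still have to cope with it.
cyrillic_len = nfc_length("\u0430")

# NFD goes the other way: the precomposed U+00E9 is split into two code points.
nfd_len = len(unicodedata.normalize("NFD", "\u00e9"))

print(latin_len, cyrillic_len, nfd_len)  # prints: 1 2 2
```

Under NFD every accented letter ends up as a base letter plus combining
marks, which is one way to read the remark that NFD is "in a way more
consistent".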