[latex3-commits] [git/LaTeX3-latex3-babel] master: Marathi (frenchspacing). Some transforms and Uyghur improved. (ef83ac4)

Tue Apr 6 17:56:53 CEST 2021

Repository : https://github.com/latex3/babel
On branch  : master
Link       : https://github.com/latex3/babel/commit/ef83ac40fc5d9f32de65917e5927b34387bc7b0b

>---------------------------------------------------------------

commit ef83ac40fc5d9f32de65917e5927b34387bc7b0b
Author: Javier <email at localhost>
Date:   Tue Apr 6 17:56:53 2021 +0200

    Marathi (frenchspacing). Some transforms and Uyghur improved.


>---------------------------------------------------------------

ef83ac40fc5d9f32de65917e5927b34387bc7b0b
 README.md                                          |   7 +-
 babel.dtx                                          | 107 ++++++++++++---------
 babel.ins                                          |   2 +-
 babel.pdf                                          | Bin 825254 -> 826418 bytes
 bbcompat.dtx                                       |   2 +-
 locale/hi/babel-hi.ini                             |   2 +-
 locale/hu/babel-hu.ini                             |  21 ++--
 .../hu/{babel-hungarian.tex => babel-magyar.tex}   |   0
 locale/mr/babel-mr.ini                             |   2 +-
 locale/ug/babel-uyghur.tex                         |  36 ++++---
 news-guides/guides/keys-in-ini-files.md            |  99 +++++++++++++++++--
 news-guides/news/whats-new-in-babel-3.56.md        |  12 +--
 news-guides/news/whats-new-in-babel-3.57.md        |   5 +-
 13 files changed, 205 insertions(+), 90 deletions(-)

diff --git a/README.md b/README.md
index 94073a8..c661212 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-## Babel 3.56.2332
+## Babel 3.56.2334
 
 This package manages culturally-determined typographical (and other)
 rules, and hyphenation patterns for a wide range of languages. Many
@@ -47,14 +47,15 @@ respective authors.
 ### Summary of Latest changes
 ```
 3.57   2021-04-08??
-       * Transforms:
+       * Predefined transforms (lua):
          - Arabic:     transliteration.dad
          - Croatian:   digraphs.ligatures
          - Greek:      diaeresis.hyphen
          - Hindi:      transliteration.hk
          - Hungarian:  digraphs.hyphen
-       * {xxxx} syntax also in string=.
+       * Transforms: \babel{xxxx} syntax also in string=.
        * Preliminary code for Uyghur hyphenation (lua).
+       * magyar as alternative to hungarian in \babelprovide.
          
 3.56   2021-03-24
        * Transforms (\babelprehyphenation, \babelposthyphenation)
diff --git a/babel.dtx b/babel.dtx
index 94fe9b1..1d4b510 100644
--- a/babel.dtx
+++ b/babel.dtx
@@ -31,7 +31,7 @@
 %
 % \iffalse
 %<*filedriver>
-\ProvidesFile{babel.dtx}[2021/04/04 v3.56.2332 The Babel package]
+\ProvidesFile{babel.dtx}[2021/04/06 v3.56.2334 The Babel package]
 \documentclass{ltxdoc}
 \GetFileInfo{babel.dtx}
 \usepackage{fontspec}
@@ -3018,7 +3018,54 @@ but not the same, as those in Unicode.}
 
 It currently embraces |\babelprehyphenation| and
 |\babelposthyphenation|, which have been available for several months.
-\New{3.56} In this version they can be defined in |ini| files, too.
+
+\New{3.57} Several \textsf{ini} files predefine some transforms. They
+are activated with the key |transforms| in |\babelprovide|, either if
+the locale is being defined with it or the language has been previously
+loaded as a class or package option, as the following example
+illustrates:
+\begin{verbatim}
+  \usepackage[magyar]{babel}
+  \babelprovide[_transforms = digraphs.hyphen_]{magyar}
+\end{verbatim}
+
+Here are the transforms currently predefined. (More to follow 
+in future releases.)
+
+\begingroup
+\def\trans#1#2#3{%
+  \vspace{1mm}%
+  \parbox[t]{2.4cm}{\strut#1}%
+  \parbox[t]{4.2cm}{\strut\ttfamily#2}%
+  \parbox[t]{6.6cm}{\strut#3}\par}
+\bigskip\hrule\nobreak\vspace{1mm}
+% \strut\hfil Transforms 
+% \medskip\hrule\nobreak
+
+\trans{Arabic}{transliteration.dad}{Applies the transliteration system
+devised by Yannis Haralambous for \textsf{dad} (simple and
+\TeX-friendly). Not yet complete, but sufficient for most texts.}
+
+\trans{Croatian}{digraphs.ligatures}{Ligatures \textit{DŽ}, \textit{Dž},
+\textit{dž}, \textit{LJ}, \textit{Lj}, \textit{lj}, \textit{NJ},
+\textit{Nj}, \textit{nj}. It assumes they exist. This is not the
+recommended way to make these transformations (the best way is with
+OTF features), but it can get you out of a hurry.}
+
+\trans{Greek}{diaeresis.hyphen}{Removes the diaeresis above iota and
+upsilon if hyphenated just before. It works with the
+three variants.}
+
+\trans{Hindi}{transliteration.hk}{The Harvard-Kyoto system to romanize
+Devanagari.}
+
+\trans{Hungarian}{digraphs.hyphen}{Hyphenates the long digraphs
+\textit{ccs}, \textit{ddz}, \textit{ggy}, \textit{lly}, \textit{nny},
+\textit{ssz}, \textit{tty} and \textit{zzs} as \textit{cs-cs},
+\textit{dz-dz}, etc.}
+
+\vspace{2mm}\hrule\nobreak
+\endgroup
 
 \Describe{\babelposthyphenation}{\marg{hyphenrules-name}%
           \marg{lua-pattern}\marg{replacement}}
@@ -3062,10 +3109,11 @@ future implementation may alternatively accept \textsf{lpeg}.
           \marg{lua-pattern}\marg{replacement}}
 
 \New{3.44-3-52} It is similar to the latter, but (as its name implies)
-applied before hyphenation. There are other differences: (1) the first
-argument is the locale instead the name of hyphenation patterns; (2) in
-the search patterns |=| has no special meaning, while \verb+|+ stands
-for an ordinary space; (3) in the replacement, discretionaries are not
+applied before hyphenation, which is particularly useful in
+transliterations. There are other differences: (1) the first argument
+is the locale instead of the name of the hyphenation patterns; (2) in the
+search patterns |=| has no special meaning, while \verb+|+ stands for
+an ordinary space; (3) in the replacement, discretionaries are not
 accepted.
 
 It handles glyphs and spaces.
@@ -3092,48 +3140,13 @@ This feature is activated with the first |\babelposthyphenation| or
   end of a line:
 \begin{verbatim}
 \babelprehyphenation{english}{|a|}
-  {}, {},                     % Keep first space and a
-  {insert, penalty = 10000},  % Insert penalty
-  {}                          % Keep last space
+  {}, {},                       % Keep first space and a
+  { insert, penalty = 10000 },  % Insert penalty
+  {}                            % Keep last space
 }
 \end{verbatim}
 \end{example}
 
-\begingroup
-\def\trans#1#2#3{%
-  \vspace{1mm}%
-  \parbox[t]{2.5cm}{\strut#1}%
-  \parbox[t]{4cm}{\strut\ttfamily#2}%
-  \parbox[t]{6cm}{\strut#3}\par}
-\bigskip\hrule\nobreak\medskip
-\strut Transforms 
-\medskip\hrule\nobreak
-
-\trans{Arabic}{transliteration.dad}{Applies the transliteration system
-devised by Yannis Haralambous for \textsf{dad}. Not yet complete, but
-sufficient for many texts.}
-
-\trans{Croatian}{digraphs.ligatures}{Ligatures \textit{DŽ}, \textit{Dž},
-\textit{dž}, \textit{LJ}, \textit{Lj}, \textit{lj}, \textit{NJ},
-\textit{Nj}, \textit{nj}. It assumes they exist. This is not the
-recommended way to make these transformations (the best way is with
-OTF features), but it can get you out of a hurry.}
-
-\trans{Greek}{diaeresis.hyphen}{Removes the diaeresis above iota and
-upsilon if hyphenated just before. It works with the
-three variants.}
-
-\trans{Hindi}{transliteration.hk}{The Harvard-Kyoto system to romanize
-Devanagari.}
-
-\trans{Hungarian}{digraphs.hyphen}{Hyphenates the groups
-\textit{ccs}, \textit{ddz}, \textit{ggy}, \textit{lly}, \textit{nny},
-\textit{ssz}, \textit{tty} and \textit{zzs} as \textit{cs-cs},
-\textit{dz-dz}, etc.}
-
-\vspace{1mm}\hrule\nobreak
-\endgroup
-
 \subsection{Selection based on BCP 47 tags}
 \label{bcp47}
 
@@ -4897,8 +4910,8 @@ help from Bernd Raichle, for which I am grateful.
 % \section{Tools}
 %
 %    \begin{macrocode}
-%<<version=3.56.2332>>
-%<<date=2021/04/04>>
+%<<version=3.56.2334>>
+%<<date=2021/04/06>>
 %    \end{macrocode}
 %
 % \textbf{Do not use the following macros in \texttt{ldf} files. They
@@ -13396,7 +13409,7 @@ help from Bernd Raichle, for which I am grateful.
   Babel.linebreaking = Babel.linebreaking or {}
   Babel.linebreaking.before = {}
   Babel.linebreaking.after = {}
-  Babel.locale = {} % Free to use, indexed with \localeid
+  Babel.locale = {} % Free to use, indexed by \localeid
   function Babel.linebreaking.add_before(func)
     tex.print([[\noexpand\csname bbl@luahyphenate\endcsname]])
     table.insert(Babel.linebreaking.before, func)
diff --git a/babel.ins b/babel.ins
index 94faf75..81860a2 100644
--- a/babel.ins
+++ b/babel.ins
@@ -26,7 +26,7 @@
 %% and covered by LPPL is defined by the unpacking scripts (with
 %% extension .ins) which are part of the distribution.
 %%
-\def\filedate{2021/04/04}
+\def\filedate{2021/04/06}
 \def\batchfile{babel.ins}
 \input docstrip.tex
 
diff --git a/babel.pdf b/babel.pdf
index f8b6d5c..48c8543 100644
Binary files a/babel.pdf and b/babel.pdf differ
diff --git a/bbcompat.dtx b/bbcompat.dtx
index 2d699ea..3dadceb 100644
--- a/bbcompat.dtx
+++ b/bbcompat.dtx
@@ -30,7 +30,7 @@
 %
 % \iffalse
 %<*dtx>
-\ProvidesFile{bbcompat.dtx}[2021/04/04 v3.56.2332]
+\ProvidesFile{bbcompat.dtx}[2021/04/06 v3.56.2334]
 %</dtx>
 %
 %% File 'bbcompat.dtx'
diff --git a/locale/hi/babel-hi.ini b/locale/hi/babel-hi.ini
index a99f3c2..7deaf5f 100644
--- a/locale/hi/babel-hi.ini
+++ b/locale/hi/babel-hi.ini
@@ -238,7 +238,7 @@ transliteration.hk.9.2  =   { string = ^^^^094d{1} }
 transliteration.hk.10.0 = { [{0915}-{0939}]([{0915}-{0939}]) }
 transliteration.hk.10.1 =   {}
 transliteration.hk.10.2 =   { string = ^^^^094d{1} }
-; Implicit a
+; Inherent a
 transliteration.hk.11.0 = { [{0915}-{0939}]{0905} }
 transliteration.hk.11.1 =   {}
 transliteration.hk.11.2 =   { remove }
diff --git a/locale/hu/babel-hu.ini b/locale/hu/babel-hu.ini
index c156e61..2039f40 100644
--- a/locale/hu/babel-hu.ini
+++ b/locale/hu/babel-hu.ini
@@ -194,10 +194,19 @@ superscriptingExponent = ×
 [counters]
 
 [transforms.posthyphenation]
-digraphs.hyphen.1.0 = { ()([cz])(){1}s }
-digraphs.hyphen.1.1 = { no = {1}, pre = {1}s- }
-digraphs.hyphen.2.0 = { ()([ds])(){1}z }
-digraphs.hyphen.2.1 = { no = {1}, pre = {1}z- }
-digraphs.hyphen.3.0 = { ()([glnt])(){1}y }
-digraphs.hyphen.3.1 = { no = {1}, pre = {1}y- }
+digraphs.hyphen.1.0 = { ([czCZ])|{1}([sS]) }
+digraphs.hyphen.1.1 = {} 
+digraphs.hyphen.1.2 = { pre = {2}-, data = 1 }
+digraphs.hyphen.1.3 = {}
+digraphs.hyphen.1.4 = {}
+digraphs.hyphen.2.0 = { ([dsDS])|{1}([zZ]) }
+digraphs.hyphen.2.1 = {} 
+digraphs.hyphen.2.2 = { pre = {2}-, data = 1 }
+digraphs.hyphen.2.3 = {}
+digraphs.hyphen.2.4 = {}
+digraphs.hyphen.3.0 = { ([glntGLNT])|{1}([yY]) }
+digraphs.hyphen.3.1 = {} 
+digraphs.hyphen.3.2 = { pre = {2}-, data = 1 }
+digraphs.hyphen.3.3 = {}
+digraphs.hyphen.3.4 = {}
 
diff --git a/locale/hu/babel-hungarian.tex b/locale/hu/babel-magyar.tex
similarity index 100%
copy from locale/hu/babel-hungarian.tex
copy to locale/hu/babel-magyar.tex
diff --git a/locale/mr/babel-mr.ini b/locale/mr/babel-mr.ini
index 1178e56..55d9826 100644
--- a/locale/mr/babel-mr.ini
+++ b/locale/mr/babel-mr.ini
@@ -113,7 +113,7 @@ time.medium = [h]:[mm]:[ss] [a]
 time.short = [h]:[mm] [a]
 
 [typography]
-frenchspacing = yes
+frenchspacing = no
 hyphenrules = marathi
 lefthyphenmin = 2
 righthyphenmin = 2
diff --git a/locale/ug/babel-uyghur.tex b/locale/ug/babel-uyghur.tex
index d3df2d2..d3ee2ea 100644
--- a/locale/ug/babel-uyghur.tex
+++ b/locale/ug/babel-uyghur.tex
@@ -11,49 +11,54 @@
 }
 
 \newattribute\bblug@disc
-\bblug@disc=0
+\bblug@disc=-1
 
 \bbl@luahyphenate
 
-\directlua{
+% 1) Store discretionaries just after hyphenation as an attribute of the
+% next glyph, with the value of the disc penalty (assumed positive). Then
+% remove the discretionary. 
+% 2) After the shaping, restore the discretionaries.
 
-Babel.uyghur = Babel.uyghur or {}
+\directlua{
+Babel.locale[\the\localeid] = {}
+local ug = Babel.locale[\the\localeid]
 
-function Babel.uyghur.posthyphen(head)
+function ug.posthyphen(head)
   local UGDISC = luatexbase.registernumber'bblug@disc'
   for item in node.traverse(head) do
     if item.id == 7 and item.subtype == 3 and
         item.next and item.next.id == 29 and
         item.next.lang == \the\l@uyghur\space then
-      node.set_attribute(item.next, UGDISC, 1)
+      node.set_attribute(item.next, UGDISC, item.penalty)
       node.remove(head, item)
     end
   end
 end
 
-Babel.uyghur.hyphen_sep = .09   % in em units
+ug.hyphen_sep = .09   % in em units
 % Note it can be a string, with several characters:
-Babel.uyghur.hyphen = unicode.utf8.char(0x0640)
+ug.hyphen = unicode.utf8.char(0x0640)
 
-Babel.linebreaking.add_after(Babel.uyghur.posthyphen)
+Babel.linebreaking.add_after(ug.posthyphen)
 
-function Babel.uyghur.hyphenate(head) 
+function ug.hyphenate(head) 
   local d, k
   local quad = 655360
   local UGDISC = luatexbase.registernumber'bblug@disc'
   for item in node.traverse(head) do
     if item.id == 29 and item.lang == \the\l@uyghur\space then
       local ugdisc = node.get_attribute(item, UGDISC)
-      if ugdisc > 0 then    
+      if ugdisc >= 0 then    
         quad = font.getfont(item.font).size or quad
         k = node.new(13, 1)  % (kern, userkern)
-        k.kern = Babel.uyghur.hyphen_sep * quad
+        k.kern = ug.hyphen_sep * quad
         d = node.new(7, 3)   % (disc, regular)
         d.pre = Babel.str_to_nodes(
-                      function() return Babel.uyghur.hyphen end, 
+                      function() return ug.hyphen end, 
                       nil, item)
         d.pre = node.insert_before(d.pre, d.pre, k)
-        d.penalty = 50 % Must be tex.(ex)hyphenpenalty
+        d.penalty = ugdisc
         head = node.insert_before(head, item, d)
       end
     end
@@ -62,10 +67,9 @@ function Babel.uyghur.hyphenate(head)
 end
 
 luatexbase.add_to_callback("pre_linebreak_filter",
-  Babel.uyghur.hyphenate, "Babel.uyghur.hyphenate")
+  ug.hyphenate, "Babel.locale.uyghur.hyphenate")
 luatexbase.add_to_callback("hpack_filter",
-  Babel.uyghur.hyphenate, "Babel.uyghur.hyphenate")
-  
+  ug.hyphenate, "Babel.locale.uyghur.hyphenate")
 }
 
 \endinput
\ No newline at end of file
diff --git a/news-guides/guides/keys-in-ini-files.md b/news-guides/guides/keys-in-ini-files.md
index f2926a2..95b0cfa 100644
--- a/news-guides/guides/keys-in-ini-files.md
+++ b/news-guides/guides/keys-in-ini-files.md
@@ -3,6 +3,46 @@
 (*Under development.*)
 
 Many keys are related to the CLDR (Common Locale Data Repository).
+Others are just the TeX primitives with the same name.
+
+### `identification`
+
+Most of them are self-explanatory.
+
+**charset** The charset in the `ini` file (currently must be `utf8`).
+
+**tag.bcp47** May include, if appropriate, language, script and region.
+  Usually only the language.
+
+**language.tag.bcp47** The language part.
+
+**tag.bcp47.likely** The likely full tag. See [Likely Subtags (CLDR)](https://unicode-org.github.io/cldr-staging/charts/latest/supplemental/likely_subtags.html)
+
+**tag.opentype** 
+
+**script.name**
+
+**script.tag.bcp47**
+
+**script.tag.opentype**
+
+**level** `ini` files are based on a set of keys. The level is much like a
+  ‘version’ of the list of available keys. Currently it is 1, and it will
+  stay so until there is some significant change.
+
+**derivate** Not yet used, but its purpose is to identify whether the file
+  is the original one distributed with `babel` or a derivative (for
+  example, a publishing house may want to define its own files).
+
+**encodings** A mostly informative field for 8-bit engines requiring
+  font encodings (`T1`, `LGR`, etc.)
+
+### `captions`
+
+The `.licr` subsections are used in 8-bit engines. The final `name` is
+added by `babel`.
+
+### `date`
 
 Here are some explanations for dates:
 
@@ -17,17 +57,64 @@ calendar, such as in English for days of the week:
 
 > S M T W T F S
 
-About **exemplarCharacters**:
+### `typography`
 
-It can help to recognize a language. This list and the punctuation
-list are currently not used by `babel`.
+**frenchspacing** (`yes` or `no`) Enable or disable `\frenchspacing`
 
-About numbers:
+**hyphenrules** As named in `language.dat`.
 
-See
+**lefthyphenmin** `\lefthyphenmin`
+
+**righthyphenmin** `\righthyphenmin`
+
+**hyphenchar** The hyphenation char (number). Empty for the default. 0
+  if there is no hyphen (e.g., Thai).
 
-http://cldr.unicode.org/translation/numbering-systems 
+**prehyphenchar** Not yet used (`luatex`).
+
+**posthyphenchar** Not yet used (`luatex`).
+
+**exhyphenchar** Not yet used (`luatex`)
+
+**preexhyphenchar** Not yet used (`luatex`)
+
+**postexhyphenchar** Not yet used (`luatex`)
+
+**hyphenationmin**  Not yet used (`luatex`), but it will be soon.
+
+**hyphenate.other.locale** (Tentative syntax.) A few hyphenation
+  patterns require setting some chars to `other`. This one is based on
+  the language.
+
+**hyphenate.other.script** (Tentative syntax.) Same, based on the
+  script.
+  
+### `labels`
+
+Under development:
+
+https://github.com/latex3/babel/blob/master/news-guides/news/whats-new-in-babel-3.48.md
+
+### `characters`
+
+See the CLDR. For example, [Exemplar
+Characters](http://cldr.unicode.org/translation/-core-data/exemplars)
+can help to recognize a language. This list and the punctuation list
+are currently not used by `babel`.
+
+### `numbers`
+
+See [Numbering systems](http://cldr.unicode.org/translation/-core-data/numbering-systems)
 
 The section about numbers may be used by some package to format
 numbers (or even `babel` itself in the future). They reflect local traditional
 usage, not the international one set by either the SI or ISO 80000.
+
+### `counters`
+
+See https://tex.stackexchange.com/questions/529813/how-to-define-counters-with-arbitrary-alphabet/530491#530491
+
+### `transforms`
+
+See
+[What's new in babel 3.56](https://github.com/latex3/babel/blob/master/news-guides/news/whats-new-in-babel-3.56.md#transforms-in-ini-files)
diff --git a/news-guides/news/whats-new-in-babel-3.56.md b/news-guides/news/whats-new-in-babel-3.56.md
index 5c36591..efbf294 100644
--- a/news-guides/news/whats-new-in-babel-3.56.md
+++ b/news-guides/news/whats-new-in-babel-3.56.md
@@ -24,13 +24,13 @@ from with `data`:
 \babelprehyphenation{french}{ «{a} }{
   {},
   { insert, penalty = 10000 }, 
-  { insert, space=.2 .05 0, data = 1 },
+  { insert, space= .2 .05 0, data = 1 },
   {}
 }
 \babelprehyphenation{french}{ «|{a} }{
   {},
   { insert, penalty = 10000 },
-  { space=.2 .05 0, data = 1 },
+  { space= .2 .05 0, data = 1 },
   {}
 }
 ```
@@ -40,7 +40,7 @@ then matches):
 ```tex
 \babelprehyphenation{french}{ «{a} }{
   {},
-  { insert, space=.2 .05 0, data = 1 },
+  { insert, space= .2 .05 0, data = 1 },
   {}
 }
 ```
@@ -52,7 +52,7 @@ word separation in the font.
 \babelprehyphenation{french}{ «{a} }{
   {}, 
   { insert, penalty = 10000 }, 
-  { insert, spacefactor=.8 .3 .8, data = 1 },
+  { insert, spacefactor= .8 .3 .8, data = 1 },
   {}
 }
 ```
@@ -100,8 +100,8 @@ example, `%`). Just write the hex code with at least 4 ‘hex digits’.
 For example, `{d}{0025}` matches a digit followed by a `%`.
 
 Remember you can still enter characters with the good old `^^` syntax,
-which is converted at the TeX level; this extension is handled by lua
-directly, so catcodes are not relevant.
+which is converted at the TeX level; this `{}` extension is handled by
+lua directly, so catcodes are not relevant.
 
 ## Fixes
 
diff --git a/news-guides/news/whats-new-in-babel-3.57.md b/news-guides/news/whats-new-in-babel-3.57.md
index f5704ea..0077be4 100644
--- a/news-guides/news/whats-new-in-babel-3.57.md
+++ b/news-guides/news/whats-new-in-babel-3.57.md
@@ -7,8 +7,9 @@
 *Some of them are still experimental or incomplete.*
 
 * **Arabic** `transliteration.dad` ▸ Applies the transliteration system
-devised by Yannis Haralambous for \textsf{dad}. Not yet complete, but
-sufficient for many texts.
+  devised by Yannis Haralambous for
+  [`dad`](http://mirrors.ctan.org/language/arabic/dad/dad-user-guide.pdf).
+  Not yet complete, but sufficient for many texts.
 
 * **Croatian** `digraphs.ligatures` ▸ Ligatures *DŽ*, *Dž*,
 *dž*, *LJ*, *Lj*, *lj*, *NJ*,