[latex3-commits] [git/LaTeX3-latex3-babel] master: Marathi (frenchspacing). Some transforms and Uyghur improved. (ef83ac4)
Javier
email at dante.de
Tue Apr 6 17:56:53 CEST 2021
Repository : https://github.com/latex3/babel
On branch : master
Link : https://github.com/latex3/babel/commit/ef83ac40fc5d9f32de65917e5927b34387bc7b0b
>---------------------------------------------------------------
commit ef83ac40fc5d9f32de65917e5927b34387bc7b0b
Author: Javier <email at localhost>
Date: Tue Apr 6 17:56:53 2021 +0200
Marathi (frenchspacing). Some transforms and Uyghur improved.
>---------------------------------------------------------------
ef83ac40fc5d9f32de65917e5927b34387bc7b0b
README.md | 7 +-
babel.dtx | 107 ++++++++++++---------
babel.ins | 2 +-
babel.pdf | Bin 825254 -> 826418 bytes
bbcompat.dtx | 2 +-
locale/hi/babel-hi.ini | 2 +-
locale/hu/babel-hu.ini | 21 ++--
.../hu/{babel-hungarian.tex => babel-magyar.tex} | 0
locale/mr/babel-mr.ini | 2 +-
locale/ug/babel-uyghur.tex | 36 ++++---
news-guides/guides/keys-in-ini-files.md | 99 +++++++++++++++++--
news-guides/news/whats-new-in-babel-3.56.md | 12 +--
news-guides/news/whats-new-in-babel-3.57.md | 5 +-
13 files changed, 205 insertions(+), 90 deletions(-)
diff --git a/README.md b/README.md
index 94073a8..c661212 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-## Babel 3.56.2332
+## Babel 3.56.2334
This package manages culturally-determined typographical (and other)
rules, and hyphenation patterns for a wide range of languages. Many
@@ -47,14 +47,15 @@ respective authors.
### Summary of Latest changes
```
3.57 2021-04-08??
- * Transforms:
+ * Predefined transforms (lua):
- Arabic: transliteration.dad
- Croatian: digraphs.ligatures
- Greek: diaeresis.hyphen
- Hindi: transliteration.hk
- Hungarian: digraphs.hyphen
- * {xxxx} syntax also in string=.
+ * Transforms: \babel{xxxx} syntax also in string=.
* Preliminary code for Uyghur hyphenation (lua).
+ * magyar as alternative to hungarian in \babelprovide.
3.56 2021-03-24
* Transforms (\babelprehyphenation, \babelposthyphenation)
diff --git a/babel.dtx b/babel.dtx
index 94fe9b1..1d4b510 100644
--- a/babel.dtx
+++ b/babel.dtx
@@ -31,7 +31,7 @@
%
% \iffalse
%<*filedriver>
-\ProvidesFile{babel.dtx}[2021/04/04 v3.56.2332 The Babel package]
+\ProvidesFile{babel.dtx}[2021/04/06 v3.56.2334 The Babel package]
\documentclass{ltxdoc}
\GetFileInfo{babel.dtx}
\usepackage{fontspec}
@@ -3018,7 +3018,54 @@ but not the same, as those in Unicode.}
It currently embraces |\babelprehyphenation| and
|\babelposthyphenation|, which have been available for several months.
-\New{3.56} In this version they can be defined in |ini| files, too.
+
+\New{3.57} Several \textsf{ini} files predefine some transforms. They
+are activated with the key |transforms| in |\babelprovide|, either if
+the locale is being defined with it or the language has been previously
+loaded as a class or package option, as the following example
+illustrates:
+\begin{verbatim}
+ \usepackage[magyar]{babel}
+ \babelprovide[_transforms = digraphs.hyphen_]{magyar}
+\end{verbatim}
+
+Here are the transforms currently predefined. (More to follow
+in future releases.)
+
+\begingroup
+\def\trans#1#2#3{%
+ \vspace{1mm}%
+ \parbox[t]{2.4cm}{\strut#1}%
+ \parbox[t]{4.2cm}{\strut\ttfamily#2}%
+ \parbox[t]{6.6cm}{\strut#3}\par}
+\bigskip\hrule\nobreak\vspace{1mm}
+% \strut\hfil Transforms
+% \medskip\hrule\nobreak
+
+\trans{Arabic}{transliteration.dad}{Applies the transliteration system
+devised by Yannis Haralambous for \textsf{dad} (simple and
+\TeX-friendly). Not yet complete, but sufficient for most texts.}
+
+\trans{Croatian}{digraphs.ligatures}{Ligatures \textit{DŽ}, \textit{Dž},
+\textit{dž}, \textit{LJ}, \textit{Lj}, \textit{lj}, \textit{NJ},
+\textit{Nj}, \textit{nj}. It assumes they exist. This is not the
+recommended way to make these transformations (the best way is with
+OTF features), but it can get you out of a hurry.}
+
+\trans{Greek}{diaeresis.hyphen}{Removes the diaeresis above iota and
+upsilon if hyphenated just before. It works with the
+three variants.}
+
+\trans{Hindi}{transliteration.hk}{The Harvard-Kyoto system to romanize
+Devanagari.}
+
+\trans{Hungarian}{digraphs.hyphen}{Hyphenates the long digraphs
+\textit{ccs}, \textit{ddz}, \textit{ggy}, \textit{lly}, \textit{nny},
+\textit{ssz}, \textit{tty} and \textit{zzs} as \textit{cs-cs},
+\textit{dz-dz}, etc.}
+
+\vspace{2mm}\hrule\nobreak
+\endgroup
\Describe{\babelposthyphenation}{\marg{hyphenrules-name}%
\marg{lua-pattern}\marg{replacement}}
@@ -3062,10 +3109,11 @@ future implementation may alternatively accept \textsf{lpeg}.
\marg{lua-pattern}\marg{replacement}}
\New{3.44-3-52} It is similar to the latter, but (as its name implies)
-applied before hyphenation. There are other differences: (1) the first
-argument is the locale instead the name of hyphenation patterns; (2) in
-the search patterns |=| has no special meaning, while \verb+|+ stands
-for an ordinary space; (3) in the replacement, discretionaries are not
+applied before hyphenation, which is particularly useful in
+transliterations. There are other differences: (1) the first argument
+is the locale instead of the name of the hyphenation patterns; (2) in the
+search patterns |=| has no special meaning, while \verb+|+ stands for
+an ordinary space; (3) in the replacement, discretionaries are not
accepted.
It handles glyphs and spaces.
@@ -3092,48 +3140,13 @@ This feature is activated with the first |\babelposthyphenation| or
end of a line:
\begin{verbatim}
\babelprehyphenation{english}{|a|}
- {}, {}, % Keep first space and a
- {insert, penalty = 10000}, % Insert penalty
- {} % Keep last space
+ {}, {}, % Keep first space and a
+ { insert, penalty = 10000 }, % Insert penalty
+ {} % Keep last space
}
\end{verbatim}
\end{example}
-\begingroup
-\def\trans#1#2#3{%
- \vspace{1mm}%
- \parbox[t]{2.5cm}{\strut#1}%
- \parbox[t]{4cm}{\strut\ttfamily#2}%
- \parbox[t]{6cm}{\strut#3}\par}
-\bigskip\hrule\nobreak\medskip
-\strut Transforms
-\medskip\hrule\nobreak
-
-\trans{Arabic}{transliteration.dad}{Applies the transliteration system
-devised by Yannis Haralambous for \textsf{dad}. Not yet complete, but
-sufficient for many texts.}
-
-\trans{Croatian}{digraphs.ligatures}{Ligatures \textit{DŽ}, \textit{Dž},
-\textit{dž}, \textit{LJ}, \textit{Lj}, \textit{lj}, \textit{NJ},
-\textit{Nj}, \textit{nj}. It assumes they exist. This is not the
-recommended way to make these transformations (the best way is with
-OTF features), but it can get you out of a hurry.}
-
-\trans{Greek}{diaeresis.hyphen}{Removes the diaeresis above iota and
-upsilon if hyphenated just before. It works with the
-three variants.}
-
-\trans{Hindi}{transliteration.hk}{The Harvard-Kyoto system to romanize
-Devanagari.}
-
-\trans{Hungarian}{digraphs.hyphen}{Hyphenates the groups
-\textit{ccs}, \textit{ddz}, \textit{ggy}, \textit{lly}, \textit{nny},
-\textit{ssz}, \textit{tty} and \textit{zzs} as \textit{cs-cs},
-\textit{dz-dz}, etc.}
-
-\vspace{1mm}\hrule\nobreak
-\endgroup
-
\subsection{Selection based on BCP 47 tags}
\label{bcp47}
@@ -4897,8 +4910,8 @@ help from Bernd Raichle, for which I am grateful.
% \section{Tools}
%
% \begin{macrocode}
-%<<version=3.56.2332>>
-%<<date=2021/04/04>>
+%<<version=3.56.2334>>
+%<<date=2021/04/06>>
% \end{macrocode}
%
% \textbf{Do not use the following macros in \texttt{ldf} files. They
@@ -13396,7 +13409,7 @@ help from Bernd Raichle, for which I am grateful.
Babel.linebreaking = Babel.linebreaking or {}
Babel.linebreaking.before = {}
Babel.linebreaking.after = {}
- Babel.locale = {} % Free to use, indexed with \localeid
+ Babel.locale = {} % Free to use, indexed by \localeid
function Babel.linebreaking.add_before(func)
tex.print([[\noexpand\csname bbl@luahyphenate\endcsname]])
table.insert(Babel.linebreaking.before, func)
diff --git a/babel.ins b/babel.ins
index 94faf75..81860a2 100644
--- a/babel.ins
+++ b/babel.ins
@@ -26,7 +26,7 @@
%% and covered by LPPL is defined by the unpacking scripts (with
%% extension .ins) which are part of the distribution.
%%
-\def\filedate{2021/04/04}
+\def\filedate{2021/04/06}
\def\batchfile{babel.ins}
\input docstrip.tex
diff --git a/babel.pdf b/babel.pdf
index f8b6d5c..48c8543 100644
Binary files a/babel.pdf and b/babel.pdf differ
diff --git a/bbcompat.dtx b/bbcompat.dtx
index 2d699ea..3dadceb 100644
--- a/bbcompat.dtx
+++ b/bbcompat.dtx
@@ -30,7 +30,7 @@
%
% \iffalse
%<*dtx>
-\ProvidesFile{bbcompat.dtx}[2021/04/04 v3.56.2332]
+\ProvidesFile{bbcompat.dtx}[2021/04/06 v3.56.2334]
%</dtx>
%
%% File 'bbcompat.dtx'
diff --git a/locale/hi/babel-hi.ini b/locale/hi/babel-hi.ini
index a99f3c2..7deaf5f 100644
--- a/locale/hi/babel-hi.ini
+++ b/locale/hi/babel-hi.ini
@@ -238,7 +238,7 @@ transliteration.hk.9.2 = { string = ^^^^094d{1} }
transliteration.hk.10.0 = { [{0915}-{0939}]([{0915}-{0939}]) }
transliteration.hk.10.1 = {}
transliteration.hk.10.2 = { string = ^^^^094d{1} }
-; Implicit a
+; Inherent a
transliteration.hk.11.0 = { [{0915}-{0939}]{0905} }
transliteration.hk.11.1 = {}
transliteration.hk.11.2 = { remove }
diff --git a/locale/hu/babel-hu.ini b/locale/hu/babel-hu.ini
index c156e61..2039f40 100644
--- a/locale/hu/babel-hu.ini
+++ b/locale/hu/babel-hu.ini
@@ -194,10 +194,19 @@ superscriptingExponent = ×
[counters]
[transforms.posthyphenation]
-digraphs.hyphen.1.0 = { ()([cz])(){1}s }
-digraphs.hyphen.1.1 = { no = {1}, pre = {1}s- }
-digraphs.hyphen.2.0 = { ()([ds])(){1}z }
-digraphs.hyphen.2.1 = { no = {1}, pre = {1}z- }
-digraphs.hyphen.3.0 = { ()([glnt])(){1}y }
-digraphs.hyphen.3.1 = { no = {1}, pre = {1}y- }
+digraphs.hyphen.1.0 = { ([czCZ])|{1}([sS]) }
+digraphs.hyphen.1.1 = {}
+digraphs.hyphen.1.2 = { pre = {2}-, data = 1 }
+digraphs.hyphen.1.3 = {}
+digraphs.hyphen.1.4 = {}
+digraphs.hyphen.2.0 = { ([dsDS])|{1}([zZ]) }
+digraphs.hyphen.2.1 = {}
+digraphs.hyphen.2.2 = { pre = {2}-, data = 1 }
+digraphs.hyphen.2.3 = {}
+digraphs.hyphen.2.4 = {}
+digraphs.hyphen.3.0 = { ([glntGLNT])|{1}([yY]) }
+digraphs.hyphen.3.1 = {}
+digraphs.hyphen.3.2 = { pre = {2}-, data = 1 }
+digraphs.hyphen.3.3 = {}
+digraphs.hyphen.3.4 = {}
diff --git a/locale/hu/babel-hungarian.tex b/locale/hu/babel-magyar.tex
similarity index 100%
copy from locale/hu/babel-hungarian.tex
copy to locale/hu/babel-magyar.tex
diff --git a/locale/mr/babel-mr.ini b/locale/mr/babel-mr.ini
index 1178e56..55d9826 100644
--- a/locale/mr/babel-mr.ini
+++ b/locale/mr/babel-mr.ini
@@ -113,7 +113,7 @@ time.medium = [h]:[mm]:[ss] [a]
time.short = [h]:[mm] [a]
[typography]
-frenchspacing = yes
+frenchspacing = no
hyphenrules = marathi
lefthyphenmin = 2
righthyphenmin = 2
diff --git a/locale/ug/babel-uyghur.tex b/locale/ug/babel-uyghur.tex
index d3df2d2..d3ee2ea 100644
--- a/locale/ug/babel-uyghur.tex
+++ b/locale/ug/babel-uyghur.tex
@@ -11,49 +11,54 @@
}
\newattribute\bblug@disc
-\bblug@disc=0
+\bblug@disc=-1
\bbl@luahyphenate
-\directlua{
+% 1) Store discretionaries just after hyphenation as an attribute of the
+% next glyph, with the value of the disc penalty (assumed positive). Then
+% remove the discretionary.
+% 2) After the shaping, restore the discretionaries.
-Babel.uyghur = Babel.uyghur or {}
+\directlua{
+Babel.locale[\the\localeid] = {}
+local ug = Babel.locale[\the\localeid]
-function Babel.uyghur.posthyphen(head)
+function ug.posthyphen(head)
local UGDISC = luatexbase.registernumber'bblug@disc'
for item in node.traverse(head) do
if item.id == 7 and item.subtype == 3 and
item.next and item.next.id == 29 and
item.next.lang == \the\l@uyghur\space then
- node.set_attribute(item.next, UGDISC, 1)
+ node.set_attribute(item.next, UGDISC, item.penalty)
node.remove(head, item)
end
end
end
-Babel.uyghur.hyphen_sep = .09 % in em units
+ug.hyphen_sep = .09 % in em units
% Note it can be a string, with several characters:
-Babel.uyghur.hyphen = unicode.utf8.char(0x0640)
+ug.hyphen = unicode.utf8.char(0x0640)
-Babel.linebreaking.add_after(Babel.uyghur.posthyphen)
+Babel.linebreaking.add_after(ug.posthyphen)
-function Babel.uyghur.hyphenate(head)
+function ug.hyphenate(head)
local d, k
local quad = 655360
local UGDISC = luatexbase.registernumber'bblug@disc'
for item in node.traverse(head) do
if item.id == 29 and item.lang == \the\l@uyghur\space then
local ugdisc = node.get_attribute(item, UGDISC)
- if ugdisc > 0 then
+ if ugdisc >= 0 then
quad = font.getfont(item.font).size or quad
k = node.new(13, 1) % (kern, userkern)
- k.kern = Babel.uyghur.hyphen_sep * quad
+ k.kern = ug.hyphen_sep * quad
d = node.new(7, 3) % (disc, regular)
d.pre = Babel.str_to_nodes(
- function() return Babel.uyghur.hyphen end,
+ function() return ug.hyphen end,
nil, item)
d.pre = node.insert_before(d.pre, d.pre, k)
- d.penalty = 50 % Must be tex.(ex)hyphenpenalty
+ d.penalty = ugdisc
head = node.insert_before(head, item, d)
end
end
@@ -62,10 +67,9 @@ function Babel.uyghur.hyphenate(head)
end
luatexbase.add_to_callback("pre_linebreak_filter",
- Babel.uyghur.hyphenate, "Babel.uyghur.hyphenate")
+ ug.hyphenate, "Babel.locale.uyghur.hyphenate")
luatexbase.add_to_callback("hpack_filter",
- Babel.uyghur.hyphenate, "Babel.uyghur.hyphenate")
-
+ ug.hyphenate, "Babel.locale.uyghur.hyphenate")
}
\endinput
\ No newline at end of file
diff --git a/news-guides/guides/keys-in-ini-files.md b/news-guides/guides/keys-in-ini-files.md
index f2926a2..95b0cfa 100644
--- a/news-guides/guides/keys-in-ini-files.md
+++ b/news-guides/guides/keys-in-ini-files.md
@@ -3,6 +3,46 @@
(*Under development.*)
Many keys are related to the CLDR (Common Language Data Repository).
+Others are just the TeX primitives with the same name.
+
+### `identification`
+
+Most of them are self-explanatory.
+
+**charset** The charset in the `ini` file (currently must be `utf8`).
+
+**tag.bcp47** May include, if appropriate, language, script and region.
+ Usually only the language.
+
+**language.tag.bcp47** The language part.
+
+**tag.bcp47.likely** The likely full tag. See [Likely Subtags (CLDR)](https://unicode-org.github.io/cldr-staging/charts/latest/supplemental/likely_subtags.html)
+
+**tag.opentype**
+
+**script.name**
+
+**script.tag.bcp47**
+
+**script.tag.opentype**
+
+**level** `ini` files are based on a set of keys. The level is much like a
+ ‘version’ of the list of available keys. Currently it is 1, and it will
+ stay so until there is some significant change.
+
+**derivate** Not yet used, but its purpose is to identify whether the file
+ is the original one distributed with `babel` or a derivative (for
+ example, a publishing house may want to define its own files).
+
+**encodings** A mostly informative field for 8-bit engines requiring
+ font encodings (`T1`, `LGR`, etc.)
+
+### `captions`
+
+The `.licr` subsections are used in 8-bit engines. The final `name` is
+added by `babel`.
+
+### `date`
Here are some explanations for dates:
@@ -17,17 +57,64 @@ calendar, such as in English for days of the week:
> S M T W T F S
-About **exemplarCharacters**:
+### `typography`
-It can help to recognize a language. This list and the punctuation
-list are currently not used by `babel`.
+**frenchspacing** (`yes` or `no`) Enable or disable `\frenchspacing`
-About numbers:
+**hyphenrules** As named in `language.dat`.
-See
+**lefthyphenmin** `\lefthyphenmin`
+
+**righthyphenmin** `\righthyphenmin`
+
+**hyphenchar** The hyphenation char (number). Empty for the default. 0
+ if there is no hyphen (eg, Thai).
-http://cldr.unicode.org/translation/numbering-systems
+**prehyphenchar** Not yet used (`luatex`).
+
+**posthyphenchar** Not yet used (`luatex`).
+
+**exhyphenchar** Not yet used (`luatex`)
+
+**preexhyphenchar** Not yet used (`luatex`)
+
+**postexhyphenchar** Not yet used (`luatex`)
+
+**hyphenationmin** Not yet used (`luatex`), but it will be soon.
+
+**hyphenate.other.locale** (Tentative syntax.) A few hyphenation
+ patterns require setting some chars to `other`. This one is based on
+ the language.
+
+**hyphenate.other.script** (Tentative syntax.) Same, based on the
+ script.
+
+### `labels`
+
+Under development:
+
+https://github.com/latex3/babel/blob/master/news-guides/news/whats-new-in-babel-3.48.md
+
+### `characters`
+
+See the CLDR. For example, [Exemplar
+Characters](http://cldr.unicode.org/translation/-core-data/exemplars)
+can help to recognize a language. This list and the punctuation list
+are currently not used by `babel`.
+
+### `numbers`
+
+See [Numbering systems](http://cldr.unicode.org/translation/-core-data/numbering-systems)
The section about numbers may be used by some package to format
numbers (or even `babel` itself in the future). They reflect local traditional
usage, not the international one set by either the SI or ISO 80000.
+
+### `counters`
+
+See https://tex.stackexchange.com/questions/529813/how-to-define-counters-with-arbitrary-alphabet/530491#530491
+
+### `transforms`
+
+See
+[What's new in babel 3.56](https://github.com/latex3/babel/blob/master/news-guides/news/whats-new-in-babel-3.56.md#transforms-in-ini-files)
diff --git a/news-guides/news/whats-new-in-babel-3.56.md b/news-guides/news/whats-new-in-babel-3.56.md
index 5c36591..efbf294 100644
--- a/news-guides/news/whats-new-in-babel-3.56.md
+++ b/news-guides/news/whats-new-in-babel-3.56.md
@@ -24,13 +24,13 @@ from with `data`:
\babelprehyphenation{french}{ «{a} }{
{},
{ insert, penalty = 10000 },
- { insert, space=.2 .05 0, data = 1 },
+ { insert, space= .2 .05 0, data = 1 },
{}
}
\babelprehyphenation{french}{ «|{a} }{
{},
{ insert, penalty = 10000 },
- { space=.2 .05 0, data = 1 },
+ { space= .2 .05 0, data = 1 },
{}
}
```
@@ -40,7 +40,7 @@ then matches):
```tex
\babelprehyphenation{french}{ «{a} }{
{},
- { insert, space=.2 .05 0, data = 1 },
+ { insert, space= .2 .05 0, data = 1 },
{}
}
```
@@ -52,7 +52,7 @@ word separation in the font.
\babelprehyphenation{french}{ «{a} }{
{},
{ insert, penalty = 10000 },
- { insert, spacefactor=.8 .3 .8, data = 1 },
+ { insert, spacefactor= .8 .3 .8, data = 1 },
{}
}
```
@@ -100,8 +100,8 @@ example, `%`). Just write the hex code with at least 4 ‘hex digits’.
For example, `{d}{0025}` matches a digit followed by a `%`.
Remember you can still enter characters with the old good `^^` syntax,
-which is converted at the TeX level; this extension is handled by lua
-directly, so catcodes are not relevant.
+which is converted at the TeX level; this `{}` extension is handled by
+lua directly, so catcodes are not relevant.
## Fixes
diff --git a/news-guides/news/whats-new-in-babel-3.57.md b/news-guides/news/whats-new-in-babel-3.57.md
index f5704ea..0077be4 100644
--- a/news-guides/news/whats-new-in-babel-3.57.md
+++ b/news-guides/news/whats-new-in-babel-3.57.md
@@ -7,8 +7,9 @@
*Some of them are still experimental or incomplete.*
* **Arabic** `transliteration.dad` ▸ Applies the transliteration system
-devised by Yannis Haralambous for \textsf{dad}. Not yet complete, but
-sufficient for many texts.
+ devised by Yannis Haralambous for
+ [`dad`](http://mirrors.ctan.org/language/arabic/dad/dad-user-guide.pdf).
+ Not yet complete, but sufficient for many texts.
* **Croatian** `digraphs.ligatures` ▸ Ligatures *DŽ*, *Dž*,
*dž*, *LJ*, *Lj*, *lj*, *NJ*,
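
For orientation, here is a minimal sketch of how the Hungarian transform
touched by this commit might be activated. It follows the example added to
babel.dtx above. The document class, the narrow minipage and the sample
words are assumptions chosen purely for illustration, and because the
predefined transforms are lua-based the sketch assumes compilation with
lualatex and a babel release that already ships them.

```latex
% A sketch only, not taken from the commit. Assumes lualatex and a
% babel version that provides the predefined transforms.
\documentclass{article}
\usepackage[magyar]{babel}
% Activate the predefined transform for a language that was already
% loaded as a package option (key and value as in the babel.dtx text).
\babelprovide[transforms = digraphs.hyphen]{magyar}
\begin{document}
% A narrow box makes line breaks inside words likely. The sample words
% (an assumption for illustration) contain the long digraphs ssz and nny.
\begin{minipage}{1.5cm}
asszonnyal hosszabb
\end{minipage}
\end{document}
```

Whether a break actually falls inside a long digraph depends on the
hyphenation patterns and the line width, so the output should be checked
rather than taken for granted.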