[latex3-commits] [git/LaTeX3-latex3-babel] master: New - \babelcharproperty. CJK line breaking disabled in verbatim. (12ea267)

Javier jbezos at dante.de
Mon May 13 17:12:59 CEST 2019


Repository : https://github.com/latex3/babel
On branch  : master
Link       : https://github.com/latex3/babel/commit/12ea2679345e87fa78077d4a75362afba6a4c593

>---------------------------------------------------------------

commit 12ea2679345e87fa78077d4a75362afba6a4c593
Author: Javier <jbezos at localhost>
Date:   Mon May 13 17:12:59 2019 +0200

    New - \babelcharproperty. CJK line breaking disabled in verbatim.
    
    nil.ldf creates its own empty \language.
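    
    Usage sketch (not part of the commit). Going by the manual text and
    the implementation in the diff below, the optional range argument
    comes right after the character code, and each property also has a
    short name. The bidi=basic option and the characters chosen here are
    assumptions made for illustration; luatex is required.
    
        \documentclass{article}
        \usepackage[english, bidi=basic]{babel}
        % The preamble is vertical mode, where the command is allowed.
        \babelcharproperty{`-}{direction}{l}  % one character, long name
        \babelcharproperty{`0}[`9]{bc}{l}     % range 0..9, short name
        \begin{document}
        Text.
        \end{document}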


>---------------------------------------------------------------

12ea2679345e87fa78077d4a75362afba6a4c593
 README.md    |   22 ++++---
 babel.dtx    |  203 ++++++++++++++++++++++++++++++++++++++++------------------
 babel.ins    |    2 +-
 babel.pdf    |  Bin 704485 -> 708107 bytes
 bbcompat.dtx |    2 +-
 5 files changed, 154 insertions(+), 75 deletions(-)

diff --git a/README.md b/README.md
index 3a038dc..c862160 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-## Babel 3.31
+## Babel 3.31.1640
 
 This package manages culturally-determined typographical (and other)
 rules, and hyphenation patterns for a wide range of languages.  Many
@@ -51,20 +51,24 @@ respective authors.
 ### Latest changes
 
 ```
+3.32   0000-00-00
+       - CJK line breaking is now disabled in verbatim (lua).
+       - New - \babelcharproperty, to change the direction, mirroring
+         glyph and line break properties (lua).
 3.31   2019-05-04
-       - Basic support for line breaking with CJK scripts.
+       - Basic support for line breaking with CJK scripts (lua)
        - layout=tabular now works with the 'array' package (and some
-         others).
+         others; lua).
 
 3.30   2019-04-22
-       - Fix - dir in boxes inside math (hopefully now it works).
-       - Option mapdigits for \babelprovide (only luatex), which
-         converts European digits to local ones.
+       - Fix - dir in boxes inside math (hopefully now it works; lua).
+       - Option mapdigits for \babelprovide, which converts European
+         digits to local ones (lua).
 
 3.29    2019-04-03
        - The fix for boxes inside math is incompatible with ams.
          Removed (a better fix is under study).
-       - Options bidi-l and bidi-r (for the bidi package).
+       - Options bidi-l and bidi-r (for the bidi package; xe).
 
 3.28    2019-04-01
        - Fixes - wrong dir after math, in math inside tabular, in weak L
@@ -85,7 +89,7 @@ respective authors.
 
 3.25   2018-10-03
        - Fixes for 3.23 - mapfont=direction could raise an error.
-         Language and Script were not always defined correctly.
+       - Language and Script were not always defined correctly.
        - Improved tentative support for Thai, Lao and Khmer in both 
          luatex and xetex.
 
@@ -143,4 +147,4 @@ respective authors.
 ```
 
 Javier Bezos
-2019/05/04
+2019/05/13
diff --git a/babel.dtx b/babel.dtx
index c078450..051436a 100644
--- a/babel.dtx
+++ b/babel.dtx
@@ -31,7 +31,7 @@
 %
 % \iffalse
 %<*filedriver>
-\ProvidesFile{babel.dtx}[2019/05/04 v3.31 The Babel package]
+\ProvidesFile{babel.dtx}[2019/05/13 v3.31.1640 The Babel package]
 \documentclass{ltxdoc}
 \GetFileInfo{babel.dtx}
 \usepackage{fontspec}
@@ -91,7 +91,7 @@
 \def\verbatim{\begin{shaded*}\bblxv\vskip-\baselineskip\vskip2.5\parsep}
 \def\endverbatim{\bblexv\vskip-2\baselineskip\end{shaded*}}
 \catcode`\_=\active
-\def_{\bgroup\let_\egroup\color{thered}}
+\def_{\bgroup\let_\egroup\leavevmode\color{thered}}
 \def\MacroFont{\fontencoding \encodingdefault \fontfamily\ttdefault
   \fontseries\mddefault \fontshape\updefault \small \catcode`\_=\active}
 \definecolor{shadecolor}{rgb}{0.96,0.96,0.93}
@@ -1212,7 +1212,7 @@ for auxiliary tasks).
   patterns for the latter in \luatex{}. Some quick patterns could help,
   with something similar to:
 \begingroup
-\setmonofont[Script=Lao]{DejaVu Sans Mono}
+\setmonofont[Script=Lao,Scale=MatchLowercase]{DejaVu Sans Mono}
 \begin{verbatim}
 \babelprovide[import,hyphenrules=+]{lao}
 \babelpatterns[lao]{1ດ 1ມ 1ແ 1ອ 1ງ 1ກ 1າ} % Random
@@ -1223,10 +1223,16 @@ for auxiliary tasks).
   language names must be sorted out, so you may need to set them
   explicitly in |\babelfont|, as well as |CJKShape|. \luatex{} does
   basic line breaking, but currently \xetex{} does not (you may load
-  \textsf{zhspacing}). Anyway, CJK texts are are best set with a
+  \textsf{zhspacing}). Although for a few words and short texts the
+  |ini| files should be fine, CJK texts are best set with a
   dedicated framework (\textsf{CJK}, \textsf{luatexja}, \textsf{kotex},
-  \textsf{CTeX}...), although for a few words and shorts texts \babel{}
-  should be fine.
+  \textsf{CTeX}...). Actually, this is what the |ldf| does in
+  |japanese| with \luatex, because the following piece of code loads 
+  \textsf{luatexja}:
+\begin{verbatim}
+\documentclass{ltjbook}
+\usepackage[japanese]{babel}
+\end{verbatim}
 \end{description}
 \end{note}
 
@@ -2370,7 +2376,10 @@ font encodings are the same, like in Unicode based engines.
 
 \New{3.31} (Only \luatex.) With |\babelprovide| and |import|ed CJK
 languages, a simple generic line breaking algorithm (push-out-first) is
-applied, based on a selection of the Unicode rules.
+applied, based on a selection of the Unicode rules (\New{3.31} it is
+disabled in verbatim mode, or more precisely when the hyphenrules
+are set to |nohyphenation|). Alternatively, it can be activated by
+explicitly setting |intraspace|.
 
 \New{3.27} Interword spacing for Thai, Lao and Khemer is activated
 automatically if a language with one of those scripts are loaded with
@@ -2440,11 +2449,11 @@ differ in the way `weak' numeric characters are ordered (eg, Arabic
   essentially stable, but, of course, it is not bug free and there
   could be improvements in the future, because setting bidi text has
   many subtleties (see for example <https://www.w3.org/TR/html-bidi/>).
-  A basic stable version for other engines must wait very likely until
-  (Northern) Winter. This applies to text, but \textbf{graphical}
-  elements, including the |picture| environment and PDF or PS based
-  graphics, are not yet correctly handled (far from trivial). Also,
-  indexes and the like are under study, as well as math.
+  A basic stable version for other engines must wait. This applies to
+  text, but \textbf{graphical} elements, including the |picture|
+  environment and PDF or PS based graphics, are not yet correctly
+  handled (far from trivial). Also, indexes and the like are under
+  study, as well as math (there is progress on the latter).
 
   An effort is being made to avoid incompatibilities in the future
   (this one of the reason currently bidi must be explicitly requested
@@ -2482,11 +2491,7 @@ See particularly |lua-bidibasic.tex| and |lua-secenum.tex|.
   The following text comes from the Arabic Wikipedia (article about
   Arabia). Copy-pasting some text from the Wikipedia is a good way to
   test this feature. Remember |basic-r| is available in \luatex{}
-  only.\footnote{At the time of this writing some Arabic fonts are not
-  rendered correctly by the default \luatex{} font loader, with
-  misplaced kerns inside some words, so double check the resulting
-  text. Have a look at the workaround available on GitHub, under
-  \texttt{/required/babel/samples}}
+  only.
   \begingroup
 % If you are looking at the code to see how it has been written, you
 % will be disappointed :-). The following example is built ad hoc to
@@ -2531,25 +2536,25 @@ _\babelprovide[import, main]{arabic}_
   \setmonofont[Scale=.87,Script=Arabic]{DejaVu Sans Mono} \catcode`@=13
   \def@#1{\ifcase#1\relax \egroup \or \bgroup\textdir TRT \else
   \bgroup\textdir TLT \pardir TLT \fi}
-  \begin{verbatim}
-  \documentclass{book}
+\begin{verbatim}
+\documentclass{book}
 
-  \usepackage[english, _bidi=basic_]{babel}
+\usepackage[english, _bidi=basic_]{babel}
 
-  \babelprovide[_mapfont=direction_]{arabic}
+\babelprovide[_mapfont=direction_]{arabic}
 
-  \babelfont{rm}{Crimson}
-  \babelfont[*arabic]{rm}{FreeSerif}
+\babelfont{rm}{Crimson}
+\babelfont[*arabic]{rm}{FreeSerif}
 
-  \begin{document}
+\begin{document}
 
-  Most Arabic speakers consider the two varieties to be two registers
-  of one language, although the two registers can be referred to in
-  Arabic as @1فصحى العصر@0 \textit{fuṣḥā l-ʻaṣr} (MSA) and
-  @1فصحى التراث@0 \textit{fuṣḥā t-turāth} (CA).
+Most Arabic speakers consider the two varieties to be two registers
+of one language, although the two registers can be referred to in
+Arabic as @1فصحى العصر@0 \textit{fuṣḥā l-ʻaṣr} (MSA) and
+@1فصحى التراث@0 \textit{fuṣḥā t-turāth} (CA).
 
-  \end{document}
-  \end{verbatim}
+\end{document}
+\end{verbatim}
   \endgroup
   In this example, and thanks to |mapfont=direction|, any Arabic letter
   (because the language is |arabic|) changes its font to that set for
@@ -2565,9 +2570,9 @@ _\babelprovide[import, main]{arabic}_
   |\hbox|’es). If you need |\ref| ranges, the best option is to define
   a dedicated macro like this (to avoid explicit direction changes in the
   body; here |\texthe| must be defined to select the main language):
-  \begin{verbatim}
-  \newcommand\refrange[2]{\babelsublr{\texthe{\ref{#1}}-\texthe{\ref{#2}}}}
-  \end{verbatim}
+\begin{verbatim}
+\newcommand\refrange[2]{\babelsublr{\texthe{\ref{#1}}-\texthe{\ref{#2}}}}
+\end{verbatim}
 
   In a future a more complete method, reading recursively boxed text, may
   be added.
@@ -2768,7 +2773,7 @@ options are also used (eg, |\ProsodicMarksOn| in \textsf{latin}).
 events. Some hooks are predefined when \luatex{} and \xetex{} are
 used.
 
-\Describe\AddBabelHook{\marg{name}\marg{event}\marg{code}}
+\Describe{\AddBabelHook}{\marg{name}\marg{event}\marg{code}}
 
 The same name can be applied to several events.  Hooks may be enabled
 and disabled for all defined events with
@@ -2841,7 +2846,7 @@ ones, they only have a single hook and replace a default definition.
   file. Used by \file{luababel.def}.
 \end{description}
 
-\Describe\BabelContentsFiles{}
+\Describe{\BabelContentsFiles}{}
 \New{3.9a} This macro contains a list of ``toc'' types
 requiring a command to switch the language. Its default value is
 |toc,lof,lot|, but you may redefine it with |\renewcommand| (it's up
@@ -2910,8 +2915,9 @@ ibygreek, bgreek, serbianc, frenchle, ethiop} and \textsf{friulan}.
 
 Most of them work out of the box, but some may require extra fonts,
 encoding files, a preprocessor or even a complete framework (like
-CJK).  For example, if you have got the \textsf{velthuis/devnag} package,
-you can create a file with extension |.dn|:
+\textsf{CJK} or \textsf{luatexja}). For example, if you have got the
+\textsf{velthuis/devnag} package, you can create a file with extension
+|.dn|:
 \begin{verbatim}
 \documentclass{article}
 \usepackage[hindi]{babel}
@@ -2922,11 +2928,31 @@ you can create a file with extension |.dn|:
 Then you preprocess it with |devnag| \m{file}, which creates
 \m{file}|.tex|; you can then typeset the latter with \LaTeX.
 
-\begin{note}
-  Please, for info about the support in luatex for some complex scripts,
-  see the wiki, on \texttt{https://github.com/latex3/latex2e/wiki/%
-  Babel:-Remarks-on-the-luatex-support-for-some-scripts}.
-\end{note}
+\subsection{Unicode character properties in \luatex}
+
+Part of \babel{}'s job is to apply Unicode rules to some
+script-specific features based on character properties. Currently
+there are three: direction (ie, bidi class), mirroring glyphs, and line
+breaking for CJK scripts. These properties are stored in \textsf{lua}
+tables, which you can modify with the following macro.
+
+\Describe{\babelcharproperty}{\marg{char-code}\oarg{to-char-code}%
+          \marg{property}\marg{value}}
+
+\New{3.32} Here, \marg{char-code} is a number (with \TeX{} syntax).
+With the optional argument, the property is set for a range of
+characters. There are three properties (each with a short name, taken
+from Unicode): |direction| (|bc|), |mirror| (|bmg|), |linebreak| (|lb|).
+          
+For example:
+\begin{verbatim}
+\babelcharproperty{`¿}{mirror}{`?}   
+\babelcharproperty{`-}{direction}{l}  % or al, r, en, an, on, et, cs
+\babelcharproperty{`)}{linebreak}{cl} % or id, op, cl, ns, ex, in, hy
+\end{verbatim}
+
+This command is allowed only in vertical mode (the preamble or between
+paragraphs).
 
 \subsection{Tips, workarounds, know issues and notes}
 
@@ -3997,8 +4023,8 @@ help from Bernd Raichle, for which I am grateful.
 % \section{Tools}
 %
 %    \begin{macrocode}
-%<<version=3.31>>
-%<<date=2019/05/04>>
+%<<version=3.31.1640>>
+%<<date=2019/05/13>>
 %    \end{macrocode}
 %
 % \textbf{Do not use the following macros in \texttt{ldf} files. They
@@ -5332,6 +5358,10 @@ help from Bernd Raichle, for which I am grateful.
   \def\<bbl@e@#2>{\the\toks@{\bbl@ens@fontenc}}}}
 \def\bbl@ensure#1#2#3{% 1: include 2: exclude 3: fontenc
   \def\bbl@tempb##1{% elt for (excluding) \bbl@captionslist list
+    \ifx##1\@undefined
+      \edef##1{\noexpand\bbl@nocaption
+        {\bbl@stripslash##1}{\languagename\bbl@stripslash##1}}%
+    \fi
     \ifx##1\@empty\else
       \in@{##1}{#2}%
       \ifin@\else
@@ -8008,11 +8038,14 @@ help from Bernd Raichle, for which I am grateful.
     \fi
     \bbl@exp{\\\bbl@add\\\bbl@mapselect{\\\bbl@mapdir{\languagename}}}%
   \fi
-  % For Southeast Asian, if interspace in ini -- TODO: as hook
+  % For East Asian, Southeast Asian, if interspace in ini - TODO: as hook?
+  \ifx\bbl@KVP@intraspace\@nil\else % We may override the ini
+    \bbl@csarg\edef{intsp@#2}{\bbl@KVP@intraspace}%
+  \fi
   \ifcase\bbl@engine\or
     \bbl@ifunset{bbl@intsp@\languagename}{}%
       {\expandafter\ifx\csname bbl@intsp@\languagename\endcsname\@empty\else
-         \bbl@xin@{\bbl@cs{sbcp@\languagename}}{Hant,Hans,Jpan,Kore}%
+         \bbl@xin@{\bbl@cs{sbcp@\languagename}}{Hant,Hans,Jpan,Kore,Kana}%
          \ifin@
            \bbl@cjkintraspace
            \directlua{
@@ -8020,19 +8053,13 @@ help from Bernd Raichle, for which I am grateful.
                Babel.locale_props = Babel.locale_props or {}
                Babel.locale_props[\the\localeid].linebreak = 'c'
            }%
-           \ifx\bbl@KVP@intraspace\@nil
-              \bbl@exp{%
-                \\\bbl@intraspace\bbl@cs{intsp@\languagename}\\\@@}%
-           \fi
+           \bbl@exp{\\\bbl@intraspace\bbl@cs{intsp@\languagename}\\\@@}%
            \ifx\bbl@KVP@intrapenalty\@nil
              \bbl@intrapenalty0\@@
            \fi 
          \else
            \bbl@seaintraspace
-           \ifx\bbl@KVP@intraspace\@nil
-              \bbl@exp{%
-                \\\bbl@intraspace\bbl@cs{intsp@\languagename}\\\@@}%
-           \fi
+           \bbl@exp{\\\bbl@intraspace\bbl@cs{intsp@\languagename}\\\@@}%
            \directlua{
               Babel = Babel or {}
               Babel.sea_ranges = Babel.sea_ranges or {}
@@ -8044,9 +8071,6 @@ help from Bernd Raichle, for which I am grateful.
            \fi
          \fi
        \fi
-       \ifx\bbl@KVP@intraspace\@nil\else % We may override the ini
-         \expandafter\bbl@intraspace\bbl@KVP@intraspace\@@
-       \fi
        \ifx\bbl@KVP@intrapenalty\@nil\else
          \expandafter\bbl@intrapenalty\bbl@KVP@intrapenalty\@@
        \fi}%
@@ -12080,13 +12104,15 @@ help from Bernd Raichle, for which I am grateful.
       local last_char = nil
       local quad = 655360      % 10 pt = 655360 = 10 * 65536
       local last_class = nil
+      local last_lang = nil
 
       for item in node.traverse(head) do
         if item.id == GLYPH then
         
+          local lang = item.lang
+
           local LOCALE = node.get_attribute(item,
                 luatexbase.registernumber'bbl@attr@locale')
-
           local props = Babel.locale_props[LOCALE]
 
           class = Babel.cjk_class[item.char].c
@@ -12100,7 +12126,9 @@ help from Bernd Raichle, for which I am grateful.
             br = 0
           end
 
-          if br == 1 and props.linebreak == 'c' then
+          if br == 1 and props.linebreak == 'c' and
+              lang ~= \the\l@nohyphenation\space and
+              last_lang ~= \the\l@nohyphenation then
             local intrapenalty = props.intrapenalty
             if intrapenalty ~= 0 then
               local n = node.new(14, 0)     % penalty
@@ -12117,6 +12145,7 @@ help from Bernd Raichle, for which I am grateful.
 
           quad = font.getfont(item.font).size
           last_class = class
+          last_lang = lang
         else % if penalty, glue or anything else
           last_class = nil
         end
@@ -12199,6 +12228,51 @@ help from Bernd Raichle, for which I am grateful.
 \AtBeginDocument{\bbl@luafixboxdir}
 %    \end{macrocode}
 %
+% The code for |\babelcharproperty| is straightforward. Just note that
+% the Lua table to be modified depends on the property.
+%
+%    \begin{macrocode}
+\newcommand\babelcharproperty[1]{%
+  \count@=#1\relax
+  \ifvmode
+    \expandafter\bbl@chprop
+  \else
+    \bbl@error{\string\babelcharproperty\space can be used only in\\%
+               vertical mode (preamble or between paragraphs)}%
+              {See the manual for further info}%
+  \fi}
+\newcommand\bbl@chprop[3][\the\count@]{%
+  \@tempcnta=#1\relax
+  \bbl@ifunset{bbl@chprop@#2}%
+    {\bbl@error{No property named '#2'. Allowed values are\\%
+                direction (bc), mirror (bmg), and linebreak (lb)}%
+               {See the manual for further info}}%
+    {}%
+  \loop
+    \@nameuse{bbl@chprop@#2}{#3}%
+  \ifnum\count@<\@tempcnta
+    \advance\count@\@ne
+  \repeat}
+\def\bbl@chprop@direction#1{%
+  \directlua{
+    Babel.characters[\the\count@] =  Babel.characters[\the\count@] or {}
+    Babel.characters[\the\count@]['d'] = '#1'
+  }}
+\let\bbl@chprop@bc\bbl@chprop@direction
+\def\bbl@chprop@mirror#1{%
+  \directlua{
+    Babel.characters[\the\count@] =  Babel.characters[\the\count@] or {}
+    Babel.characters[\the\count@]['m'] = '\number#1' 
+  }}
+\let\bbl@chprop@bmg\bbl@chprop@mirror
+\def\bbl@chprop@linebreak#1{%
+  \directlua{
+    Babel.cjk_characters[\the\count@] = Babel.cjk_characters[\the\count@] or {}
+    Babel.cjk_characters[\the\count@]['c'] = '#1'
+  }}
+\let\bbl@chprop@lb\bbl@chprop@linebreak
+%    \end{macrocode}
+%
 % \subsection{Layout}
 %
 % \textbf{Work in progress}.
@@ -19399,12 +19473,13 @@ Babel.cjk_breaks = {
 %    command, \texttt{nil} could be an `unknown' language in which
 %    case we have to make it known.
 %
+% \changes{babel-3.32}{2012/12/21}{Don't set it to
+%   \cs{l@nohyphenation}, best reserved for special uses.}
+%
 %    \begin{macrocode}
-\ifx\l@nohyphenation\@undefined
-   \@nopatterns{nil}
-   \adddialect\l@nil0
-\else
-   \let\l@nil\l@nohyphenation
+\ifx\l@nil\@undefined
+  \newlanguage\l@nil
+  \@namedef{bbl@hyphendata@\the\l@nil}{{}{}}% Remove warning
 \fi
 %    \end{macrocode}
 %
diff --git a/babel.ins b/babel.ins
index 5205bba..986cbdf 100644
--- a/babel.ins
+++ b/babel.ins
@@ -26,7 +26,7 @@
 %% and covered by LPPL is defined by the unpacking scripts (with
 %% extension .ins) which are part of the distribution.
 %%
-\def\filedate{2019/05/04}
+\def\filedate{2019/05/13}
 \def\batchfile{babel.ins}
 \input docstrip.tex
 
diff --git a/babel.pdf b/babel.pdf
index 2f219c7..14ffe17 100644
Binary files a/babel.pdf and b/babel.pdf differ
diff --git a/bbcompat.dtx b/bbcompat.dtx
index ca62d42..13d0663 100644
--- a/bbcompat.dtx
+++ b/bbcompat.dtx
@@ -30,7 +30,7 @@
 %
 % \iffalse
 %<*dtx>
-\ProvidesFile{bbcompat.dtx}[2019/05/04 v3.31]
+\ProvidesFile{bbcompat.dtx}[2019/05/13 v3.31.1640]
 %</dtx>
 %
 %% File 'bbcompat.dtx'




