[latex3-commits] [git/LaTeX3-latex3-latex3] main: l3regex: mention TeX's expansion for composing regex [ci skip] (fbd527950)

Bruno Le Floch blflatex at gmail.com
Tue Apr 27 15:56:19 CEST 2021


Repository : https://github.com/latex3/latex3
On branch  : main
Link       : https://github.com/latex3/latex3/commit/fbd5279504b6663b00c53a55cd8ceaae22617dfc

>---------------------------------------------------------------

commit fbd5279504b6663b00c53a55cd8ceaae22617dfc
Author: Bruno Le Floch <blflatex at gmail.com>
Date:   Mon Apr 26 23:23:21 2021 +0200

    l3regex: mention TeX's expansion for composing regex [ci skip]


>---------------------------------------------------------------

fbd5279504b6663b00c53a55cd8ceaae22617dfc
 l3kernel/l3regex.dtx | 29 +++++++++++++++++------------
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/l3kernel/l3regex.dtx b/l3kernel/l3regex.dtx
index 166574165..0064bdc2b 100644
--- a/l3kernel/l3regex.dtx
+++ b/l3kernel/l3regex.dtx
@@ -357,27 +357,32 @@
 % The |\u| escape sequence allows to insert the contents of a token list
 % directly into a regular expression or a replacement, avoiding the need
 % to escape special characters. Namely, |\u|\Arg{var~name} matches
-% the exact contents of the variable \cs[no-index]{\meta{var~name}},
+% the exact contents (both character codes and category codes) of the
+% variable \cs[no-index]{\meta{var~name}},
 % which are obtained by applying \cs{exp_not:v} \Arg{var~name} at the
 % time the regular expression is compiled. Within a |\c{...}|
 % control sequence matching, the |\u| escape sequence only expands its
 % argument once, in effect performing \cs{tl_to_str:v}.
+% Quantifiers are supported.
 %
 % The |\ur| escape sequence allows to insert the contents of a |regex|
 % variable into a larger regular expression.  For instance,
 % |A\ur{l_tmpa_regex}D| matches the tokens |A| and |D| separated by
 % something that matches the regular expression
-% \cs[no-index]{l_tmpa_regex}.  This behaves as if a (non-capturing)
-% group were surrounding \cs[no-index]{l_tmpa_regex}: for instance, if
-% that regex variable has value \verb"B|C", then |A\ur{l_tmpa_regex}D|
-% is equivalent to \verb"A(?:B|C)D" (matching |ABD| or |ACD|) and not to
-% \verb"AB|CD" (matching |AB| or |CD|).
-%
-% Quantifiers are supported for the |\u| and |\ur| constructions: for
-% instance, after \cs[no-index]{regex_set:Nn}
-% \cs[no-index]{l_item_regex} |{| |[a-z]+| |,?| |}|, the regex
-% |\ur{l_item_regex}+| matches one or more \enquote{words} separated by
-% optional commas.
+% \cs[no-index]{l_tmpa_regex}.  This behaves as if a non-capturing group
+% were surrounding \cs[no-index]{l_tmpa_regex} (thus quantifiers are
+% supported).
+%
+% For instance, if \cs[no-index]{l_tmpa_regex} has value \verb"B|C",
+% then |A\ur{l_tmpa_regex}D| is equivalent to \verb"A(?:B|C)D" (matching
+% |ABD| or |ACD|) and not to \verb"AB|CD" (matching |AB| or |CD|).  To
+% get the latter effect, it is simplest to use \TeX{}'s expansion
+% machinery directly: if \cs[no-index]{l_mymodule_BC_tl} contains
+% \verb"B|C" then the following two lines show the same result:
+% \begin{quote}
+%   \cs{regex_show:n} |{ A \u{l_mymodule_BC_tl} D }| \\
+%   \cs{regex_show:n} \verb"{ A B | C D }"
+% \end{quote}
 %
 % \subsection{Miscellaneous}
 %
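
The difference between the two kinds of composition described in the new text can be checked with a short test document. This is only a sketch, and it rests on some assumptions: a LaTeX format from 2020 or later (so that \ExplSyntaxOn is available without loading a package), the kernel scratch variables \l_tmpa_regex and \l_tmpb_regex, and \exp_args:NNx as one way of using TeX's expansion to build the regex before it is compiled. The commit message does not say which expansion mechanism it means. \l_mymodule_BC_tl is the illustrative name from the documentation text, declared here for the example.

```latex
\documentclass{article}
\begin{document}
\ExplSyntaxOn
% Illustrative variable named in the documentation text; declared here
% only for this sketch.
\tl_new:N \l_mymodule_BC_tl
\tl_set:Nn \l_mymodule_BC_tl { B|C }
\regex_set:Nn \l_tmpa_regex { B|C }

% \ur inserts the regex as if inside a non-capturing group, A(?:B|C)D,
% so the string CD should not match.
\regex_match:nnTF { A \ur{l_tmpa_regex} D } { CD }
  { \iow_term:n { ur:~match } }
  { \iow_term:n { ur:~no~match } }

% Expanding the token list before compilation (an assumed approach)
% yields the regex AB|CD, so the string CD should match.
\exp_args:NNx \regex_set:Nn \l_tmpb_regex
  { A \exp_not:V \l_mymodule_BC_tl D }
\regex_match:NnTF \l_tmpb_regex { CD }
  { \iow_term:n { expansion:~match } }
  { \iow_term:n { expansion:~no~match } }
\ExplSyntaxOff
\end{document}
```

Matching against CD is a deliberate choice: it satisfies the second branch of AB|CD but lacks the leading A required by A(?:B|C)D, so the two constructions give different answers.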




