[latex3-commits] [git/LaTeX3-latex3-latex3] main: l3regex: mention TeX's expansion for composing regex [ci skip] (fbd527950)
Bruno Le Floch
blflatex at gmail.com
Tue Apr 27 15:56:19 CEST 2021
Repository : https://github.com/latex3/latex3
On branch : main
Link : https://github.com/latex3/latex3/commit/fbd5279504b6663b00c53a55cd8ceaae22617dfc
>---------------------------------------------------------------
commit fbd5279504b6663b00c53a55cd8ceaae22617dfc
Author: Bruno Le Floch <blflatex at gmail.com>
Date: Mon Apr 26 23:23:21 2021 +0200
l3regex: mention TeX's expansion for composing regex [ci skip]
>---------------------------------------------------------------
fbd5279504b6663b00c53a55cd8ceaae22617dfc
l3kernel/l3regex.dtx | 29 +++++++++++++++++------------
1 file changed, 17 insertions(+), 12 deletions(-)
diff --git a/l3kernel/l3regex.dtx b/l3kernel/l3regex.dtx
index 166574165..0064bdc2b 100644
--- a/l3kernel/l3regex.dtx
+++ b/l3kernel/l3regex.dtx
@@ -357,27 +357,32 @@
% The |\u| escape sequence allows to insert the contents of a token list
% directly into a regular expression or a replacement, avoiding the need
% to escape special characters. Namely, |\u|\Arg{var~name} matches
-% the exact contents of the variable \cs[no-index]{\meta{var~name}},
+% the exact contents (both character codes and category codes) of the
+% variable \cs[no-index]{\meta{var~name}},
% which are obtained by applying \cs{exp_not:v} \Arg{var~name} at the
% time the regular expression is compiled. Within a |\c{...}|
% control sequence matching, the |\u| escape sequence only expands its
% argument once, in effect performing \cs{tl_to_str:v}.
+% Quantifiers are supported.
%
% The |\ur| escape sequence allows to insert the contents of a |regex|
% variable into a larger regular expression. For instance,
% |A\ur{l_tmpa_regex}D| matches the tokens |A| and |D| separated by
% something that matches the regular expression
-% \cs[no-index]{l_tmpa_regex}. This behaves as if a (non-capturing)
-% group were surrounding \cs[no-index]{l_tmpa_regex}: for instance, if
-% that regex variable has value \verb"B|C", then |A\ur{l_tmpa_regex}D|
-% is equivalent to \verb"A(?:B|C)D" (matching |ABD| or |ACD|) and not to
-% \verb"AB|CD" (matching |AB| or |CD|).
-%
-% Quantifiers are supported for the |\u| and |\ur| constructions: for
-% instance, after \cs[no-index]{regex_set:Nn}
-% \cs[no-index]{l_item_regex} |{| |[a-z]+| |,?| |}|, the regex
-% |\ur{l_item_regex}+| matches one or more \enquote{words} separated by
-% optional commas.
+% \cs[no-index]{l_tmpa_regex}. This behaves as if a non-capturing group
+% were surrounding \cs[no-index]{l_tmpa_regex} (thus quantifiers are
+% supported).
+%
+% For instance, if \cs[no-index]{l_tmpa_regex} has value \verb"B|C",
+% then |A\ur{l_tmpa_regex}D| is equivalent to \verb"A(?:B|C)D" (matching
+% |ABD| or |ACD|) and not to \verb"AB|CD" (matching |AB| or |CD|). To
+% get the latter effect, it is simplest to use \TeX{}'s expansion
+% machinery directly: if \cs[no-index]{l_mymodule_BC_tl} contains
+% \verb"B|C" then the following two lines show the same result:
+% \begin{quote}
+% \cs{regex_show:n} |{ A \u{l_mymodule_BC_tl} D }| \\
+% \cs{regex_show:n} \verb"{ A B | C D }"
+% \end{quote}
%
% \subsection{Miscellaneous}
%
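For readers of this commit, the documented grouping and quantifier behaviour of |\ur| can be tried with a small test file. This is only a sketch, assuming a LaTeX release recent enough to preload expl3; the variable name \l_mymodule_BC_regex and the terminal messages are hypothetical and not part of the commit.

```latex
\documentclass{article}
\begin{document}
\ExplSyntaxOn
% Hypothetical regex variable, used only for this illustration.
\regex_new:N \l_mymodule_BC_regex
\regex_set:Nn \l_mymodule_BC_regex { B|C }
% \ur behaves as if a non-capturing group surrounded the variable,
% so this regex is equivalent to A(?:B|C)D, not to AB|CD.
\regex_match:nnTF { A \ur{l_mymodule_BC_regex} D } { ABD }
  { \iow_term:n { grouped~regex:~match } }
  { \iow_term:n { grouped~regex:~no~match } }
% A quantifier after \ur{...} applies to the whole inserted regex,
% so this one is equivalent to A(?:B|C)+D.
\regex_match:nnTF { A \ur{l_mymodule_BC_regex}+ D } { ABCBD }
  { \iow_term:n { quantified~regex:~match } }
  { \iow_term:n { quantified~regex:~no~match } }
% Print the compiled form to compare with the documentation.
\regex_show:n { A \ur{l_mymodule_BC_regex} D }
\ExplSyntaxOff
\end{document}
```

Comparing the \regex_show:n output with that of the two lines in the quote environment above is a quick way to check the wording added by this commit.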