[latex3-commits] [latex3/latex2e] doc-align-issue: some doc on a TeX bug (2882ebad0)

github at latex-project.org github at latex-project.org
Mon Oct 14 23:01:35 CEST 2024


Repository : https://github.com/latex3/latex2e
On branch  : doc-align-issue
Link       : https://github.com/latex3/latex2e/commit/2882ebad0838b80e87d0fc6bba0cc4fee44b548b

>---------------------------------------------------------------

commit 2882ebad0838b80e87d0fc6bba0cc4fee44b548b
Author: Frank Mittelbach <frank.mittelbach at latex-project.org>
Date:   Mon Oct 14 23:01:35 2024 +0200

    some doc on a TeX bug


>---------------------------------------------------------------

2882ebad0838b80e87d0fc6bba0cc4fee44b548b
 required/tools/array.dtx | 141 +++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 117 insertions(+), 24 deletions(-)

diff --git a/required/tools/array.dtx b/required/tools/array.dtx
index f141f069e..7bdd502ff 100644
--- a/required/tools/array.dtx
+++ b/required/tools/array.dtx
@@ -39,7 +39,7 @@
 %    \begin{macrocode}
 %<+package>\NeedsTeXFormat{LaTeX2e}[2024/06/01]
 %<+package>\ProvidesPackage{array}
-%<+package>         [2024/10/12 v2.6g Tabular extension package (FMi)]
+%<+package>         [2024/10/14 v2.6g Tabular extension package (FMi)]
 %
 % \fi
 %
@@ -382,6 +382,20 @@
 % \end{itemize}
 %
 %
+%
+% \subsection{A note on the allowed content of \texttt{>\{...\}} and
+%             \texttt{<\{...\}}}
+%
+% These specifiers are meant to hold declarations, such as
+% \verb=>{\itshape}=. They cannot end in commands that take arguments
+% without providing these arguments as part of the \verb={...}=.  It
+% would be a mistaken assumption that they pick up all or parts of the
+% alignment entry data if their argument is not provided.  E.g.,
+% \verb=>{\textbf}= would not make the whole column bold nor would it
+% make the first character bold (technically it would try to
+% bolden \cs{ignorespaces}). Thus, it would not fail with an error,
+% but effectively the output would be wrong and not as expected.
+%
 % \subsection{The behavior of the \texttt{\string\\} command}
 %
 % In the basic \texttt{tabular} implementation of \LaTeX{} the \cs{\bslash}
@@ -1386,33 +1400,112 @@ Bug reports can be opened (category \texttt{#1}) at\\%
 %    \begin{macrocode}
   \UseTaggingSocket{tbl/cell/begin}%
 %    \end{macrocode}
-%    Here, we assume that the \textsf{count} register
+%    Next we have to insert the toks register holding the content of
+%    \verb=>{...}=.  Here, we assume that the \textsf{count} register
 %    =\@tempcnta= has saved the value $=\count@= - 1$.
 %
-%    To keep \TeX{} happy if there is a look ahead in the tabular preamble
-%    which uses the Appendix~D trick (for example anything with a trailing
-%    optional argument defined by \pkg{ltcmd}), we wrap everything here in
-%    a protected version of \cs{@firstofone}.\footnote{The reason this
-%    works is not really clear: almost certainly there is a bug in \TeX{}
-%    here that we are simply avoiding, but as the master counter doesn't
-%    show up in a trace, a full understanding likely means working through
-%    the code that implements \cs{halign}!}
-%    \TeX{} otherwise can get
+%    To keep \TeX{} happy if there is a look ahead in the tabular
+%    preamble, i.e., starting in \verb=>{...}=, which uses the
+%    Appendix~D trick (for example, anything with a trailing optional
+%    argument defined by \pkg{ltcmd}), we wrap everything here in a
+%    protected version of \cs{@firstofone}.  \TeX{} otherwise can get
 %    confused about the value of the master counter, and we get some
-%    strange errors. (Quite possibly the underlying issue is a \TeX{}
-%    bug, but rather than try to fix in 2024 we accept it's there and
-%    work-around.) As an example, without this approach, something
-%    like
+%    strange errors. We suspected that there was an underlying issue
+%    in the \TeX{} engine, but it turned out to be rather hard to get
+%    to the bottom of it, because the master counter is not accessible
+%    through \TeX{}'s tracing tools. Thus, all we could do was
+%    produce various example documents, observe the results, and
+%    stare at a printout of the \TeX{} program. As an example,
+%    without this approach, something like
 %    \begin{verbatim}
-%\NewDocumentCommand\foo{o}{x}
-%\begin{tabular}{>{\foo}l}
-%  Foo 
-%\end{tabular}
+% \NewDocumentCommand\foo{o}{x}
+% \begin{tabular}{>{\foo}l}
+%    Foo 
+% \end{tabular}
 %    \end{verbatim}
-%    will fail; that can be fixed by adding a \cs{relax} after the \cs{@tempcnta},
-%    but that then leads to issues if you are collecting whole cells (tagging code
-%    or \\pkg{collcell}), where you can no longer alter the meaning of \cs{cr}
-%    as the master counter goes wrong.
+%    failed. That can be fixed by adding a \cs{relax} after the
+%    \cs{@tempcnta}, but that then leads to issues if you are
+%    collecting whole cells (tagging code or \pkg{collcell}), where
+%    you can no longer alter the meaning of \cs{cr} as the master
+%    counter goes wrong due to an obscure bug (or perhaps, say, an
+%    undocumented feature of \TeX{}).  Eventually, we were able to pin
+%    down the root cause and really understand why
+%    \cs{@protected@firstofone} solves the problem, even though it
+%    looks like a nonsense addition to the code that does nothing
+%    useful.\footnote{So it is a \TeX{} engine bug that was in there
+%    from day one, or if you like, it is a hidden feature that is not
+%    explained; neither in the \TeX{}book nor in the program code. We
+%    don't really expect this to change in \TeX{} after such a long
+%    time, other than perhaps documenting it as a feature, so this is
+%    a proper solution to the problem and not just a workaround.}
+%
+%    The problem is that \TeX{} tries to conserve stack space, and
+%    when the last token of an existing token list is a macro, then
+%    this token list is \emph{first} removed from memory (reducing the
+%    stack) \emph{before} the macro replacement text (as a new
+%    token list) is given to the parser, adding a new stack level. This
+%    is done using the routine \texttt{end\_token\_list} in the \TeX{}
+%    program, and ending the u-part of an \cs{halign} column with this
+%    routine immediately sets the \emph{master counter} used by alignments to
+%    zero (see chapter~22 and Appendix~D of the \TeX{}book). This
+%    means that technically the expansion of the last token in the u-part (if it
+%    is a macro) is not executed in the context of the u-part, but in
+%    the context of the alignment entry in the document. That normally
+%    doesn't make any difference whatsoever, unless you play
+%    around (as we sometimes have to) with tricks like those from
+%    Appendix~D.
+%
+%    To illustrate the issue we show a bit of strange low-level plain
+%    \TeX{} code.\footnote{If all of this looks mighty strange to you,
+%    don't worry. You are unlikely to need to know about it. It is
+%    just there so that programmers at some point in the future do not
+%    have to wonder too much why there is this odd
+%    \cs{@protected@firstofone} that apparently does nothing
+%    useful. It took us several nights of head scratching to come up
+%    with these minimal examples and then some more time to understand
+%    what the heck is going on inside \TeX{}---thanks to Bruno for the
+%    right ideas on the latter.}  Below are two very special grouping
+%    commands that are like \cs{bgroup} and \cs{egroup} but also
+%    affect the alignment master counter when expanded (see
+%    \TeX{}book p.385).  If one of them is used as the
+%    last macro in the u-part of a column, then you get strange errors
+%    that you shouldn't get. 
+% \begin{verbatim}
+% \def\bbgroup{{\ifnum0=`}\fi}
+% \def\eegroup{\ifnum0=`{\fi}}
+% 
+% % Fails with an error message, but there should be none:
+% \halign{%
+%   \message{u-part^^J}%
+%   \bbgroup              % <-- in the u-part
+%   \eegroup              % <-- in the u-part
+%   #%
+%   \message{v-part^^J}%
+%   \hfill\cr
+%   \message{body^^J}x
+%   \cr
+% }
+% 
+% % Fails but should work; the v-part is never reached:
+% \halign{%
+%   \message{u-part^^J}%
+%   \bbgroup              % <-- in the u-part
+%   #%
+%   \message{v-part^^J}%
+%   \eegroup              % <-- in the v-part
+%   \hfill\cr
+%   \message{body^^J}x
+%   \cr
+% }
+% \end{verbatim}
+%
+%    So the trick we use now is making \cs{@protected@firstofone} the
+%    last macro in the u-part, i.e., before the \cs{@sharp}. That way
+%    its argument is always fully expanded as part of the alignment
+%    entry and not as part of the u-part, so we know exactly
+%    what the master counter value is at this point, regardless of the content of
+%    \verb=>{...}=.
+%
 % \changes{v2.6f}{2024/09/13}{Stop parsing for optional argument (gh/1468)}
 % \changes{v2.6g}{2024/10/12}{Further work to support optional args in preamble (gh/1468)}
 %    \begin{macrocode}
@@ -2502,7 +2595,7 @@ Bug reports can be opened (category \texttt{#1}) at\\%
 %    that a =&= will not be considered belonging to the current
 %    =\halign= while we are looking for a =*= or =[=.
 %    For further information see
-%    \cite[Appendix D]{bk:knuth}.
+%    \cite[Appendix~D]{bk:knuth}.
 %    \begin{macrocode}
   \iffalse{\fi\ifnum 0=`}\fi
 %    \end{macrocode}
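
A minimal sketch of what the new note on >{...} and <{...} means in
practice (not part of the commit). It assumes a current LaTeX with the
array package loaded; the column layout and the cell texts are made up
for illustration. Using \collectcell ... \endcollectcell from the
collcell package to hand a whole cell to an argument-taking command is
one possible approach and is an assumption here, not something this
commit prescribes.

```latex
\documentclass{article}
\usepackage{array}    % provides the >{...} and <{...} column specifiers
\usepackage{collcell} % assumption: used only to pass a whole cell as an argument

\begin{document}

% Column 1: a declaration (\bfseries) in >{...}, the intended use.
% Column 2: an argument-taking command (\textbf) receives the cell
%           content because collcell collects the cell first.
% Column 3: another declaration (\itshape).
\begin{tabular}{>{\bfseries}l
                >{\collectcell\textbf}l<{\endcollectcell}
                >{\itshape}l}
  by declaration & by collected cell & italic \\
  second row     & also bold         & also italic \\
\end{tabular}

\end{document}
```

Replacing >{\bfseries} with >{\textbf} in the first column would still
compile, but, as the note in the commit explains, the cell text would
not come out bold.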




