[latex3-commits] [latex3/latex2e] file-cache: Cache the file seen status (af306e3d)

github at latex-project.org
Sat May 13 19:29:46 CEST 2023


Repository : https://github.com/latex3/latex2e
On branch  : file-cache
Link       : https://github.com/latex3/latex2e/commit/af306e3d74f88b2c94f92e59eec7ea1de58df998

>---------------------------------------------------------------

commit af306e3d74f88b2c94f92e59eec7ea1de58df998
Author: Joseph Wright <joseph.wright at morningstar2.co.uk>
Date:   Wed May 10 22:57:36 2023 +0100

    Cache the file seen status


>---------------------------------------------------------------

af306e3d74f88b2c94f92e59eec7ea1de58df998
 base/changes.txt      |  5 +++++
 base/doc/ltnews37.tex |  8 ++++++++
 base/ltfiles.dtx      | 32 ++++++++++++++++++++++----------
 3 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/base/changes.txt b/base/changes.txt
index 9ad40e33..2fdb3cd4 100644
--- a/base/changes.txt
+++ b/base/changes.txt
@@ -6,6 +6,11 @@ completeness or accuracy and it contains some references to files that
 are not part of the distribution.
 ================================================================================
 
+2023-05-10  Joseph Wright  <Joseph.Wright at latex-project.org>
+
+	* ltfiles.dtx:
+	Cache the status of files seen
+
 2023-04-19  Joseph Wright  <joseph.wright at latex-project.org>
 
 	* ltfinal.dtx (subsection{Lccodes and uccodes}):
diff --git a/base/doc/ltnews37.tex b/base/doc/ltnews37.tex
index 1c54de58..fdd50af9 100644
--- a/base/doc/ltnews37.tex
+++ b/base/doc/ltnews37.tex
@@ -361,6 +361,14 @@ not use grouping.
 
 \section{Code improvements}
 
+\subsection{Performance in checking file existence}
+
+The addition of hooks, etc., to file operations had the side effect that
+the existence of a file was checked multiple times. In larger documents using
+lots of files, these filesystem operations had a non-trivial performance
+impact. We now cache the file seen status, so that these repeated filesystem
+calls are avoided.
+
 \subsection{\pkg{doc}: Handle \texttt{\textbackslash\textvisiblespace} correctly in the index}
 
 Due to some problems in the code it wasn't possible to prevent
diff --git a/base/ltfiles.dtx b/base/ltfiles.dtx
index 04256529..1e2fdcba 100644
--- a/base/ltfiles.dtx
+++ b/base/ltfiles.dtx
@@ -32,7 +32,7 @@
 %<*driver>
 % \fi
 \ProvidesFile{ltfiles.dtx}
-             [2023/01/05 v1.2s LaTeX Kernel (File Handling)]
+             [2023/05/10 v1.2t LaTeX Kernel (File Handling)]
 % \iffalse
 \documentclass{ltxdoc}
 \GetFileInfo{ltfiles.dtx}
@@ -1202,6 +1202,7 @@
 % \changes{v1.0t}{1995/05/25}{(CAR) added \cs{long}}
 % \changes{v1.2d}{2019/10/26}{quote on openin}
 % \changes{v1.2k}{2021/03/12}{Allow unbalanced conditionals (gh/530)}
+% \changes{v1.2t}{2023/05/10}{Cache file status}
 % Argument |#1| is |\@curr@file| so catcode 12 string with no quotes.
 %
 %    The original definition picked up arguments |#2| and |#3| in a
@@ -1211,23 +1212,34 @@
 %    \cs{secondoftwo}. However, that changes how |#| is interpreted
 %    and so we can't do that nowadays without invalidating a lot of
 %    code. Therefore the somewhat curious construction near the end.
+%
+%    To avoid repeatedly checking the same file, we cache the seen status
+%    using a `flag'. The same flag name is used by \pkg{expl3}, meaning
+%    that either this check or the \cs{pdffilesize} one used in
+%    \pkg{expl3} can save repeated file reading by both code paths.
 %    \begin{macrocode}
 %</2ekernel>
 %<*2ekernel|latexrelease>
 %<latexrelease>\IncludeInRelease{2021/06/01}%
 %<latexrelease>                 {\IfFileExists@}{manage unbalanced conditionals}
 \long\def \IfFileExists@#1#2#3{%
-  \openin\@inputcheck"#1" %
-  \ifeof\@inputcheck
-    \ifx\input@path\@undefined
-      \let\reserved@a\@secondoftwo
-    \else
-      \def\reserved@a{\@iffileonpath{#1}}%
-    \fi
-  \else
-    \closein\@inputcheck
+  \ifcsname\detokenize{__file_seen_#1:}\endcsname
     \edef\@filef@und{"#1" }%
     \let\reserved@a\@firstoftwo
+  \else
+    \openin\@inputcheck"#1" %
+    \ifeof\@inputcheck
+      \ifx\input@path\@undefined
+        \let\reserved@a\@secondoftwo
+      \else
+        \def\reserved@a{\@iffileonpath{#1}}%
+      \fi
+    \else
+      \closein\@inputcheck
+      \csname\detokenize{__file_seen_#1:}\endcsname
+      \edef\@filef@und{"#1" }%
+      \let\reserved@a\@firstoftwo
+    \fi
   \fi
 %    \end{macrocode}
 %    This is just there so that any |#| inside |#2| or |#3| needs
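
Note (not part of the commit): the `flag' caching used in the new
\IfFileExists@ can be illustrated in isolation. The following is a
minimal plain e-TeX sketch, assuming an engine that provides \ifcsname
and \detokenize. All \demo... macro names and the demo_seen_ prefix are
hypothetical and chosen for illustration only; they are not kernel or
expl3 names, and the file system is not consulted at all here.

    % Hypothetical two-argument selectors (stand-ins for
    % \@firstoftwo and \@secondoftwo).
    \def\demoFirst#1#2{#1}
    \def\demoSecond#1#2{#2}
    % Set the flag: \csname makes an undefined name equal to \relax
    % (a local assignment), and executing \relax then does nothing.
    \def\demoMarkSeen#1{\csname\detokenize{demo_seen_#1:}\endcsname}
    % Test the flag: \ifcsname is true once the name exists and,
    % unlike \csname, never defines it as a side effect.
    \def\demoIfSeen#1{%
      \ifcsname\detokenize{demo_seen_#1:}\endcsname
        \expandafter\demoFirst
      \else
        \expandafter\demoSecond
      \fi
    }
    \message{\demoIfSeen{a.tex}{cached}{not cached}}
    \demoMarkSeen{a.tex}
    \message{\demoIfSeen{a.tex}{cached}{not cached}}
    \bye

The first \message reports "not cached" and the second "cached".
Because the assignment made by \csname is local, a flag set inside a
group is forgotten when that group ends; in that case the next check
simply falls back to the \openin test.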