WIN NT file searching problems in (pdf)tex

Hans Hagen pragma@wxs.nl
Mon, 22 Mar 1999 20:32:50 +0100


Olaf Weber wrote:

> Actually, the point is precisely that .tuo _is_ considered part of the
> filename, rather than an extension.

I already feared this. Is this what users expect? Many macro packages
use temporary and auxiliary files, where the explicitly added suffix
already signals that '.tex' is not wanted.
 
> >> This is not the way kpathsea does its job :
> 
> >> /* Search #1: NAME doesn't have a suffix which is equal to a "standard"
> >> suffix.  For example, foo.bar, but not foo.tex.  We look for the
> >> name with the standard suffixes appended. */
> 
> >> /* Search #2: Just look for the name we've been given, provided non-suffix
> >> searches are allowed or the name already includes a suffix. */
> 
> >> /* Search #3 (sort of): Call mktextfm or whatever to create a
> >> missing file.  */
> 
> >> These are the files that are looked for.
> >> I guess the second try on .tuo.tex could be avoided by some !! .
> 
> >> I don't  think  that we can  do  much about  the search  order without
> >> breaking  previous  behaviour  (Olaf ?).   However,   yes, it would be
> >> plausible to first  look for what filename  has been given, especially
> >> if it has already an extension.
> 
> Any change will break things that rely on that aspect of the previous
> behaviour.  It may still be worth making the change.  The searching
> code has already undergone a number of changes in an attempt to
> simplify it and make its handling of names more consistent.

I don't think macro packages will break when you change the search (i.e.
when a suffix is specified, try to open the file as given).

(1) it is more consistent with opening a file in write mode, where no
suffix is added either.

(2) On DOS, for instance, there are no double suffixes! So a macro
package using this 'feature' would not run on DOS. I'm not sure about
the Amiga and VMS.

> > How about this:
> 
> > I think it makes sense when a filename has an . in it, first to try to
> > open this file, nothing appended. Context as well as LaTeX uses
> > auxiliary files, and now opening these always leads to 3 file open
> > attempts. (I often use buffers, *.tmp, or *.mp files and we're for some
> > documents talking of an overhead of thousands of unneeded open
> > attempts.)
> 
> There have been a number of attempts to get the file search code to
> work better.  The current solution, "try <file>.tex if <file> doesn't
> end in '.tex', then try <file>," is at least simple.  The alternative
> you propose would be something like "if <file> contains a period, try
> <file>, and if it doesn't end in '.tex', try <file>.tex as well;
> otherwise, try <file>.tex, then <file>."

Yes, that makes a lot of sense and saves quite some unneeded network
traffic.
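
As a rough sketch of the two rules quoted above (in Python, not kpathsea
code; the function name candidate_names and its parameters are made up
for illustration, and "contains a period" is assumed to refer to the
file name without its directory part):

```python
import os


def candidate_names(name, suffix=".tex", given_first_if_dotted=False):
    """Return the file names to try, in order.

    given_first_if_dotted=False: the current rule, <file>.tex then <file>.
    given_first_if_dotted=True:  the proposed rule, <file> first when the
    name already contains a period.
    """
    # A name that already ends in the standard suffix is tried as is.
    if name.endswith(suffix):
        return [name]

    with_suffix = name + suffix
    # Assumption: only the base name counts, so "./foo" has no period.
    has_period = "." in os.path.basename(name)

    if given_first_if_dotted and has_period:
        return [name, with_suffix]
    return [with_suffix, name]


print(candidate_names("foo.tuo"))
print(candidate_names("foo.tuo", given_first_if_dotted=True))
```

With the proposed rule an auxiliary file such as foo.tuo is tried under
its own name first, so the failed attempt on foo.tuo.tex disappears
whenever the file exists.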
 
> One possible problem with the current situation is that a "try"
> involves searching the directory trees.  So with TEXINPUTS=.:/foo,
> /foo/bar.tuo.tex would be found before a ./bar.tuo.  This is not
> necessarily desirable.

Ah, so you first search for all possibilities in one directory, then the
next, etc. Hm. When searching for a .tuo file I prefix the name with ./
anyway, just because I don't want to open a completely unrelated
auxiliary file (we often have files with similar names, like 'layout',
'test', 'course', 'manual').
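
To make the TEXINPUTS=.:/foo case concrete, here is a small sketch (in
Python, not kpathsea code) of a search in which each "try" walks the
whole path before the next name is considered. The function name, the
exists callback and the set of files are assumptions for illustration,
with bar.tuo taken as the name being looked up:

```python
def search_name_major(dirs, names, exists):
    """Try each candidate name in every directory before the next name."""
    for name in names:
        for directory in dirs:
            path = directory + "/" + name
            if exists(path):
                return path
    return None


# Hypothetical file system: the wanted file is in ".", a stray one in /foo.
files = {"./bar.tuo", "/foo/bar.tuo.tex"}

found = search_name_major(
    dirs=[".", "/foo"],                # TEXINPUTS=.:/foo
    names=["bar.tuo.tex", "bar.tuo"],  # current rule: suffix appended first
    exists=files.__contains__,
)
print(found)
```

Here the stray /foo/bar.tuo.tex is returned although ./bar.tuo exists,
because bar.tuo.tex is searched along the whole path first.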
 
> For 7.4, I'm considering switching to a system where a list of
> acceptable filenames is created first, and each of these names is
> checked in a directory before moving on to the next directory.  It
> seems to me that, even if the handling of the '.tex' extension doesn't
> otherwise change, this should help performance: under a sane setup
> (. first in TEXINPUTS) you'd have one failed open attempt for each one
> that succeeds, and both of these in the same directory, which is easy
> on the file system caches of the OS.  Such a setup would also make it
> easier to experiment with different orderings depending on whether a
> filename contains a period.

Sounds perfect. 
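
A sketch of how I read the scheme considered for 7.4 (in Python; this is
not the actual kpathsea implementation, and the function name, the exists
callback and the file set are invented for illustration): the list of
acceptable names is built first, and all of them are checked in one
directory before moving on to the next.

```python
def search_dir_major(dirs, names, exists):
    """Check every candidate name in a directory before the next directory.

    Returns (path_or_None, attempts) so the number of tries is visible.
    """
    attempts = []
    for directory in dirs:
        for name in names:
            path = directory + "/" + name
            attempts.append(path)
            if exists(path):
                return path, attempts
    return None, attempts


# Hypothetical file system, with "." first in the search path.
files = {"./bar.tuo", "/foo/bar.tuo.tex"}

found, attempts = search_dir_major(
    dirs=[".", "/foo"],
    names=["bar.tuo.tex", "bar.tuo"],
    exists=files.__contains__,
)
print(found, attempts)
```

In this setup there is one failed attempt and one successful one, both in
the same directory, and the file in the current directory wins.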
 
> By the way, you mention *.mp temporary files.  That's the same
> extension as used by metapost source files.  Are you certain that's a
> good idea?

Those are real mp files. I often include mp code in the tex source and
call mp at run time. Think of generating backgrounds, flowcharts, graphics.
The beginners manual of context has a random shape around each chapter
title and setup, page, and margin thing, so there we generate about 900
mp files and run them directly (and let tex convert them to pdf).
Normally there is one temp file (mp-graph.mp) but many output files
(mp-graph.1, mp-graph.2, etc.).
 
> > Also, is there a safer way of trying to open the file? No access error
> > message? It has nothing to do with access problems if a file cannot be
> > found.
> 
> Thanks to race conditions that could otherwise occur, the surest way
> to determine whether a file can be opened is to try to open it, and
> check whether you succeeded.  Any "safer" way of doing this would just
> complicate the code without improving the safety.

Ok, only the message is confusing, especially on Windows NT (I don't use
it, I only observed this behavior).  
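
What I have in mind is roughly the following sketch (in Python, purely
illustrative; try_open and its return convention are made up, and it does
not model what Win32 actually reports): the file is still simply opened,
as suggested above, but a missing file is kept apart from a real access
problem, so only the latter needs a message.

```python
import os
import tempfile


def try_open(path):
    """Try to open path; return (file_object, None) or (None, reason)."""
    try:
        return open(path, "rb"), None
    except FileNotFoundError:
        # Absent file: not an error during a search, just try the next one.
        return None, "not found"
    except PermissionError:
        # The file is there but may not be read: worth reporting.
        return None, "access denied"
    except OSError as err:
        # Anything else (e.g. the path is a directory).
        return None, err.strerror or "error"


with tempfile.TemporaryDirectory() as tmp:
    handle, reason = try_open(os.path.join(tmp, "layout.tuo"))
    print(handle, reason)
```

There is no separate existence test before the open, so the race
condition mentioned above does not come back.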
 
> If Win32 sees fit to report access problems when in fact a file is
> absent, then that is a Win32 problem.

Well, that's for Fabrice to take care of. Maybe another system call?
 
> Do not meddle in the affairs of sysadmins,

Sure. I'm only a user. 

Hans

-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.nl
-----------------------------------------------------------------