WIN NT file searching problems in (pdf)tex

Olaf Weber olaf@infovore.xs4all.nl
22 Mar 1999 08:48:06 +0100


Hans Hagen writes:
> Fabrice POPINEAU wrote:
>> Hans Hagen <pragma@wxs.nl> writes:

>>> ConTeXt uses the auxiliary file 'jobname.tuo'. When this file is input,
>>> tex is supposed to search for only this file. When using this remote
>>> disk however, I got something like this:

>>> (jobname.tuo.tex) file access problem
>>> (jobname.tuo.tex) file access problem
>>> (jobname.tuo)

>>> So, tex seems to ask for .tex files first (two times) and then for the
>>> real one.

I don't seem able to duplicate this behaviour (the double open of
xxx.tex) on my machine (Linux).  So I don't have an explanation for
the double search for '.tex' that you found.

Unless, possibly, the '.tex' searches were in two different
directories?  Try using the kpathsea debugging options to find out
which open attempts are made.

>>>            I'm not sure if it should be in the design of current tex to
>>> handle files with more than one dot, but I think it makes sense to
>>> search for the file asked for first. It seems like the .tuo is not
>>> considered to be part of the filename. I have no explanation for the
>>> double search for '.tex'.

Actually, the point is precisely that .tuo _is_ considered part of the
filename, rather than an extension.

>> This is not the way kpathsea does its job :

>> /* Search #1: NAME doesn't have a suffix which is equal to a "standard"
>> suffix.  For example, foo.bar, but not foo.tex.  We look for the
>> name with the standard suffixes appended. */

>> /* Search #2: Just look for the name we've been given, provided non-suffix
>> searches are allowed or the name already includes a suffix. */

>> /* Search #3 (sort of): Call mktextfm or whatever to create a
>> missing file.  */

>> These are the files that are looked for.
>> I guess the second try on .tuo.tex could be avoided by some !! .

>> I don't  think  that we can  do  much about  the search  order without
>> breaking  previous  behaviour  (Olaf ?).   However,   yes, it would be
>> plausible to first  look for what filename  has been given, especially
>> if it has already an extension.

Any change will break things that rely on that aspect of the previous
behaviour.  It may still be worth making the change.  The searching
code has already undergone a number of changes in an attempt to
simplify it and make its handling of names more consistent.

> How about this: 

> I think it makes sense when a filename has an . in it, first to try to
> open this file, nothing appended. Context as well as LaTeX uses
> auxiliary files, and now opening these always leads to 3 file open
> attempts. (I often use buffers, *.tmp, or *.mp files and we're for some
> documents talking of an overhead of thousands of unneeded open
> attempts.) 

There have been a number of attempts to get the file search code to
work better.  The current solution, "try <file>.tex if <file> doesn't
end in '.tex', then try <file>," is at least simple.  The alternative
you propose would be something like "if <file> contains a period, try
<file>, and if it doesn't end in '.tex', try <file>.tex as well;
otherwise, try <file>.tex, then <file>."
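
To make the two rules concrete, here is a rough Python sketch of the
order in which names would be tried under each.  It is an illustration
only, not the kpathsea code: the function names are made up, and it
only considers the '.tex' suffix.

```python
def current_candidates(name, suffix=".tex"):
    """Current rule: try name+suffix unless name already ends in suffix,
    then try name itself."""
    names = []
    if not name.endswith(suffix):
        names.append(name + suffix)
    names.append(name)
    return names


def proposed_candidates(name, suffix=".tex"):
    """Proposed rule: a name that contains a period is tried as given
    first; a name without a period keeps the current order."""
    if "." in name:
        names = [name]
        if not name.endswith(suffix):
            names.append(name + suffix)
        return names
    return [name + suffix, name]


if __name__ == "__main__":
    for n in ("jobname.tuo", "story", "story.tex"):
        print(n, current_candidates(n), proposed_candidates(n))
```

For 'jobname.tuo' the current rule tries 'jobname.tuo.tex' first, the
proposed rule tries 'jobname.tuo' first.  For a name without a period
both rules give the same order.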

One possible problem with the current situation is that a "try"
involves searching the directory trees.  So with TEXINPUTS=.:/foo,
/foo/bar.tuo.tex would be found before ./bar.tuo.  This is not
necessarily desirable.

For 7.4, I'm considering switching to a system where a list of
acceptable filenames is created first, and each of these names is
checked in a directory before moving on to the next directory.  It
seems to me that, even if the handling of the '.tex' extension doesn't
otherwise change, this should help performance: under a sane setup
(. first in TEXINPUTS) you'd have one failed open attempt for each one
that succeeds, and both of these in the same directory, which is easy
on the file system caches of the OS.  Such a setup would also make it
easier to experiment with different orderings depending on whether a
filename contains a period.
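
The difference between the two loop orders can be sketched in Python
as follows.  This is a simplified model, not the kpathsea
implementation: the file system is faked as a set of path strings,
there is no recursive tree search, and the function names are
invented for the example.

```python
def search_name_major(names, dirs, existing):
    """Current behaviour: each candidate name is searched for along the
    whole path before the next name is considered."""
    for name in names:
        for d in dirs:
            path = d + "/" + name
            if path in existing:
                return path
    return None


def search_dir_major(names, dirs, existing):
    """Considered for 7.4: every candidate name is checked in one
    directory before moving on to the next directory."""
    for d in dirs:
        for name in names:
            path = d + "/" + name
            if path in existing:
                return path
    return None


if __name__ == "__main__":
    dirs = [".", "/foo"]                      # TEXINPUTS=.:/foo
    names = ["bar.tuo.tex", "bar.tuo"]        # current candidate order
    existing = {"./bar.tuo", "/foo/bar.tuo.tex"}
    print(search_name_major(names, dirs, existing))
    print(search_dir_major(names, dirs, existing))
```

With these made-up files, the name-major loop returns
/foo/bar.tuo.tex, while the directory-major loop returns ./bar.tuo
after one failed attempt in the same directory.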

By the way, you mention *.mp temporary files.  That's the same
extension as used by metapost source files.  Are you certain that's a
good idea?

> Also, is there a safer way of trying to open the file? No access error
> message? It has nothing to do with access problems if a file cannot be
> found. 

Because of the race conditions that could otherwise occur, the surest
way to determine whether a file can be opened is to try to open it,
and check whether you succeeded.  Any "safer" way of doing this would
just complicate the code without improving the safety.
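
As an illustration of the point (in Python rather than the C of the
actual code, and with invented function names), compare opening
directly with checking first:

```python
import os
import tempfile


def try_open(path):
    """Attempt the open and report failure; there is no separate check
    that could go stale before the open happens."""
    try:
        return open(path, "rb")
    except OSError:
        return None


def check_then_open(path):
    """Racy variant: the file can vanish, or appear, between the
    existence test and the open, so the open can still fail."""
    if os.path.exists(path):
        return open(path, "rb")
    return None


def demo():
    """Try a missing and a present file in a scratch directory."""
    with tempfile.TemporaryDirectory() as d:
        missing = try_open(os.path.join(d, "jobname.tuo.tex"))
        present_path = os.path.join(d, "jobname.tuo")
        with open(present_path, "wb") as f:
            f.write(b"x")
        handle = try_open(present_path)
        found = handle is not None
        if handle is not None:
            handle.close()
        return missing is None, found


if __name__ == "__main__":
    print(demo())
```

The second function does strictly more work and still needs the same
error handling around the open, which is why the extra check buys
nothing.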

If Win32 sees fit to report access problems when in fact a file is
absent, then that is a Win32 problem.

-- 
Olaf Weber

Do not meddle in the affairs of sysadmins,
        for they are quick to anger and have no need for subtlety.