[tex-live] kpathsea performance

Karl Berry karl at freefriends.org
Sun Aug 26 23:50:09 CEST 2018

    we do use .FIG and .AUX, I'm not
    sure if those are our custom includes or not

Those should be fine (.aux/.AUX is already a TEXINPUTS extension). It's
only using some predefined extension for another file type that could
cause trouble. Seems that is not at issue.

    What might be interesting (and it makes sense), we have 3 directories that
    we pass to tex to search through i.e. texinputs=dir1;dir2;dir3. In our case

Ah. I think that explains it. In 2017, kpse just looked for dir1/FOO.FIG
(and failed), dir2/FOO.FIG (and failed), and then succeeded with
dir3/FOO.FIG. Now, it fails for dir1/FOO.FIG and then readdir()s through
dir1 looking for "foo.fig" (strcasecmp-wise).
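As a rough illustration of the difference (a Python sketch only, not
kpathsea's actual C code; the function names are made up for this
example), the two lookup orders described above look something like this:

```python
import os


def find_exact(name, dirs):
    """Older behavior as described above: try the exact name in each directory."""
    for d in dirs:
        candidate = os.path.join(d, name)
        if os.path.isfile(candidate):
            return candidate
    return None


def find_with_case_fallback(name, dirs):
    """Newer behavior as described above: after an exact miss in a directory,
    read that whole directory looking for a case-insensitive match before
    moving on to the next one."""
    wanted = name.lower()
    for d in dirs:
        candidate = os.path.join(d, name)
        if os.path.isfile(candidate):
            return candidate
        try:
            entries = os.listdir(d)  # cost grows with the directory size
        except OSError:
            continue  # unreadable or missing directory: skip it
        for entry in entries:
            if entry.lower() == wanted:
                return os.path.join(d, entry)
    return None
```

In the second function, every directory that lacks the exact name gets
listed in full, which is where the time goes when a directory holds
many thousands of files.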

I considered keeping the old behavior, but a couple of things militated
against it:
1) the current behavior is how it's always worked on Windows (because
   Windows operates case-insensitively);
2) it seems at least as sensible to prefer an imperfect match in an
   earlier directory as the converse.

I admit there was also:
3) it was easier to fit the new feature into the existing code that way.

    have many thousands of files in these directories, 

It will take time to read such huge directories, yes. So although the
outcome here is unfortunate, I'm not sure there is anything to do to
improve it :(.

I'm not sure if it's feasible, but it occurs to me that you might be
able to speed things up (to roughly constant time per lookup) by
creating/maintaining an ls-R file for the "tree" containing these huge
directories.
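The idea behind such a database can be sketched as follows. This is a
Python illustration under loud assumptions: it builds an in-memory
index rather than a real ls-R file (which is normally generated with
mktexlsr), and build_index and lookup are hypothetical names, not
kpathsea functions.

```python
import os
from collections import defaultdict


def build_index(root):
    """Walk the tree once, mapping lowercased file names to full paths."""
    index = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            index[fname.lower()].append(os.path.join(dirpath, fname))
    return index


def lookup(index, name):
    """Prefer an exact-case match; otherwise return the first
    case-insensitive match, or None if there is none."""
    matches = index.get(name.lower(), [])
    for path in matches:
        if os.path.basename(path) == name:
            return path
    return matches[0] if matches else None
```

The directory walk is paid once when the index is built; each later
lookup is a dictionary access, regardless of how many files the
directories contain. The price is that the index has to be rebuilt
whenever files are added or removed.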

It's also true that when I wrote the new bit of code, I assumed that
disk caching would pretty much take care of repeated searches (as it did
on all the systems I could check). With such huge directories, it's
certainly possible that the disk caching gets overloaded. The filesystem
type, available RAM, etc., are all going to be factors.
