[tex-live] kpsewhich case insensitive?

Karl Berry karl at freefriends.org
Sun Apr 8 00:42:28 CEST 2018


    > How case sensitivity checking works under the hood 

Well, this is probably too long for anyone to read, but below is what I
wrote about it in the manual. As for the implementation, check out the
calls to the new function casefold_readable_file in
kpathsea/pathsearch.c. The casefolding operations are also reported in
various ways in the debugging output.

Scroll down to the end, and you'll see that kpse never even tries to
determine if a given filesystem is case-sensitive or not (because
there's no good way to do so). All it does is check whether access()
on a given name succeeds.

Also, the change is only seen on *case-sensitive* systems (normal
Unix). Mac people using the default case-insensitive Mac filesystem will
not see any difference, nor will Windows people (the new code is not
even compiled on Windows, as noted below). --best, karl.

-----------------------------------------------------------------------------
[From kpathsea.info.]
   ...

   The fallback case-insensitive search is omitted at compile-time on
Windows, where (for practical purposes) all file names are
case-insensitive at the kernel level, and so the normal search will
already have definitively matched or not.  Therefore, search results in
unusual cases can be different on Windows and Unix--but this has always
been true.

File: kpathsea.info,  Node: Casefolding examples,  Prev: Casefolding rationale,  Up: Casefolding search

5.4.2 Casefolding examples
--------------------------

The casefolding implementation prefers exact matches to casefolded
matches within a given path element, so as to retain most compatibility.
Backward compatibility is not perfect, however, as a casefolded match
may be found in an earlier path element than an exact match was
previously found (see example #2 below).  Still, preferring the match in
the earlier element seemed potentially less confusing than otherwise,
and is in fact consistent with past behavior on Windows.  Since case
mismatches are rare to begin with, and name collisions with respect only
to case thus even more rare, the hope is that it will not cause
difficulties in practice.

   If it's desirable in a given situation to have the exact same search
behavior as previously, that can be accomplished by setting the
configuration variable 'texmf_casefold_search' to '0' (*note Path
sources::).
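
   As an illustration, and assuming the variable is read from a
'texmf.cnf' file like other Kpathsea configuration variables (the text
above only points at Path sources for the details), the setting might
look like this:

```
% texmf.cnf fragment: ask for the search behavior of earlier releases,
% that is, no fallback case-insensitive search.
texmf_casefold_search = 0
```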

   Some examples to illustrate the new behavior follow.

   Example #1: suppose the file './foobar.tex' exists.  Now, searching
for './FooBar.TeX' (or any other case variation) will succeed, returning
'./foobar.tex'--the name as stored on disk.  In previous releases, or if
'texmf_casefold_search' is false, the search would fail.

   Example #2: suppose we are using a case-sensitive (file)system, and
the search path is '.:/somedir', and the files './foobar.tex' and
'/somedir/FooBar.TeX' both exist.  Both now and previously, searching
for 'foobar.tex' returns './foobar.tex'.  However, searching for
'FooBar.TeX' now returns './foobar.tex' instead of
'/somedir/FooBar.TeX'; this is the incompatibility mentioned above.
Also (as expected), searching for 'FOOBAR.TEX' (or whatever variation)
will now return './foobar.tex', whereas before it would fail.  Searching
for all ('kpsewhich --all') 'foobar.tex' will return both matches.

   Example #3: same as example #2, but on a case-insensitive
(file)system: both now and previously, searching for 'FooBar.TeX'
returns './foobar.tex', since the system considers that a match.  The
Kpathsea casefolding never comes into play.

   Example #4: if we have (on a case-sensitive system) both
'./foobar.tex' and './FOOBAR.TEX', searching with the exact case returns
that exact match, now and previously.  Searching for 'FooBar.tex' will
now return one or the other (chosen arbitrarily), rather than failing.
Perhaps unexpectedly, searching for all 'foobar.tex' or 'FooBar.tex'
will also return only one or the other, not both (see more below).

   Example #5: the font file 'STIX-Regular.otf' is included in TeX Live
in the system directory 'texmf-dist/fonts/opentype/public/stix'.
Because Kpathsea never searches the disk in the big system directory,
the casefolding is not done, and a search for 'stix-regular.otf' will
fail (on case-sensitive systems), as it always has.

   The caveat about not searching the disk amounts to saying that
casefolding does not happen in the trees specified with '!!' (*note
ls-R::), that is, where only database ('ls-R') searching is done.  In
TeX Live, that is the 'texmf-local' and 'texmf-dist' trees (also
'$TEXMFSYSCONFIG' and '$TEXMFSYSVAR', but those are rarely noticed).
The rationale for this is that in practice, case mangling happens with
user-created files, not with packages distributed as part of the TeX
system.
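
   To see which elements of a search path carry the '!!' marker, one can
filter a path specification for it.  This is a minimal sketch: the
function name 'lsr_only_elements' is hypothetical, the sample path in
the comment is made up, and the exact shape of what 'kpsewhich
-show-path' prints on a given installation is an assumption.

```shell
# Hypothetical helper: print the elements of a colon-separated path
# specification that contain the '!!' marker, i.e. the elements where
# only the ls-R database is consulted and no casefolding is attempted.
lsr_only_elements() {
  printf '%s\n' "$1" | tr ':' '\n' | grep -F '!!'
}

# Usage with a made-up path specification:
#   lsr_only_elements '.:/home/u/texmf//:!!/opt/texmf-dist//'
# Usage against an installation (output format is an assumption):
#   lsr_only_elements "$(kpsewhich -show-path=tex)"
```

   Elements that are not printed are the ones searched on disk, where
the casefolding fallback can apply.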

   One more caveat: the purpose of 'kpsewhich' is to exercise the path
searching in Kpathsea as it is actually done.  Therefore, as shown
above, 'kpsewhich --all' does not report every case variant of a name
within a given path element.  If you want to find all matches in all
directories, 'find' is the best tool, although the setup takes a couple
of steps:

     kpsewhich -show-path=tex >/tmp/texpath      # search path specification
     kpsewhich -expand-path="`cat /tmp/texpath`" >/tmp/texdirs  # all dirs
     tr ':' '\n' </tmp/texdirs >/tmp/texdirlist  # colons to newlines
     find `cat /tmp/texdirlist` -iname somefile.tex -print

   Sorry that it's annoyingly lengthy, but implementing this inside
Kpathsea would be a lot of error-prone trouble for something that is
only useful for debugging.  If your 'find' does not support '-iname',
you can get GNU Find from <https://www.gnu.org/software/findutils>.
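
   The same steps can be wrapped in a small shell function that avoids
the temporary files.  This is a sketch, not part of Kpathsea: the name
'find_any_case' is hypothetical, and '-maxdepth 1' is an added choice
that assumes the expanded directory list already names every
subdirectory (drop it to let 'find' recurse, as in the commands above).
Like '-iname', '-maxdepth' is not available in every 'find'.

```shell
# Hypothetical helper: read a colon-separated directory list on stdin
# and print every file in those directories whose name matches $1,
# ignoring case.
find_any_case() {
  tr ':' '\n' |
  while IFS= read -r dir; do
    # Skip empty entries and directories that do not exist.
    [ -d "$dir" ] || continue
    # Look only at the directory itself; see the note on -maxdepth.
    find "$dir" -maxdepth 1 -iname "$1" -print
  done
}

# Usage, with the same kpsewhich options as in the commands above:
#   kpsewhich -expand-path="$(kpsewhich -show-path=tex)" |
#     find_any_case somefile.tex
```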

   The casefolding search is implemented in the source file
'kpathsea/pathsearch.c'.  Two implementation points:

   * Kpathsea never tries to check if a given directory resides on a
     case-insensitive filesystem, because there is no efficient and
     portable way to do so.  All it does is try to see if a potential
     file name is a readable normal file (with, usually, the 'access'
     system call).

   * Kpathsea does not do any case-insensitive matching of the
     directories along the path.  It's not going to find
     '/Some/Random/file.tex' when looking for '/some/random/file.tex'.
     The casefolding only happens with the elements of the leaf
     directory.
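
   The two points above, together with the preference for exact matches
within a path element, can be sketched as a shell function.  This is
only an illustration under stated assumptions, not the code in
'pathsearch.c': the names 'readable_file' and 'casefold_lookup' are
hypothetical, 'test -f' and 'test -r' stand in for the 'access' call,
and the path is taken to be a plain colon-separated list of directories
with no '//' or '!!' handling.

```shell
# Hypothetical helper: is $1 a readable regular file?
readable_file() {
  [ -f "$1" ] && [ -r "$1" ]
}

# Hypothetical helper: look up file name $2 along the colon-separated
# directory list $1 and print the first match.
casefold_lookup() {
  name=$2
  want=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
  old_ifs=$IFS
  IFS=:
  set -- $1            # split the path into its elements
  IFS=$old_ifs
  for dir in "$@"; do
    [ -n "$dir" ] || continue
    # 1. Exact match in this path element.  No attempt is made to find
    #    out whether the filesystem itself is case-insensitive.
    if readable_file "$dir/$name"; then
      printf '%s\n' "$dir/$name"
      return 0
    fi
    # 2. Fallback: compare lowercased names of the entries in this one
    #    directory.  The directory part is used exactly as given.
    for cand in "$dir"/*; do
      leaf=${cand##*/}
      lower=$(printf '%s' "$leaf" | tr '[:upper:]' '[:lower:]')
      if [ "$lower" = "$want" ] && readable_file "$cand"; then
        printf '%s\n' "$cand"
        return 0
      fi
    done
  done
  return 1
}
```

   With the files from example #2 in place on a case-sensitive system,
'casefold_lookup .:/somedir FooBar.TeX' prints './foobar.tex', because
the casefolded match in the earlier element is found before the exact
match in the later one.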

