[pdftex] Patch for better performance of map file reading

Heiko Oberdiek oberdiek at uni-freiburg.de
Wed Jul 31 13:21:51 CEST 2002


On Wed, Jul 31, 2002 at 10:01:39AM +0200, Hans Hagen wrote:
> Hi Heiko,
> 
> >+     Description:
> >+       Most tfm file names obey the Karl-Berry-schema, so the
> >+       string length is not greater than 8. The string is
> >+       converted into an int by dividing the first eight
> >+       bytes into two byte parts. These parts are then
> >+       combined by xor. As hashing function the multiplicative
> >+       method is used.
> 
> On todays machines there is no reason to stick to 8 byte names; when i buy 
> some fonts, i prefer to generate files with names like
> 
> texnansi-<optional alternative>-<optional mm scaling or slant or 
> whatever>-originalname.tfm

I have looked at the large pdffonts.map of TeXLive, and 99%
of the entries use short names. So I thought I could improve
performance by taking only the first bytes of the file name.
But I see that I can use all bytes; short file names remain
fast then.
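
A minimal sketch of hashing over all bytes, not the code of the patch.
It assumes 4-byte chunks combined by xor, a 32-bit multiplier that is
commonly used for the multiplicative method, and a hypothetical table
size of 4096; the names fold_name and hash_name are made up for
illustration.

```python
KNUTH_MULTIPLIER = 2654435769  # floor(2**32 / golden ratio), a common choice
TABLE_BITS = 12                # assumed table size: 2**12 = 4096 slots


def fold_name(name: str) -> int:
    """XOR all 4-byte chunks of the name into one 32-bit integer."""
    data = name.encode("utf-8")
    key = 0
    for i in range(0, len(data), 4):
        # The last chunk may be shorter than 4 bytes; that is fine.
        key ^= int.from_bytes(data[i:i + 4], "little")
    return key


def hash_name(name: str) -> int:
    """Multiplicative method: top TABLE_BITS bits of key * A mod 2**32."""
    key = fold_name(name)
    return ((key * KNUTH_MULTIPLIER) & 0xFFFFFFFF) >> (32 - TABLE_BITS)
```

A short name such as "ptmr8r" needs only two loop iterations, so it
stays cheap, while the tail of a long name still contributes to the key.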

> so, i wonder what happens in this case; in general, i think that we should 
> not put any limitations on font names.

It is not a limitation: different names are allowed to have
the same hash key, but such collisions are a performance penalty.
I will change this to cover the full name, see next post.
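
To illustrate why equal hash keys are harmless, here is a small sketch
of a hash table with chaining. It is not the data structure of the
patch; TfmTable and prefix_hash are hypothetical names, and the hash is
deliberately weak (first 8 characters only) to force collisions.

```python
class TfmTable:
    """Hash table with chaining; each bucket holds (name, entry) pairs."""

    def __init__(self, hash_func, size=64):
        self.hash_func = hash_func
        self.buckets = [[] for _ in range(size)]

    def _bucket(self, name):
        return self.buckets[self.hash_func(name) % len(self.buckets)]

    def insert(self, name, entry):
        """Return True if inserted, False if the name is already present."""
        bucket = self._bucket(name)
        for existing_name, _ in bucket:
            if existing_name == name:
                return False
        bucket.append((name, entry))
        return True

    def lookup(self, name):
        """Return the entry for name, or None if it is not in the table."""
        for existing_name, entry in self._bucket(name):
            if existing_name == name:
                return entry
        return None


def prefix_hash(name):
    """Deliberately weak hash: looks only at the first 8 characters."""
    return sum(ord(c) for c in name[:8])
```

Two names sharing the prefix "texnansi" land in the same bucket, but
both are stored and found, because the full names are compared inside
the bucket. Only the lookup gets slower as the bucket grows.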

> 
> >+      Return value:  true if tfm file is inserted,
> >+                     false if the name is already in the hashtable
> 
> i think that it should be possible to overload defaults, i.e. esp if you 
> use those 3000 entry files; user should be able to replace entries (in 
> additional files); so in my opinion later entries must overload previous 
> ones

There is no specification apart from
section 5.1 "Map files" of the user manual, is there?
I have not changed the policy with my patch, only improved
the implementation. But hashing or some kind of dictionary
can also be used for your approach.

I try to describe the current behaviour:

Sequences of map files are supported.
The line
  "map foo.map"
will replace all previous map files, if there are any,
and start a new sequence.
The entry
  "map +bar.map"
adds the file "bar.map" to the sequence of map files.

If pdfTeX needs the map file information, it
reads in the latest set of map files. While parsing
the entries, duplicates are discarded with
a warning:
  "entry for '<tfm name>' already exists, duplicates ignored"

Currently it is possible to replace all entries by a
new non-empty sequence of map files.
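
The behaviour described above can be sketched as follows. This is a
model only, not pdfTeX code: MapFileSequence and load_entries are
hypothetical names, and map_contents is a plain dictionary standing in
for the parsed contents of the map files.

```python
class MapFileSequence:
    """Tracks the current sequence of map files."""

    def __init__(self):
        self.files = []

    def directive(self, arg):
        """Handle the argument of a "map" line: "foo.map" or "+bar.map"."""
        if arg.startswith("+"):
            self.files.append(arg[1:])   # extend the current sequence
        else:
            self.files = [arg]           # replace all previous map files


def load_entries(map_contents, files):
    """Merge the entries of the given files; the first entry wins."""
    entries = {}
    warnings = []
    for filename in files:
        for tfm_name, entry in map_contents[filename]:
            if tfm_name in entries:
                warnings.append(
                    f"entry for '{tfm_name}' already exists, duplicates ignored")
            else:
                entries[tfm_name] = entry
    return entries, warnings
```

With "map foo.map" followed by "map +bar.map", an entry in bar.map
whose tfm name already occurs in foo.map is dropped with the warning.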

Your wish is the possibility to replace only a part
of the entries, e.g. by overwriting them with entries
given later?

Then I can imagine the following scenarios:
* Dropping the duplicate warning and letting later
  entries always overwrite earlier ones.
* The same, but only for special map files, e.g. given as
    "map !bar.map"
  Then the entries in "bar.map" may overwrite previous ones.
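
A rough sketch of the second scenario. The "!" prefix is only a
proposal here, nothing pdfTeX implements; the sketch also assumes that
"!bar.map" adds to the sequence like "+bar.map" does. The function
names and the map_contents dictionary are hypothetical.

```python
def add_map_directive(sequence, arg):
    """Return the new list of (filename, may_overwrite) pairs."""
    if arg.startswith("+"):
        return sequence + [(arg[1:], False)]   # add, duplicates ignored
    if arg.startswith("!"):
        return sequence + [(arg[1:], True)]    # add, entries may overwrite
    return [(arg, False)]                      # start a new sequence


def load_entries(map_contents, sequence):
    """Merge entries; files marked may_overwrite replace earlier entries."""
    entries = {}
    warnings = []
    for filename, may_overwrite in sequence:
        for tfm_name, entry in map_contents[filename]:
            if tfm_name not in entries or may_overwrite:
                entries[tfm_name] = entry
            else:
                warnings.append(
                    f"entry for '{tfm_name}' already exists, duplicates ignored")
    return entries, warnings
```

The first scenario corresponds to treating every file as if
may_overwrite were true, so no warning is ever issued.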

Yours sincerely
  Heiko <oberdiek at uni-freiburg.de>


