[tex-k] Bizarre coding system in lhr10.tfm

Wed Oct 23 00:57:25 CEST 2019

On Sun, Oct 20, 2019 at 02:52:35PM -0700, Tomas Rokicki wrote:
> There's some information here:
>    https://github.com/rokicki/type3search/
> Ultimately I was less interested in the encoding *names* than
> in the actual PostScript glyph names, and the best source I
> found for that was in the psfonts.map files, and, failing that, in
> the PFB files.  Almost all current METAFONT fonts have
> workalike Type 1 fonts at this point, so I mostly leveraged that
> work.
> I am not aware of any code that depends on, or uses, any sort
> of encoding string in the TFM file in any meaningful way.
> There are still a bunch of additional fonts I need to derive
> reasonable glyph names for.
> I don't seem to have lhr10 in my texlive tree . . .
> -tom

Thanks Tom and Karl!

(And sorry for the slightly slow response.)

This explanation is very helpful. I had found the headerbyte and
special information in Appendix F of The METAFONTbook, but I couldn't
figure out the history of these tfm files; that now makes sense.

The code in question is mftrace, which uses the embedded coding scheme
to figure out what encoding to use if nothing is specified on the
command line.  So I guess people will have to specify the encoding
file explicitly for these generated fonts.
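
For reference, here is a minimal sketch of how a tool could check whether
a tfm carries a coding scheme at all. It is not mftrace's actual code; the
function name is made up for illustration. It assumes the standard TFM
layout: twelve 16-bit big-endian lengths (the second is lh, the header
length in words), then the header, in which words 0 and 1 hold the
checksum and design size and words 2 to 11 hold the coding scheme as a
length byte followed by up to 39 characters.

```python
import struct


def tfm_coding_scheme(data: bytes):
    """Return the coding scheme stored in raw TFM bytes, or None if absent.

    Hypothetical helper for illustration only.
    """
    if len(data) < 24:
        raise ValueError("too short to be a TFM file")

    # The file starts with twelve 16-bit big-endian integers:
    # lf, lh, bc, ec, nw, nh, nd, ni, nl, nk, ne, np.
    lf, lh = struct.unpack(">HH", data[:4])

    # The header begins at byte 24. Words 0-1 are checksum and design
    # size; words 2-11 (40 bytes) are the coding scheme. A header shorter
    # than 12 words cannot hold the full coding scheme field.
    if lh < 12 or len(data) < 24 + 12 * 4:
        return None

    field = data[32:72]
    length = field[0]  # first byte is the string length
    if length == 0 or length > 39:
        return None
    return field[1:1 + length].decode("ascii", errors="replace")
```

Calling it as tfm_coding_scheme(open(path, "rb").read()) returns None for
a file whose header is too short or whose scheme field is empty, which is
the case where the encoding has to be given explicitly.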

Tom: lhr10 is generated on-demand by mktexmf.

Best wishes,

   Julian

> On Sun, Oct 20, 2019 at 2:40 PM Karl Berry <karl at freefriends.org> wrote:
> 
>   Hi Julian,
> 
>   (Tom Rokicki: please see end.)
> 
>       How come, then, that the cmr10.tfm in the TeXLive distribution does
>       contain the encoding? 
> 
>   I (or maybe it was Thomas Esser, or someone, can't remember now)
>   generated the basic cm*.tfm and others long, long ago. Before TeX Live
>   existed, I believe. At that time, I had modes.mf automatically
>   including the codingscheme (so-called "Xerox-world info") in generated
>   tfms, by redefining MF's end primitive.
> 
>   This was the case until 2008, when DEK ran afoul of this, and asked me
>   (rather insistently :) to let "end" mean "end". So I dutifully changed
>   modes.mf so that now "mode_extra_info" has to be called (which no one
>   does) to generate the extra info.
> 
>   However, I did not think it was necessary or desirable to regenerate
>   cm*.tfm merely to remove the extra info, nor did Don request this or
>   mention it as a problem. Better to let sleeping tfms lie, it seemed to me.
> 
>   Therefore, tfms generated before 2008 will have the info, and ones
>   generated after (e.g., dynamically, as with the lh* fonts) will not, by
>   default.
> 
>       Presumably that was produced using something
>       like the code in dummy.mf, but I wonder where that is?
> 
>   Ultimately what generates that stuff is the MF "headerbyte" primitive
>   (for the tfm) and "special" primitive (for the gf). I haven't looked
>   into what the lh* fonts do. You could presumably figure out what it is
>   writing. Maybe it is not writing anything.
> 
>   In general, you cannot rely on the codingscheme (or related information)
>   to be present, or to be useful or correct even if it is. There is,
>   sadly, no general way to determine the encoding of a given tfm presented
>   in a vacuum.
> 
>   If you must know the encoding, and not just process the characters as
>   they come, then you'll have to somehow look up the filename. I am not
>   aware of any global mapping of tfm names to encodings for such lookups,
>   either. Fonts available in PostScript/PDF format can be found in
>   psfonts.map, etc., which often has encoding info, but lh* is mf-only --
>   probably the most significant remaining mf-only font.
> 
>   That said, Tom Rokicki just worked on this whole mess in some depth in
>   order to introduce encodings for Type 3 fonts in dvips. I do not know if
>   he discerned encodings for lh*, though. Tom?
> 
>   Sorry for the long and unsatisfying answer. Good luck. --best, karl.