[XeTeX] Latin Modern, from TFM to Unicode
Doug McKenna
doug at mathemaesthetics.com
Tue Jun 11 06:04:42 CEST 2013
All -
I don't really know which TeX-related list this should be posted to, so
I'm starting with this one, since it has dealt with OpenType fonts.
This is a TFM <==> OpenType math font question.
By using the "fonttable" package (under TeX Live 2010, LaTeX2e), I'm able
to create a PDF that shows the glyphs for all 128 code points
(characters) for the font file "lmex10.tfm" (Latin Modern Math Extension
10 pt font). The LaTeX code to do this is simple:
\documentclass{article} % or whatever
\usepackage{fonttable}
\begin{document}
\fonttable{lmex10}
\end{document}
The actual glyphs are presumably magically incorporated by pdftex (or
some variant) into the final output file using the associated binary
printer font file "lmex10.pfb", although I'm not really clear about how
all that works. Whatever, that's a separate issue.
For instance, glyph 48 (decimal; "30 hex, '060 octal) from "lmex10.tfm"
is the upper piece of a vertically extensible left parenthesis. It
appears that either glyph 66 (decimal) or 67 (decimal) is the repeatable
element making up the vertical extension to build that tall left
parenthesis, though I can't be certain because there are some other
vertical extensions nearby in the table.
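To look at those slots in isolation rather than in the full table,
something like the following should work. It is only a sketch: it assumes
pdflatex can find "lmex10.tfm" and its map entry, and the macro name
\lmex is made up for this example. The first line of output also confirms
that the three notations name the same slot.

```latex
% Compile with pdflatex. Loads the TFM directly; no font package needed.
\documentclass{article}
\begin{document}
\font\lmex=lmex10 % \lmex is a hypothetical name chosen for this sketch

% Decimal, hex ("), and octal (') notation; each \number prints 48.
(\number48) (\number"30) (\number'060)

% The upper piece of the extensible left parenthesis:
slot 48: {\lmex \char48}

% The two candidates for the repeatable vertical piece:
slot 66: {\lmex \char66}\quad slot 67: {\lmex \char67}
\end{document}
```

Seeing slots 66 and 67 next to each other, without the rest of the table
around them, may make it easier to tell which one belongs to the left
parenthesis.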
I'm on a Mac (Mac OS X 10.7.5), and I've installed a recent version of
the OpenType Latin Modern Math font ("latinmodern-math.otf") in my
"~/Library/Fonts/..." directory, and have looked at it using FontBook.
This font has many thousands of glyphs in it. After choosing the
"Preview > Repertoire" menu choice in FontBook, one can scroll through
all the glyphs and eventually find glyph #2500, identified by Unicode
name "U+239B LEFT PARENTHESIS UPPER HOOK". And glyph #2499 just before
it in the list appears to be the little vertical piece to use repeatedly
to build a vertically extensible left parenthesis with that upper hook at
the top. That repeatable vertical element's Unicode identification is
"U+239C LEFT PARENTHESIS EXTENSION". Okay, that makes sense.
But then, a few glyphs later, there's another version of the top of an
extensible left parenthesis, glyph #2506, which is slightly less curved.
But FontBook shows no Unicode description for it. I don't know why, or
what criteria might determine when it would be used as opposed to the
glyph for the official Unicode code point. The next few glyphs, up to and
including glyph #2511, also have no Unicode names displayed. Then glyph
#2512 has an official Unicode name (U+23A9 LEFT CURLY BRACKET LOWER
HOOK). (A separate question: I wasn't aware that an OpenType font even
knew anything about official Unicode code point names, so how does
FontBook know which glyphs have Unicode names and which ones don't?)
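The two kinds of glyph can at least be typeset side by side for
comparison. The following is a sketch, not something verified against
every version of the font. It assumes that the font is visible to XeTeX
under the family name "Latin Modern Math", that FontBook's glyph numbers
are the font's actual glyph IDs, and that #2506 is the same glyph in
whatever copy is installed. The macro name \lmmath is made up here.

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage{fontspec}
% Assumption: the font is installed under this family name.
\newfontface\lmmath{Latin Modern Math}
\begin{document}
{\lmmath
 % Selected by Unicode code point (U+239B, the upper hook):
 by code point: \char"239B\quad
 % Selected by glyph ID; \XeTeXglyph does not go through Unicode at all.
 % Assumption: 2506 is the glyph ID of the less curved variant.
 by glyph ID: \XeTeXglyph2506}
\end{document}
```

\XeTeXglyph is a XeTeX primitive that takes a glyph ID rather than a
character code, which is why it can reach glyphs that have no code point.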
With regard to the OpenType font "latinmodern-math.otf" that I've
installed, I'd like to know, for all 128 glyph metrics represented by
"lmex10.tfm", what the official Unicode code points are for the glyphs
that have those metrics in that TFM file. Is this documented anywhere,
either in text form or in binary form, in a publicly available file that
is part of TeX, XeTeX, ...?
I'm actually interested in the answer for all the Latin Modern TFM files,
to the extent that they map to Unicode code points in the OpenType Latin
Modern files, but presumably the answer to this one math extension font
file will prove useful for answering all.
I understand that there's a "cmap" table in the OpenType font with a
Unicode encoding sub-table that maps between official Unicode code points
and glyph IDs. It's getting from the 128 "code points" in the TFM files
to the actual Unicode code points that I'm interested in.
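The second half of that chain, from Unicode code point to glyph ID, can
at least be queried from inside XeTeX itself. Here is a minimal sketch
using the \XeTeXcharglyph primitive, again assuming the font is installed
under the family name "Latin Modern Math" (\lmmath is a made-up macro
name). The glyph IDs it prints depend on the version of the font, so none
are listed here.

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage{fontspec}
% Assumption: the font is installed under this family name.
\newfontface\lmmath{Latin Modern Math}
\begin{document}
{\lmmath
 % \XeTeXcharglyph expands to the glyph ID that the current font
 % assigns to a code point, or 0 if the font has no glyph for it.
 U+239B maps to glyph \the\XeTeXcharglyph"239B\relax;
 U+239C maps to glyph \the\XeTeXcharglyph"239C\relax;
 U+23A9 maps to glyph \the\XeTeXcharglyph"23A9\relax.}
\end{document}
```

This covers only the step from Unicode to glyph ID. The step from a TFM
slot number to a Unicode code point is the part it does not address.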
Despite all manner of searching, the answer to this simple question has
so far proved elusive.
Thanks in advance for any non-elusive elucidations.
Doug McKenna