[tex4ht] [bug #241] grave accent letter ` (hex 60) changes to left single quotation mark (hex 0xE2 0x80 0x98)
Karl Berry
karl at freefriends.org
Sun Jan 18 00:45:22 CET 2015
btw, I think Nasser has found many errors in .htf files in the last two
weeks, and also for many fonts, .htf files are missing.
I don't doubt it. No .htf has been created (in the distribution anyway)
since Eitan died. It would be great to cover some of the new fonts.
my idea is the following: we can take the property list of a tfm file
I doubt the encoding info in the TFM file is especially reliable even in
the few cases where it's present. (Ditto afm2pl.)
and find the postscript name of the character in the corresponding .enc
file. we can get the unicode code point for the postscript name from the
glyphlist.txt and texglyphlist.txt files included in the TeX
distribution.
Wow, quite a project.
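For what it's worth, the lookup chain described above (slot in the .enc
vector, then glyph name, then code point from a glyph list) could be
sketched roughly as follows. This is only an illustration, not anything
that exists in tex4ht: the function names are made up, the .enc parser
assumes a single PostScript encoding vector of the form
"/Name [ /glyph ... ] def", and the glyph list parser assumes the
"name;HEX HEX" line format of Adobe's glyphlist.txt with "#" comment
lines. Anything else in a real file would need extra handling.

```python
import re


def parse_enc(enc_text):
    """Return the glyph names of a PostScript encoding vector, in slot order.

    Assumes one vector of the form "/Name [ /a /b ... ] def" (hypothetical
    simplification; real .enc files may carry more than this).
    """
    without_comments = re.sub(r"%.*", "", enc_text)  # drop % comments
    match = re.search(r"\[(.*?)\]", without_comments, re.S)
    if match is None:
        raise ValueError("no encoding vector found")
    return re.findall(r"/([^\s/\[\]]+)", match.group(1))


def parse_glyphlist(glyphlist_text):
    """Map glyph names to Unicode strings from "name;HEX HEX" lines."""
    table = {}
    for line in glyphlist_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, codes = line.partition(";")
        table[name] = "".join(chr(int(code, 16)) for code in codes.split())
    return table


def enc_to_unicode(enc_text, glyphlist_text):
    """Map each slot number to a Unicode string, or None if unknown."""
    table = parse_glyphlist(glyphlist_text)
    names = parse_enc(enc_text)
    return {slot: table.get(name) for slot, name in enumerate(names)}
```

Slots whose glyph name is not in the list (such as /.notdef) come back
as None, so they can be flagged for checking by hand rather than guessed.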
for these FONTSPECIFIC fonts I have to use
google to find out the encoding actually used
For fonts created through the otftotfm process, i.e., nearly everything
that Michael Sharpe and Bob Tennent have done, who have contributed many
of the new fonts (Sharpe did newtx), there should be an opaquely-named
(a bunch of hex chars) .enc file in the font package corresponding to
every tfm. As I understand it.
Anyway, in general, I expect that talking to the package developer or
looking at the sources would be more fruitful than random web searches.
(Not to say it'll be easy, no matter what.)
but sometimes two or more glyphs are used to create a character
(mainly accents), so we can't get the postscript name of such a character
even if we knew the encoding of the referenced glyphs
All I can think of is to have heuristics or a table saying that a
composition of character X + character Y in font F means Unicode point
U. Since it's generally about accents, the combinations should be
finite, and repeated through many different fonts.
Thanks,
K