[texhax] extracting a plain text file of the final document
zappathustra at free.fr
Sun Jan 15 10:28:00 CET 2012
Philip TAYLOR <P.Taylor at rhul.ac.uk> wrote:
> Seems a very useful utility, Reinhard, and one of
> which I was previously unaware, but why does it
> eat all the "Th" (but not "th") groups ?!
Probably because you've used a font with a "Th" ligature and the glyph
isn't recognized. Indeed, with a document in CM, "Th" comes out as "Th",
while the same document in Chaparral renders "Th" as some impossible glyph.
It also renders "ff" as a ligature, unless you include (in the TeX
document with pdfTeX or LuaTeX):
in which case it properly analyzes the ligature.
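For illustration, a minimal sketch of such a declaration, assuming
pdfTeX's \pdfglyphtounicode and \pdfgentounicode primitives and assuming
the font names the ligature glyph "ff" (this may not be the exact snippet
meant above, and LuaTeX versions may spell these primitives differently):

```tex
% Assumption: the font's ligature glyph is named "ff"; check the actual name.
\pdfglyphtounicode{ff}{0066 0066} % map glyph "ff" to U+0066 U+0066, i.e. "f" "f"
\pdfgentounicode=1                % have pdfTeX write ToUnicode maps into the PDF
```

These lines would go near the top of the document, before any text is
typeset in the font concerned.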
So something along those lines should be tried with "Th", provided you
find the glyph's name (not "Th" in Chaparral):
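As a sketch only, assuming pdfTeX's \pdfglyphtounicode and
\pdfgentounicode primitives, and using "T_h" as a purely hypothetical
glyph name (to be replaced by whatever name Chaparral really uses):

```tex
% "T_h" is a hypothetical glyph name; substitute the font's real one.
\pdfglyphtounicode{T_h}{0054 0068} % map the ligature to U+0054 "T", U+0068 "h"
\pdfgentounicode=1                 % have pdfTeX write ToUnicode maps into the PDF
```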
Hopefully it works. Still, that must be done before compilation; or
perhaps pdftotext has some option signalling that such a glyph must be mapped to