[texhax] extracting a plain text file of the final document

Reinhard Kotucha reinhard.kotucha at web.de
Sun Jan 15 04:48:30 CET 2012


On 2012-01-14 at 18:00:58 -0800, jtzzaa11-texhax2 at yahoo.com wrote:

 > Is there a way to obtain a plain text version of the final document
 > processed by LaTeX? By final version I mean one where all macros,
 > labels, and bibliography entries have been processed and assigned
 > their "final" values.

Does 

   pdftotext -layout <your PDF file>

do what you need?

pdftotext is part of xpdf.  If you don't have it, just install xpdf.

TeX Live provides xpdf (and thus pdftotext) for Windows.  On other
systems I recommend installing it through the system's package
manager.  xpdf is ubiquitous.
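
As a minimal sketch, assuming a POSIX shell and a PDF called paper.pdf
(a made-up name, as are the two function names), the call above could
be wrapped like this so that a missing pdftotext is reported clearly:

```shell
#!/bin/sh
# Sketch only; paper.pdf is a hypothetical file name.

# Derive the output name: paper.pdf -> paper.txt
txt_name() {
    printf '%s\n' "${1%.pdf}.txt"
}

# Convert one PDF to plain text, keeping the page layout (-layout).
pdf_to_text() {
    if ! command -v pdftotext >/dev/null 2>&1; then
        echo "pdftotext not found; install xpdf first" >&2
        return 127
    fi
    pdftotext -layout "$1" "$(txt_name "$1")"
}

# Usage: pdf_to_text paper.pdf    (writes paper.txt)
```

pdftotext only sees what is in the PDF, so run it after LaTeX (and
BibTeX, if used) has been run often enough that all references and
citations are resolved.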

 > Of course, such a plain text version won't contain any floats.

It depends.  A JPEG obviously can't be represented as plain text, but
if the float is a table, its contents will appear in the output.

Furthermore, I assume that pdftotext gives the best output if you're
in an environment that supports UTF-8.
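
A rough sketch of how one might check this, again assuming a POSIX
shell; the function names and paper.pdf are made up for illustration.
The -enc option selects pdftotext's output encoding, and whether UTF-8
is already the default depends on your build and configuration:

```shell
#!/bin/sh
# Sketch only; paper.pdf is a hypothetical file name.

# Succeed if the given charmap name denotes UTF-8.
is_utf8_charmap() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# Warn if the current locale is not UTF-8, then extract with UTF-8 output.
pdf_to_utf8_text() {
    if ! is_utf8_charmap "$(locale charmap 2>/dev/null)"; then
        echo "warning: locale is not UTF-8; non-ASCII text may display wrongly" >&2
    fi
    pdftotext -layout -enc UTF-8 "$1" "${1%.pdf}.txt"
}

# Usage: pdf_to_utf8_text paper.pdf    (writes paper.txt)
```

The warning is only advisory: the file is written as UTF-8 either way,
but a terminal or editor in another encoding may show accented
characters incorrectly.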

Regards,
  Reinhard
 
-- 
----------------------------------------------------------------------------
Reinhard Kotucha                                      Phone: +49-511-3373112
Marschnerstr. 25
D-30167 Hannover                              mailto:reinhard.kotucha at web.de
----------------------------------------------------------------------------
Microsoft isn't the answer. Microsoft is the question, and the answer is NO.
----------------------------------------------------------------------------

