Reinhard Kotucha reinhard.kotucha at
Fri May 6 23:58:54 CEST 2022

On 2022-05-06 at 20:19:43 +0100, Philip Taylor (Hellenic Institute) wrote:

 > On 06/05/2022 20:12, Herbert Voss wrote:
 > >
 > > Am 06.05.22 um 16:23 schrieb David Jonah via texhax:
 > >> I want to convert a .pdf document to a LaTeX document. The paper has
 > >> superscripts, an index, and a source document.
 > >
 > > Convert the document first to doc and then to latex. There are several
 > > programs for the first one and some for the second.
 > With Adobe Acrobat DC, one can export a PDF to an MS Word document; the
 > conversion is usually excellent, and if an MS Word to LaTeX converter
 > exists that is of the same quality, then the overall results should be
 > most acceptable.

I suppose that the converted document *looks* good, but its logical
structure gets lost because it can't be derived from an ordinary PDF
file.  A PDF file only describes the visual representation of a
document.

But if you want to edit the converted document, the visual
representation is worthless.  You have to use LaTeX macros like
\chapter, \section, \subsection, etc. just to be able to regenerate
the table of contents, for example.

Because you have to edit the converted document anyway, I don't see
any benefit from using Adobe Acrobat DC, MS Word, or any other
proprietary software.  pdftotext does what you need.

On Windows, pdftotext is part of TeX Live, and on other operating
systems it can be installed with the package manager.
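
A minimal sketch of how the extraction might be scripted, assuming
Python 3 is available and pdftotext is on the PATH.  The function
names and the file name paper.pdf are made up for illustration; only
the -enc and -layout options belong to pdftotext itself.

```python
import shutil
import subprocess
from pathlib import Path


def pdftotext_command(pdf_path, txt_path, keep_layout=True):
    """Build the argument list for a pdftotext call (hypothetical helper)."""
    cmd = ["pdftotext", "-enc", "UTF-8"]
    if keep_layout:
        # -layout asks pdftotext to keep the physical page layout
        # as far as possible instead of plain reading order.
        cmd.append("-layout")
    cmd += [str(pdf_path), str(txt_path)]
    return cmd


def pdf_to_text(pdf_path, keep_layout=True):
    """Run pdftotext and return the path of the text file it writes."""
    pdf_path = Path(pdf_path)
    txt_path = pdf_path.with_suffix(".txt")
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    # check=True raises CalledProcessError if pdftotext exits non-zero.
    subprocess.run(pdftotext_command(pdf_path, txt_path, keep_layout),
                   check=True)
    return txt_path
```

Calling pdf_to_text("paper.pdf") would write paper.txt next to the
PDF.  The result is plain text only, so \chapter, \section, and the
like still have to be added by hand.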


Reinhard Kotucha                            Phone: +49-511-3373112
Marschnerstr. 25
D-30167 Hannover                    mailto:reinhard.kotucha at
