[pdftex] pdf to text?

Y&Y Support support at YandY.com
Sun Jan 14 08:14:37 CET 2001


At 12:51 2001-01-12 -0700, you wrote:

This is a bit off subject. I want to extract text from a pdf file. I know
pstotext can do it, but it loses almost all formatting. dvi2tty keeps
quite a lot of formatting. Is there a functional equivalent for pdf?

In Acrobat Reader, select the text you want and "Copy" it to the clipboard,
then "Paste" it into your editor.

Of course this won't work if the "text" is a bitmapped image,
or if it is made using legacy bitmapped fonts.
Also, if you are using a package like AE that fakes composite characters
by overprinting, you will not get accented characters.
But if you are using real text fonts you will get what you want.

--
Y&Y Support mailto:support at YandY.com http://www.YandY.com/unique.htm (PG)



