[pdftex] pdf to text?
James W. Haefner
jhaefner at biology.usu.edu
Mon Jan 15 10:08:09 CET 2001
Robin Barker wrote:
>
> > In Acrobat Reader, select the text you want and "Copy" it to the clipboard
> > and then "Paste" it into your editor.
> >
> > Of course this won't work if the "text" is a bitmapped image,
> > or if it is made using legacy bitmapped fonts.
> > Also, if you are using a package like AE that fakes composite characters
> > by overprinting, you will not get accented characters.
> > But if you are using real text fonts you will get what you want.
> >
>
Does this work for Linux Acroread 4.05 (24 Jan 2000)? It doesn't on my KDE 1.1 system. After
"Select All", "Copy", and "Paste" into NEdit (and other text editors), I get only the first page,
with no line breaks. pdftotext from Xpdf works reasonably well, as suggested by Mr. Beebe.
Also, unlike the cut-and-paste approach Robin Barker describes below, pdftotext does not drop ligatures.
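
For anyone who wants to script the pdftotext route, here is a minimal sketch in Python. It assumes a pdftotext binary is on the PATH and that the installed version accepts "-enc UTF-8" and "-" (write to stdout); check the man page for your build. The function names pdf_to_text and normalize_ligatures are hypothetical, made up for illustration. The ligature step only helps when the extractor emits Unicode ligature code points such as U+FB02; it cannot restore a ligature that was dropped outright, as in the "re ection" case.

```python
import shutil
import subprocess

# Unicode Latin ligature code points mapped to their plain-letter spellings.
LIGATURES = str.maketrans({
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
})


def normalize_ligatures(text: str) -> str:
    """Replace ligature code points (e.g. U+FB02) with ordinary letters."""
    return text.translate(LIGATURES)


def pdf_to_text(pdf_path: str) -> str:
    """Run pdftotext on pdf_path and return the extracted text.

    Assumption: the installed pdftotext supports "-enc UTF-8" and
    treats "-" as "write the text to stdout".
    """
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        check=True,
        capture_output=True,
    )
    text = result.stdout.decode("utf-8", errors="replace")
    return normalize_ligatures(text)
```

Calling pdf_to_text("test.pdf") (a placeholder file name) would return the whole document as one string, which can then be written to a file or searched.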
> This is how I do pdf to text but I have one problem. Some ligatures
> are dealt with properly, but "fl" does not work.
>
> \documentclass[a4paper]{article}
> \begin{document}
> Can you find a reflection?
> \end{document}
>
> If I cut and paste the resulting pdf file, I get:
>
> Can you find a re ection?
>
> Can this be fixed?
>
> Robin
>
> --
> Robin Barker | Email: Robin.Barker at npl.co.uk
> CMSC, Building 10, | Phone: +44 (0) 20 8943 7090
> National Physical Laboratory, | Fax: +44 (0) 20 8977 7091
> Teddington, Middlesex, UK. TW11 OLW | WWW: http://www.npl.co.uk
> _______________________________________________
> pdftex mailing list
> pdftex at tug.org
> http://tug.org/mailman/listinfo/pdftex
--
James W. Haefner
Department of Biology Email: jhaefner at biology.usu.edu
Utah State University Voice: 435-797-3553
Logan, UT 84322-5305 Fax: 435-797-1575