[pdftex] pdf to text?

James W. Haefner jhaefner at biology.usu.edu
Mon Jan 15 10:08:09 CET 2001


Robin Barker wrote:
> 
> > In Acrobat Reader, select the text you want and "Copy" it to the clipboard
> > and then "Paste" it into your editor.
> >
> > Of course this won't work if the "text" is a bitmapped image,
> > or if it is made using legacy bitmapped fonts.
> > Also, if you are using a package like AE that fakes composite characters
> > by overprinting, you will not get accented characters.
> > But if you are using real text fonts you will get what you want.
> >
> 

Does this work for Linux Acroread 4.05 (24 Jan 2000)? It doesn't on my
KDE 1.1 system. After "select all", "copy", and "paste" into Nedit (and
other text editors), I get only the first page and no carriage returns.
pdftotext from Xpdf works reasonably well, as suggested by Mr. Beebe.

Also, pdftotext does not drop ligatures, unlike the cut-and-paste problem
Robin Barker describes below.
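
If the extracted text still contains single-character ligatures (for
example U+FB02 for "fl") and you want plain letters, something like the
following sketch might help. It is only an illustration: the names
pdf_to_text and expand_ligatures are made up here, and it assumes that
pdftotext is on your PATH and that its output is read as UTF-8. It cannot
restore a ligature that was dropped entirely, as in "re ection".

```python
import subprocess

# Latin ligatures from the Unicode Alphabetic Presentation Forms block,
# mapped to the plain letters they stand for.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
}


def expand_ligatures(text: str) -> str:
    """Replace ligature characters with their component letters."""
    for ligature, letters in LIGATURES.items():
        text = text.replace(ligature, letters)
    return text


def pdf_to_text(pdf_path: str) -> str:
    """Run pdftotext (assumed to be installed) and return cleaned text.

    The "-" argument asks pdftotext to write to standard output.
    """
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    # Assumption: the output is UTF-8; undecodable bytes are replaced.
    return expand_ligatures(result.stdout.decode("utf-8", errors="replace"))
```

Calling expand_ligatures on a string you already have needs no external
program; pdf_to_text is just a convenience wrapper around the command
line tool.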

> This is how I do pdf to text but I have one problem.  Some ligatures
> are dealt with properly, but "fl" does not work.
> 
>         \documentclass[a4paper]{article}
>         \begin{document}
>         Can you find a reflection?
>         \end{document}
> 
> If I cut and paste the resulting pdf file, I get:
> 
>         Can you find a re ection?
> 
> Can this be fixed?
> 
> Robin
> 
> --
> Robin Barker                        | Email: Robin.Barker at npl.co.uk
> CMSC, Building 10,                  | Phone: +44 (0) 20 8943 7090
> National Physical Laboratory,       | Fax:   +44 (0) 20 8977 7091
> Teddington, Middlesex, UK. TW11 OLW | WWW:   http://www.npl.co.uk

-- 
James W. Haefner            
Department of Biology   Email: jhaefner at biology.usu.edu
Utah State University   Voice: 435-797-3553
Logan, UT 84322-5305      Fax: 435-797-1575


