# [texhax] search for text in a pdf file

Randolph J. Herber herber at dcdrjh.fnal.gov
Fri Aug 6 17:42:19 CEST 2004

```
Perhaps, just perhaps, print it and eyeball it, or use the electronic
equivalent: optical scanners that supply text output.  That is how many
If you can obtain permission to make the document acceptable to these
projects (this has to do with copyrights), then they just might do the
optical scanning to text for you.

>Date: Fri, 06 Aug 2004 08:55:52 -0400 (EDT)
>From: BRØCKâQUÂNTIFIER.ORG <brock at quantifier.org>
>Subject: Re: [texhax] search for text in a pdf file
>To: Philip TAYLOR <P.Taylor at rhul.ac.uk>
>Cc: texhax at tug.org

>let's see if i can answer my own question.

>I was advised that there are two sorts of .pdf files.  ones made up of
>text and images, and ones made up of just images.  the text/image ones
>should be easy to search through, while the image only ones will require a
>converter of some sort (something stronger than ps2ascii, i've found).

>next i installed acroread which seems to be just a nice little port of

>deb ftp://ftp.nerim.net/debian-marillat/ unstable main

>to my /etc/apt/sources.list

>and poof, there it is.  thanks apt.  it comes with a find function right
>on the little tool bar, but since my copy of my pdf seems to be of the
>image only variety (I know it's just a scan of a journal) i couldn't find
>text.  sucks.

>so now i'm back where i started, only just a bit smarter.  so what else do
>y'all use to pull text out of a pdf such as this one?

>thanks,


>On Fri, 6 Aug 2004, Philip TAYLOR wrote:

>> Maybe use Adobe Acrobat ?  That has inbuilt search facilities.
>> Philip Taylor
>> --------
>> BRØCKâQUÂNTIFIER.ORG wrote:
>>  >
>>  > hello smart folks.  i have a pdf file that is some 350 pages.  and i need
>>  > to search for one phrase.  gv doesn't seem to have a search function.  i've
>>  > tried ps2ascii and pstotext after converting the pdf with pdf2ps.  the
>>  > resulting txt files just have a bunch of ^L over and over.  I'm stumped.
>>  > how do i search for a phrase in a pdf file?
>>  >
>>  > thanks for any help from a forum that's not really about pdf.
>>  >
>>  > bobby

>> _______________________________________________
>> TeX FAQ: http://www.tex.ac.uk/faq
>> Mailing list archives: http://tug.org/pipermail/texhax/

>> Automated subscription management: http://tug.org/mailman/listinfo/texhax
>> Human mailing list managers: postmaster at tug.org
