[Tugindia] counting words in a TEX document
Kapil Hari Paranjape
kapil at imsc.res.in
Sun Jan 25 11:09:03 CET 2009
Hello,
On Sun, 25 Jan 2009, Asha G wrote:
> When I did pdftotext and then did wc on the text I get the following
> message.
> wc: /home/proj/08/cesasha/Nature/JaneliaProposal.txt:56: Invalid or
> incomplete multibyte or wide character
> 60 2511 15147 /home/proj/08/cesasha/Nature/JaneliaProposal.txt
> can you please explain what I am doing wrong?
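This is not really about TeX, but since your original question was ...
The default output encoding used by "pdftotext" is "latin1". If your
locale is UTF-8, "wc" then reads the Latin-1 bytes for accented
characters as invalid multibyte sequences, hence the warning. It is
probably better to use "pdftotext -enc UTF-8".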
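To see the mismatch for yourself, here is a minimal Python sketch. It
assumes the warning comes from Latin-1 bytes being read in a UTF-8
locale, which I have not checked on your machine. The sample text and
the helper names "is_valid_utf8" and "count_words" are made up for
illustration and have nothing to do with pdftotext or wc themselves.

```python
# Sketch only: the sample text and helper names are hypothetical.
# Accented characters are written as escapes so the file itself is ASCII.
sample = "r\u00e9sum\u00e9 na\u00efve caf\u00e9"

# The same text as it would be stored in each encoding.
latin1_bytes = sample.encode("latin-1")
utf8_bytes = sample.encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def count_words(data: bytes, encoding: str) -> int:
    """Decode with the stated encoding and count whitespace-separated words."""
    return len(data.decode(encoding).split())


if __name__ == "__main__":
    print("latin-1 bytes valid as UTF-8:", is_valid_utf8(latin1_bytes))
    print("utf-8 bytes valid as UTF-8:  ", is_valid_utf8(utf8_bytes))
    print("words (latin-1 decoded):", count_words(latin1_bytes, "latin-1"))
    print("words (utf-8 decoded):  ", count_words(utf8_bytes, "utf-8"))
```

The word count is the same either way once the bytes are decoded with
the encoding they were actually written in; only the mismatched
reading fails.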
As someone on the list said: If you have the TeX input file then it
is probably better to run "untex" on that file rather than convert
the PDF output of pdftex to text.
Regards,
Kapil.