[Tugindia] counting words in a TEX document

Kapil Hari Paranjape kapil at imsc.res.in
Sun Jan 25 11:09:03 CET 2009


Hello,

On Sun, 25 Jan 2009, Asha G wrote:
> When I did pdftotext and then did wc on the text I get the following
> message.
> wc: /home/proj/08/cesasha/Nature/JaneliaProposal.txt:56: Invalid or
> incomplete multibyte or wide character
>    60  2511 15147 /home/proj/08/cesasha/Nature/JaneliaProposal.txt
> can you please explain what I am doing wrong?

This is not really about TeX, but since your original question was ...

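The default output encoding used by "pdftotext" is "Latin1". If your
locale is UTF-8, "wc" will complain about any non-ASCII Latin1 bytes as
invalid multibyte characters, which is the message you are seeing. It
is probably better to use "pdftotext -enc UTF-8".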
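
To illustrate the mismatch, here is a small Python sketch. It is only
an illustration: the sample string and the helper names
(decodes_as_utf8, count_words) are made up for this example and have
nothing to do with pdftotext or wc themselves. It writes the same text
as Latin1 bytes and as UTF-8 bytes and checks which of the two is valid
UTF-8, which is roughly the check a UTF-8 locale imposes.

```python
# Illustration only: the sample text is invented, not taken from the proposal.
sample = "na\u00efve caf\u00e9 r\u00e9sum\u00e9"  # three words with accented letters

latin1_bytes = sample.encode("latin-1")  # what a Latin1 text file would contain
utf8_bytes = sample.encode("utf-8")      # what a UTF-8 text file would contain


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes form valid UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def count_words(data: bytes, encoding: str) -> int:
    """Decode with the given encoding and count whitespace-separated words."""
    return len(data.decode(encoding).split())


print(decodes_as_utf8(latin1_bytes))     # False: accented Latin1 bytes are not valid UTF-8
print(decodes_as_utf8(utf8_bytes))       # True
print(count_words(utf8_bytes, "utf-8"))  # 3
```

The word count itself is the same either way once the bytes are decoded
with the encoding they were written in; the complaint only arises when
the file's encoding and the locale disagree.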

As someone on the list said: If you have the TeX input file then it
is probably better to run "untex" on that file rather than convert
the PDF output of pdftex to text.
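
If "untex" is not at hand, the idea of counting from the TeX source can
be approximated with a few regular expressions. The Python sketch below
is a crude stand-in, not what "untex" actually does: the function name
rough_word_count and the sample input are hypothetical, and it will
miscount in many cases (for instance, it counts environment names and
the contents of math as words).

```python
import re


def rough_word_count(tex_source: str) -> int:
    """Very rough word count of TeX source; an approximation only."""
    # Drop comments: an unescaped % up to the end of the line.
    text = re.sub(r"(?<!\\)%.*", "", tex_source)
    # Drop control sequences such as \section, \emph*, or \%.
    text = re.sub(r"\\[A-Za-z]+\*?|\\.", " ", text)
    # Replace grouping and math punctuation with spaces.
    text = re.sub(r"[{}$&~^_\[\]]", " ", text)
    return len(text.split())


# Hypothetical input, just to exercise the function.
sample = r"""\section{Introduction}
We study \emph{neural} circuits. % TODO: cite
"""
print(rough_word_count(sample))  # 5
```

For anything that matters (a word limit on a proposal, say), treat the
result as an estimate and check it against a proper tool.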

Regards,

Kapil.