[OS X TeX] counting words in 2010

Michael Sharpe msharpe at ucsd.edu
Sun Oct 31 23:55:09 CET 2010


On Oct 30, 2010, at 4:12 PM, Peter Dyballa wrote:

> 
> Am 31.10.2010 um 00:43 schrieb Herbert Schulz:
> 
>> You have the detex from TL'11? :-)
> 
> Compiled from the sources...
> 
>> 
>> isn't that what you get too?
> 
> Well, knowing that Einstein was German (really? at least he spoke German!) and wanted things to be made simple, I test it this simple way on the command line:
> 
> 	echo "Please test\\footnote{\\textbf{this} footnote},	\\emph{but} quickly!"
> 	echo "Please test\\footnote{\\textbf{this} footnote},	\\emph{but} quickly!" | wc
> 	echo "Please test\\footnote{\\textbf{this} footnote},	\\emph{but} quickly!" | /usr/local/texlive/2010/bin/universal-darwin/detex | wc
> 
> In the last case I can use different versions of detex. The results from the three different command lines are:
> 
> 	Please test\footnote{\textbf{this} footnote},	\emph{but} quickly!
> 	       1       5      66
> 	       1       6      40
> 
> The first line shows that the chosen syntax is OK. The second line counts the run-together words correctly (the second figure is the word count, the first is the number of lines, and the last is the number of characters in the input line). The last line shows that the input is correctly filtered by detex. This command shows how it filters:
> 
> 	echo "Please test\\footnote{\\textbf{this} footnote},	\\emph{but} quickly!" | /usr/local/texlive/2010/bin/universal-darwin/detex
> 	Please test this footnote,	but quickly!
> 
> Again, I do not have the official/ready/finished detex version of TL '10. Maybe this file is defective...
> 
> 
> Well, which "detex" are you actually using? One via a TeXShop engine? Could you add to that engine file:
> 
> 	echo -n "The detex programme soon to be used is certainly this one: " ; which detex
> 
> The output will appear in the console window, together with the word count.
> 
> 
> Does the word count come closer to the expected value with a text body of
> 
> 	Hello World\footnote{ footnote} and more
> 
> or
> 
> 	Hello World\footnote{ footnote and more}
> 
> or
> 
> 	Hello World\footnote{ footnote}!
> 
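
The wc figures in the quoted test can be checked without detex at all. This is only a minimal sketch, assuming a POSIX shell and assuming the raw and filtered lines are exactly as quoted above, including the tab after the comma. The variable names are made up for illustration.

```shell
# Raw test line as quoted, with a literal tab after the comma.
raw=$(printf 'Please test\\footnote{\\textbf{this} footnote},\t\\emph{but} quickly!')
# Filtered line as quoted from detex in plain TeX mode.
filtered=$(printf 'Please test this footnote,\tbut quickly!')

# wc -w counts whitespace-separated tokens; wc -c counts bytes.
# tr strips the padding that some wc implementations add.
raw_words=$(printf '%s\n' "$raw" | wc -w | tr -d '[:space:]')
raw_chars=$(printf '%s\n' "$raw" | wc -c | tr -d '[:space:]')
filtered_words=$(printf '%s\n' "$filtered" | wc -w | tr -d '[:space:]')
filtered_chars=$(printf '%s\n' "$filtered" | wc -c | tr -d '[:space:]')

echo "raw:      $raw_words words, $raw_chars characters"
echo "filtered: $filtered_words words, $filtered_chars characters"
```

The raw line counts as 5 words because wc treats test\footnote{\textbf{this} as a single token; only whitespace separates words.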

There is a difference between what you are doing and what I think the rest of us are doing. The command line

echo "Please test\\footnote{\\textbf{this} footnote},\\emph{but} quickly." | detex

runs detex in plain TeX mode, and in my case, the result is

Please test this footnote,but quickly.

which, with 6 words, is identical to your result. However, detex runs in LaTeX mode when the text is part of a document containing a \begin{document}, and the result is then the same as running

echo "Please test\\footnote{\\textbf{this} footnote},\\emph{but} quickly." | detex -l

(the -l forces LaTeX mode), namely

Please test ,but quickly.

It seems that in LaTeX mode, detex ignores plain TeX commands that have a LaTeX version. If you instead write

echo "Please test\\begin{footnote}{\\textbf{this} footnote}\\end{footnote},\\emph{but} quickly." | detex -l

the result is

Please testthis footnote,but quickly.

This does of course give an incorrect count, because words were run together in LaTeX mode that were not run together in plain TeX mode. I would consider this a bug in detex. (I'm using the x86-64 version that came with TeX Live 2010, but I get exactly the same result with detex from the 2008 distribution.)
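
To see the difference between the two modes as numbers, something like the following could be used. It is only a sketch: count_words is a hypothetical helper, the sample line is the one from this message, and it uses only the detex invocations shown above (no option, and -l). Note that wc -w counts whitespace-separated tokens, so with no space after the comma, footnote,but is a single token.

```shell
# Hypothetical helper: count whitespace-separated words on stdin.
count_words() {
    wc -w | tr -d '[:space:]'
}

# Sample line from this message; single quotes keep the backslashes literal.
sample='Please test\footnote{\textbf{this} footnote},\emph{but} quickly.'

if command -v detex >/dev/null 2>&1; then
    # Same input, once in the default (plain TeX) mode and once with -l.
    plain=$(printf '%s\n' "$sample" | detex | count_words)
    latex=$(printf '%s\n' "$sample" | detex -l | count_words)
    echo "plain TeX mode: $plain words"
    echo "LaTeX mode:     $latex words"
else
    echo "detex not found on PATH; skipping the comparison" >&2
fi
```

If the two figures differ for the same input, the LaTeX-mode count is the suspect one.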

Michael





