[texhax] Unused Labels

Peter Flynn peter at silmaril.ie
Sat Feb 23 21:20:08 CET 2019


On 23/02/2019 16:11, Weil, Clifford wrote:
> Is there a facility that will find all “\label” entries that never
> occur in a “\ref” command? That is, is there a way to eliminate all
> labels that are never cited?

Not as such, but you can do it by dropping the lines that are commented 
out, forcing each remaining command to the start of a line of its own, 
extracting the label and ref commands, grabbing their arguments into 
arrays, and then testing each label for not being among the refs:

grep -v '^%' thesis.tex |\
   tr '\\{}' '\012\040\040' |\
   grep -E '^(label|ref)' |\
   awk '/^label/ {++label[$2]} /^ref/ {++ref[$2]} \
       END {for(val in label){if(!(val in ref))print val}}'

This shows me I had rather a lot of over-optimistic labels that I never 
used :-)
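
For anyone who wants to try the pipeline on a small file before running 
it on a real document, here is a minimal sketch that wraps it in a shell 
function. The function name unused_labels and the sample file are made up 
for illustration, and the final sort is an addition, there only because 
awk does not promise any particular order for a for-in loop.

```shell
# Hypothetical wrapper around the pipeline above; the function name is
# an illustration only, not part of any package.
unused_labels() {
    grep -v '^%' "$1" |
        tr '\\{}' '\012\040\040' |
        grep -E '^(label|ref)' |
        awk '/^label/ {++label[$2]} /^ref/ {++ref[$2]}
             END {for (val in label) if (!(val in ref)) print val}' |
        sort
}

# A throwaway sample document (invented for this example).
sample=$(mktemp)
cat > "$sample" <<'EOF'
\section{Intro}\label{sec:intro}
% \label{sec:commented}
See Section~\ref{sec:method}.
\section{Method}\label{sec:method}
\begin{equation}\label{eq:one} x = 1 \end{equation}
EOF

# Prints eq:one and sec:intro, the two labels never used in a \ref.
unused_labels "$sample"
rm -f "$sample"
```

Note that the pattern as written only recognises \ref itself: after the 
split on backslashes, \pageref or \eqref become lines starting with 
"pageref" or "eqref", so labels cited only that way would be reported as 
unused unless those names are added to the grep and awk patterns.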

This should work on any UNIX or GNU/Linux system, including Macs; on a 
Windows system you can either install the utilities with something like 
Cygwin, or rewrite the script in PowerShell, which I believe has all 
the features needed. But it assumes the document has no variant syntax, 
such as LaTeX code in literal (verbatim) environments or changed catcode 
values.

It's a bit easier in XML, which has the identical concept of an ID 
(label) that can occur only once and an IDREF (ref) that can be used 
many times, as in HTML. The syntax is more rigorous, and the parser 
takes care of all the messy stuff like linebreaks and nesting and 
literals. So for a DocBook document, where @linkend is the attribute 
which carries IDREF references, you can write an XPath expression 
like:

@id[not(//*[@linkend=current()/@id])]

which essentially says "any ID which is not the value of an @linkend 
attribute of any element anywhere else".
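
The expression above relies on current(), which is an XSLT function, so 
it is meant for use inside a stylesheet where the current node is the 
element carrying the ID. As a rough sketch of the same test from the 
command line, assuming xmllint (from libxml2) is installed, a plain 
XPath 1.0 formulation can be swapped in, with "." standing for the ID 
being tested. The function name unused_ids and the sample fragment are 
invented for illustration.

```shell
# Sketch only: assumes xmllint (from libxml2) is available. This is a
# substitute XPath 1.0 expression, not the XSLT one quoted above.
unused_ids() {
    xmllint --xpath '//@id[not(. = //@linkend)]' "$1"
}

# A made-up DocBook-like fragment for illustration.
doc=$(mktemp)
cat > "$doc" <<'EOF'
<article>
  <section id="intro"><para>See <xref linkend="method"/>.</para></section>
  <section id="method"><para>Text.</para></section>
</article>
EOF

if command -v xmllint >/dev/null 2>&1; then
    # "intro" is never the target of a linkend, so its id is reported.
    unused_ids "$doc"
fi
rm -f "$doc"
```

The comparison ". = //@linkend" is true when any @linkend in the 
document has the same value as the ID, so negating it keeps only the IDs 
that nothing points at.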

Peter

