[texhax] Unused Labels

Mike Marchywka marchywka at hotmail.com
Sat Feb 23 21:51:39 CET 2019

>From: texhax <texhax-bounces+marchywka=hotmail.com at tug.org> on behalf of Peter Flynn <peter at silmaril.ie>
>Sent: Saturday, February 23, 2019 3:20 PM
>To: texhax at tug.org
>Subject: Re: [texhax] Unused Labels
>On 23/02/2019 16:11, Weil, Clifford wrote:
>> Is there a facility that will find all “\label” entries that never
>> occur in a “\ref” command? That is, is there a way to eliminate all
>> labels that are never cited?
>Not as such, but you can do it by forcing all non-commented commands to
>the start of a line each, extracting the label and ref commands,
>grabbing their arguments into an array, and then testing the labels for
>not being in a ref:
>grep -v '^%' thesis.tex |\
>   tr '\\{}' '\012\040\040' |\
>   grep -E '^(label|ref)' |\
>   awk '/^label/ {++label[$2]} /^ref/ {++ref[$2]} \
>       END {for(val in label){if(!(val in ref))print val}}'
>This shows me I had rather a lot of over-optimistic labels that I never
>used :-)

I'm doing something similar with citations. I have scripts that take a URL link to an article
and either fetch the BibTeX entry or decide it can be fetched later by scanning for "\cite".
In the latter case the script turns the URL into a "\cite", which I paste into my tex file.
When it finds a cite reference to a name not in the bib file, it goes and gets
the entry, since the name is an encoded URL. This fails with \input files, however, and with
things like bibentry or a \newcommand that conceal the operands from grep, IIRC.
I do a lot of this kind of scanning, for example reading the log file to find the last page
number, as none of the alternatives worked well. However, these methods can have subtle failure modes.
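
For illustration only (this is not the actual set of scripts), here is a minimal sketch of
the "\cite" scan described above: it lists keys that are cited in a tex file but have no
entry in a bib file. The function name missing_cites and the file names are made up for
the example. It assumes a grep that supports -o (GNU and BSD grep do), that each \cite
fits on one line, and that keys are compared case-sensitively. It does not follow \input
files or see through \newcommand wrappers, so it has the same blind spots.

```shell
# missing_cites TEXFILE BIBFILE
# Print each key that is cited in TEXFILE but not defined in BIBFILE.
# Assumption: \cite{a,b}, \citep[note]{a} and similar forms, one per line.
missing_cites() {
    tex=$1; bib=$2
    {
        # Emit "B key" for every entry key defined in the .bib file.
        sed -n 's/^[[:space:]]*@[A-Za-z]*{[[:space:]]*\([^,[:space:]]*\),.*/B \1/p' "$bib"
        # Emit "C key" for every key cited in the .tex file
        # (only whole-line comments are dropped, as in the quoted pipeline).
        grep -v '^%' "$tex" |
            grep -oE '\\cite[a-zA-Z]*(\[[^]]*])*\{[^}]*\}' |
            sed 's/.*{\(.*\)}/\1/' | tr ',' '\n' | tr -d ' ' | sed 's/^/C /'
    } | awk '$1 == "B" { have[$2] = 1 }
             $1 == "C" && !($2 in have) && !seen[$2]++ { print $2 }'
}
```

Usage would be something like "missing_cites thesis.tex refs.bib"; each line of output is
a key that still needs an entry.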

I also try to parse the log file for included files so I can make a zip containing all required
resources, and AFAICT that does work reliably, although I'm curious whether people have encountered problems.
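
A possible alternative to parsing the log, offered as a sketch rather than a description
of the approach above: running pdflatex with -recorder writes a .fls file that lists every
file opened, one "INPUT path" or "OUTPUT path" per line. The function name local_inputs is
made up for the example. It assumes that files from the TeX installation appear with
absolute paths and project files with relative ones, which is what TeX Live does when run
from the document directory; other setups may differ.

```shell
# local_inputs FLSFILE
# Print the relative-path files that were read but not generated by the run
# (generated files such as .aux appear as both INPUT and OUTPUT).
local_inputs() {
    awk '
        $1 == "OUTPUT" { sub(/^OUTPUT /, ""); sub(/^\.\//, ""); made[$0] = 1; next }
        $1 == "INPUT"  { sub(/^INPUT /, "");  sub(/^\.\//, ""); used[$0] = 1 }
        END { for (f in used) if (f !~ /^\// && !(f in made)) print f }
    ' "$1" | sort
}
```

After "pdflatex -recorder thesis.tex", something like
"local_inputs thesis.fls | zip bundle.zip -@" should collect the project files, since
zip reads file names from stdin with -@.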

I guess ideally there would be some way for LaTeX to dump whatever it does during
parsing, and you could grep this to modify your tex file until it hits a fixed point
where nothing changes.
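
The "repeat until nothing changes" part can at least be sketched in shell. In this
hypothetical helper (until_fixed_point is a made-up name), CMD stands for whatever
stdin-to-stdout filter rewrites the tex file, for example one that drops unused labels.
A real workflow would presumably also rerun LaTeX between passes, which the sketch leaves out.

```shell
# until_fixed_point FILE MAX CMD [ARGS...]
# Repeatedly replace FILE with the output of CMD (a stdin-to-stdout filter)
# until a pass leaves it unchanged, or give up after MAX rewriting passes.
until_fixed_point() {
    file=$1; max=$2; shift 2
    i=0
    while [ "$i" -lt "$max" ]; do
        "$@" < "$file" > "$file.new" || return 1
        if cmp -s "$file" "$file.new"; then
            # Output is identical to the input: fixed point reached.
            rm -f "$file.new"
            echo "fixed point after $i rewriting pass(es)"
            return 0
        fi
        mv "$file.new" "$file"
        i=$((i + 1))
    done
    echo "no fixed point within $max passes" >&2
    return 1
}
```

The cap on the number of passes is there because nothing guarantees that an arbitrary
filter converges; two scripts undoing each other's edits would otherwise loop forever.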

>This should work on any UNIX or GNU Linux system, including Macs; on a
>Windows system you can either install the utilities with something like
>Cygwin, or rewrite the script into Powershell, which I believe has all
>the features needed. But it's subject to the document not having variant
>syntax like LaTeX code in literal environments, or changing catcode values.
>It's a bit easier in XML (which has the identical concept of an ID
>(label) that can only occur once and an IDREF (ref) that can be used
>many times, as in HTML), because the syntax is more rigorous and the
>parser takes care of all the messy stuff like linebreaks and nesting and
>literals; so for a DocBook document, where @linkend is the attribute
>which carries IDREF references, you can write an XPath query statement
>which essentially says "any ID which is not the value of an @linkend
>attribute of any element anywhere else".
