Structured LOG files

Peter Flynn peter at
Thu May 9 11:46:52 CEST 2019

On 08/05/2019 10:40, Paulo Ney de Souza wrote:
> How do-able would it be to make TeX write a better (more structured) LOG file?
> Take the example down below, where one is trying, for example, to parse 
> the LOG file to check which pages have overfulls:
>     ...
>     [45]
>     Overfull \hbox (4.49274pt too wide) in paragraph at lines 1718--1719

The closing square bracket after the page number means the page was 
successfully shipped out to the output file. This means the overfull 
\hbox occurred on p.46.

> It is natural to look for the "[numbers]" after the overfull, but 
> parsing for things that happen in between brackets will lead you to 
> believe the Overfull at lines 1718--1719 is on page "1995".
> conjetura completa (ver [][][][][]Tate [1995][][]):

You can adjust your RE to match the close-square-bracket character only 
when it is followed by a space, I think. Or in a script (e.g. awk, perl), 
keep a count of the actual page number (the edge case, of course, would be 
when your bib ref actually occurs on p.1995 of a very very long thesis :-).
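
A minimal sketch of the page-counting idea, in Python rather than awk or 
perl. It assumes pages are numbered consecutively from a known starting 
number and that each shipout shows up in the log as "[N"; the function 
name overfull_pages and its first_page parameter are made up for this 
illustration. A bracketed number is accepted as a page only if it is 
exactly one more than the last page shipped, so "[1995]" inside a box 
dump is ignored unless page 1994 has just gone out.

```python
import re

# "[N" opens a shipout in the log; "[]" in box dumps has no digits and never matches.
PAGE = re.compile(r"\[(\d+)")
# Start of an overfull-hbox warning line.
OVERFULL = re.compile(r"^Overfull \\hbox \(([\d.]+)pt too wide\)")


def overfull_pages(log_lines, first_page=1):
    """Return a list of (page, warning_line) pairs.

    Assumption: pages are numbered consecutively starting at first_page.
    A warning is attributed to the page after the last one shipped out.
    """
    last_shipped = first_page - 1
    results = []
    for line in log_lines:
        if OVERFULL.match(line):
            results.append((last_shipped + 1, line.rstrip()))
            continue
        for m in PAGE.finditer(line):
            # Only the next expected page number advances the counter.
            if int(m.group(1)) == last_shipped + 1:
                last_shipped += 1
    return results


if __name__ == "__main__":
    sample = [
        "[45]",
        r"Overfull \hbox (4.49274pt too wide) in paragraph at lines 1718--1719",
        "conjetura completa (ver [][][][][]Tate [1995][][]):",
        "[46]",
    ]
    for page, warning in overfull_pages(sample, first_page=45):
        print(page, warning)
```

This does not handle log lines wrapped at 79 characters, file names 
printed inside the page brackets, or documents whose page numbering 
restarts, so treat it as a starting point only.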

> The "freewheeling" nature of the standard log file -- mixing 
> page numbers with dates and other info -- makes it almost impossible to 
> reliably extract reasonable information from long log files, or to parse 
> them automatically.

Yes, I suspect DEK's intent was to make it human-parseable first. I 
would hope that the NTS people would use something more tractable, like XML.

