# [luatex] Is there a hook that could be used by a pdf postprocessor?

Hans Hagen pragma at wxs.nl
Tue Dec 12 13:50:31 CET 2017

On 12/12/2017 12:42 PM, Knut Petersen wrote:
> On 11.12.2017 at 21:27, Hans Hagen wrote:
>>
>>> A "stat lyInLatex.pdf" after luatex finished shows that 40 bytes are
>>> not written at the time of callback execution.
>>
>> Which doesn't mean that there cannot be something written via lua
>> scripts. Also, there is more to be wrapped up: the log file, file
>> recording, closing of the synctex file, etc.
>>
>
> Yes. Currently stop_run is documented as cited below:
>
>     8.5.3 stop_run
>     function()
>     end
>     This callback replaces the code that prints LuaTEX’s statistics and
>     ‘output written to’ messages.
>
> In 2012 <http://tug.org/pipermail/luatex/2012-May/003647.html> Patrick
> Gundlach discussed on this list whether the sentence "The output file is
> not complete at this point and thus you cannot use this callback to post
> process the output file." should be added to the documentation. The
> facts that stop_run is not executed in draftmode and that it cannot be
> used as a postprocessing hook also came up on stackexchange. It might be
> a good idea to extend the documentation. I propose:
>
>     8.5.3 stop_run
>     function()
>     end
>     This callback replaces the code that prints LuaTEX’s statistics and
>     ‘output written to’ messages.
>     It is not executed in draftmode, and the output file is not complete
>     at the point of execution.
>     Thus you cannot use this callback to post process the output file.
>
>> Anyway, managing your workflow can best be done with a wrapper (make
>> was already suggested but many other solutions are possible).

This callback replaces the code that prints \LUATEX's statistics and
\quote {output written to} messages. The engine can still do housekeeping
and therefore you should not rely on this hook for postprocessing the
\PDF\ or log file.

btw, draftmode might go away at some point (in fact it should already
have been dropped a while ago)

> Anything might be done using wrappers, but it's a pity that some tasks
> currently need a wrapper. A few examples where a pdf_closed callback
> would be useful:
>
>   * I do not consider it to be good programming practice to remove
>     temporary files that are still opened by the lualatex process, but I
>     would like to remove those files from within the .sty file. [Ok, it
>     is possible to remove files used by \includegraphics in the stop_run
>     callback. They are still open, but they will not be used anymore and
>     the library will probably silently handle the situation. But is this
>     good programming practice?]

Well, i'm not going to discuss good programming practice here as we can
easily extend that to 'what is good style file programming practice' or
even 'what is good typesetting style programming practice'. That said,
if someone in some style would close the log and then another style (as
there can be many active) writes something to the log, we have to deal
with a closed handle etc etc. Not that good either.

>   * I take the pdf produced by *TeX, split it into pages and combine
>     those pages and an audio recording to produce a final x.264 video.
>     That could be done from within the .sty file, but with current
>     luatex the best I can do is to write a script from within the style
>     and typeout a message that tells the user to execute that script.
>     [There is a hackish solution at the end of the mail]

even then someone has to key in a command which then could as well be
some script (written in lua using luatex as processor) ... context users
don't know better than that their run is managed by a script (directly
calling luatex for context is possible if you know what you're doing but
it's not even documented and therefore not supported)

i've been dealing with pre- and postprocessing my whole tex life and am
pretty sure that some built in hooks for that will never satisfy every
problem

>   * If \includegraphics is used to include a lot of pdfs with embedded
>     fonts the pdfs produced by luatex and other tex engines often
>     produce big files that waste a lot of disk space because of
>     duplicated font data. A solution is to include pdfs without embedded
>     fonts and to post process the output with ghostscript. [There is a
>     hackish solution at the end of the mail]

luatex is basically a typesetting engine and not a reassembling tool
(that it can be used for that is a side effect of the ability to include
pages from pdf files) ... such workflows need management outside the engine
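
A minimal sketch of what such outside management could look like, assuming a POSIX shell. The function name `run_job`, the `ENGINE` variable and its default of `lualatex` are illustrative assumptions, not something taken from this thread. The point is only that the postprocessing step starts after the engine process has exited, so the pdf and the log are guaranteed to be closed.

```shell
#!/bin/sh
# Hypothetical wrapper (all names are assumptions for illustration).
# ENGINE is the typesetting command; override it from the environment.
ENGINE="${ENGINE:-lualatex}"

# run_job JOB [POSTCOMMAND...]
# Runs the engine on JOB, then runs POSTCOMMAND with JOB.pdf appended
# as its last argument. The postprocessor is skipped if the engine fails.
run_job() {
    job="$1"
    shift
    # Wait for the engine to finish; after this all its files are closed.
    $ENGINE "$job" || return 1
    # Anything after the job name is treated as the postprocessing command.
    if [ "$#" -gt 0 ]; then
        "$@" "$job.pdf"
    fi
}
```

A call such as `run_job lyInLatex ./postprocess.sh` (where `postprocess.sh` is a placeholder for whatever tool is wanted) would then typeset first and hand the finished `lyInLatex.pdf` to the script.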

> So there really would be reasonable use for a pdf_closed callback
>
>     8.5.x pdf_closed
>     function()
>     end
>     This callback is executed immediately after the output pdf has been
>     finished and closed.
>     It might be used to post process the output file.
>
> In 2012 Taco Hoekwater wrote
> <http://tug.org/pipermail/luatex/2012-May/003648.html>:
>
>     At this point I am not sure any more but there definitely is a use case
>     for a callback that runs after the pdf has been closed already. I am not
>     sure whether we should change stop_run or create a new callback.
>
>> Pre- or postprocessing as part of the run is not part of the concept.
>> What works for you can fail for someone else as timing is very
>> application specific.
>
> I don't understand your last sentence about timing, but I hope that you

because we have hooks, users can use each hook to write something (from
lua) to the pdf file, even at the last moment before the file is closed
(could be some comment)

> Now here is the hackish solution to the problem that works on my system.
> I really don't like it, but it works here. Obviously it depends on the
> availability of a unix environment and the tools used, and no error
> handling is implemented. The unfinished pdf is copied and fixed, and
> mutool is used to obtain the correct xref offset. After that tmp.pdf should be
> identical to the final pdf and might be used as desired.
>
>     \def\tmpScriptName{tmpfixpdf.sh}
>     \newwrite\tmpScript
>     \immediate\openout\tmpScript=\tmpScriptName
>     \immediate\write\tmpScript{cp \jobname.pdf tmp.pdf}
>     \immediate\write\tmpScript{echo endstream>> tmp.pdf}
>     \immediate\write\tmpScript{echo endobj>> tmp.pdf}
>     \immediate\write\tmpScript{echo startxref>> tmp.pdf}
>     \immediate\write\tmpScript{\unexpanded{echo \$((mutool show tmp.pdf xref 2>/dev/null
>         | grep :
>         | tail -1
>         | sed -e 's/[[:digit:]]*: 0*\([[:digit:]]*\) [[:print:]]*/\1/' + 1))
>         >> tmp.pdf}}
>     \immediate\write\tmpScript{echo \%\%EOF>> tmp.pdf}
>     \immediate\write\tmpScript{\unexpanded{echo -e "\nHave a look at tmp.pdf"}}
>     \immediate\closeout\tmpScript
>
>     \directlua {
>        os.execute("chmod 700 \tmpScriptName")
>
>        function my_stop_run()
>          os.execute("./\tmpScriptName")
>        end
>
>     }
>
> I think it is time to quote Hans Hagen: "so far my experience with tex is
> that there's always a solution" ;-)

indeed, and scripting the run (management) has always been in my
repertoire of solutions ... (and for instance old school metapost
embedding was pretty demanding in that respect)

(fwiw: in most of my use cases multiple runs are needed, sometimes even
related to get the right pdf output at all, and as that is scripted,
removing (say) a log file as part of the process is trivial)
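
As a rough illustration of such a scripted multi-run, here is a sketch in POSIX shell. The names `run_many`, `ENGINE` and `RUNS`, and the fixed run count, are assumptions made for the example; a real driver would more likely decide from the auxiliary files whether another run is needed.

```shell
#!/bin/sh
# Hypothetical multi-run driver (names and defaults are illustrative).
ENGINE="${ENGINE:-lualatex}"
RUNS="${RUNS:-2}"

# run_many JOB
# Runs the engine RUNS times on JOB and removes JOB.log afterwards.
run_many() {
    job="$1"
    i=0
    while [ "$i" -lt "$RUNS" ]; do
        # Stop at the first failing run and keep the log for inspection.
        $ENGINE "$job" || return 1
        i=$((i + 1))
    done
    # The engine has exited, so the log is closed and safe to delete.
    rm -f "$job.log"
}
```

Because the cleanup happens in the driver, no style file has to touch a file that the engine still holds open.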

Hans
